profphd-utils-1.0.10/0000755015075101507510000000000012012401134013653 5ustar lkajanlkajanprofphd-utils-1.0.10/AUTHORS0000644015075101507510000000164512012371464014745 0ustar lkajanlkajanAuthors: See details in source files. Reinhard Schneider Ulrike Goebel C. Sander Burkhard Rost Bug fixes and enhancements by Laszlo Kajan and Guy Yachdav Copyright: See details in source files. Copyright 1998-2011 by Burkhard Rost EMBL, CUBIC (Columbia University, NY, USA) and LION Biosciences (Heidelberg, DE) Copyright 1994 by C. Sander MPIMF (Heidelberg, DE) Copyright 2009-2011 by Guy Yachdav CUBIC (Columbia University, NY, USA), Technical University Munich (Munich, DE), Biosof LLC (USA) Copyright 2009-2011 by Laszlo Kajan Technical University Munich (Munich, DE) Copyright 1988,1991,1997 by Reinhard Schneider LION Biosciences (Heidelberg, DE) Copyright 1997 by Ulrike Goebel LION Biosciences (Heidelberg, DE) profphd-utils-1.0.10/COPYING0000644015075101507510000010451312012371465014727 0ustar lkajanlkajan GNU GENERAL PUBLIC LICENSE Version 3, 29 June 2007 Copyright (C) 2007 Free Software Foundation, Inc. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The GNU General Public License is a free, copyleft license for software and other kinds of works. The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users. We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things. To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others. For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it. For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software. For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions. Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users' freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users. Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free. The precise terms and conditions for copying, distribution and modification follow. TERMS AND CONDITIONS 0. Definitions. "This License" refers to version 3 of the GNU General Public License. "Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks. "The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations. To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work. A "covered work" means either the unmodified Program or a work based on the Program. To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well. To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying. An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion. 1. Source Code. The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work. A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language. The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it. The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work. The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source. The Corresponding Source for a work in source code form is that same work. 2. Basic Permissions. All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law. You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you. Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary. 3. Protecting Users' Legal Rights From Anti-Circumvention Law. No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures. When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures. 4. Conveying Verbatim Copies. You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program. You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee. 5. Conveying Modified Source Versions. You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions: a) The work must carry prominent notices stating that you modified it, and giving a relevant date. b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices". c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it. d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so. A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate. 6. Conveying Non-Source Forms. You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways: a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange. b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge. c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b. d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements. e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d. A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work. A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product. "Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM). The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network. Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying. 7. Additional Terms. "Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions. When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission. Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms: a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or d) Limiting the use for publicity purposes of names of licensors or authors of the material; or e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors. All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying. If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms. Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way. 8. Termination. You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11). However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation. Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice. Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10. 9. Acceptance Not Required for Having Copies. You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so. 10. Automatic Licensing of Downstream Recipients. Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License. An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts. You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it. 11. Patents. A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor's "contributor version". A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License. Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version. In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party. If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. "Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid. If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it. A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007. Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law. 12. No Surrender of Others' Freedom. If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program. 13. Use with the GNU Affero General Public License. Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such. 14. Revised Versions of this License. The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation. If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program. Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version. 15. Disclaimer of Warranty. THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. Limitation of Liability. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 17. Interpretation of Sections 15 and 16. If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see . Also add information on how to contact you by electronic and paper mail. If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode: Copyright (C) This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, your program's commands might be different; for a GUI interface, you would use an "about box". You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see . The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read . profphd-utils-1.0.10/ChangeLog0000644015075101507510000000167212012376034015445 0ustar lkajanlkajanprofphd-utils (1.0.10) unstable; urgency=low * Removed CONTACT and AVAILABLE fields from output. These were outdated and incorrect. -- Laszlo Kajan Tue, 14 Aug 2012 09:19:02 +0200 profphd-utils (1.0.9) unstable; urgency=low * -fbacktrace in make file * convert_seq.f: increased parameter MAX_NAME_LEN to 1024 * lib-maxhom.f and lib-convert.f: in SUBROUTINE GETCHAR(KCHAR,CHARARR,CTEXT): changed LINE from fixed value to: CHARACTER*(KCHAR) - KCHAR was not used in this SUB before. * fixed gfortran warnings, added LDFLAGS -- Laszlo Kajan Mon, 28 Nov 2011 18:43:44 +0100 profphd-utils (1.0.8) unstable; urgency=low * dash safe Makefile -- Laszlo Kajan Tue, 27 Sep 2011 18:40:48 +0200 profphd-utils (1.0.7) unstable; urgency=low * added man pages for convert_seq and filter_hssp -- Laszlo Kajan Tue, 27 Sep 2011 10:07:47 +0200 profphd-utils-1.0.10/Makefile0000644015075101507510000000430312012376042015324 0ustar lkajanlkajanPACKAGE := profphd-utils VERSION := 1.0.10 DISTDIR := $(PACKAGE)-$(VERSION) mandir := $(prefix)/share/man man1dir := $(mandir)/man1 ARCH = LINUX F77 = gfortran AM_FFLAGS := -O2 -fbounds-check -Wuninitialized -fbacktrace -g #AM_FFLAGS := $(AM_FFLAGS) -Wall -Wno-unused -Wtabs BINARIES=convert_seq filter_hssp MAN1 := convert_seq.1 filter_hssp.1 MANS := $(MAN1) all: $(BINARIES) $(MANS) %.1 : %.pod Makefile pod2man -c 'User Commands' -r "$(VERSION)" -name $(shell echo "$(basename $@)" | tr '[:lower:]' '[:upper:]') "$<" "$@" convert_seq filter_hssp : maxhom.common maxhom.param convert_seq : convert_seq.f lib-maxhom.f lib-sys-$(ARCH).f $(F77) $(CPPFLAGS) $(AM_FFLAGS) $(FFLAGS) $(LDFLAGS) -o $@ convert_seq.f lib-maxhom.f lib-sys-$(ARCH).f filter_hssp : filter_hssp.f lib-maxhom.f lib-sys-$(ARCH).f $(F77) $(CPPFLAGS) $(AM_FFLAGS) $(FFLAGS) $(LDFLAGS) -o $@ filter_hssp.f lib-maxhom.f lib-sys-$(ARCH).f #lib-maxhom.o : maxhom.common maxhom.param install: mkdir -p $(DESTDIR)$(prefix)/bin && \ cp $(BINARIES) $(DESTDIR)$(prefix)/bin/ mkdir -p $(DESTDIR)$(man1dir) && \ cp $(MAN1) $(DESTDIR)$(man1dir)/ clean: rm -f *.o convert_seq filter_hssp convert_seq.1 filter_hssp.1 dist: $(DISTDIR) tar -c -f - "$(DISTDIR)" | gzip -c >$(DISTDIR).tar.gz rm -rf $(DISTDIR) $(DISTDIR): distclean rm -rf $(DISTDIR) && mkdir -p $(DISTDIR) && \ rsync -avC \ --exclude /*-stamp \ --exclude .*.swp \ AUTHORS \ ChangeLog \ compile_macintel.csh \ compile.pl \ convert_seq.f \ convert_seq.pod \ COPYING \ dead.f \ filter_hssp.f \ filter_hssp.pod \ lib-convert.f \ lib-maxhom.f \ lib-maxhom-node-pvm3.f \ lib-metr.f \ lib-metr-sys.f \ lib-sys-LINUX.f \ long.msf \ Makefile \ maxhom.common \ maxhom.default \ maxhom.f \ maxhom.param \ metr2st_make.f \ $(PACKAGE).spec \ ReadMe \ ReadMe-linux \ ReadMe-mac \ $(DISTDIR)/; distclean: clean rm -rf\ $(DISTDIR) \ $(DISTDIR).tar.gz help: @echo "all*: convert_seq filter_hssp" @echo "convert_seq" @echo "filter_hssp" @echo "install" @echo @echo "VARIABLES" @echo "prefix - prefix all paths with 'prefix'" @echo "DESTDIR - installation directory prefix" .PHONY: all clean dist distclean help install profphd-utils-1.0.10/ReadMe0000644015075101507510000000440012012371464014745 0ustar lkajanlkajan*----------------------------------------------------------------------* * Burkhard Rost Aug, 1998 version 0.1 * * EMBL/LION http://www.embl-heidelberg.de/~rost/ * * D-69012 Heidelberg rost@embl-heidelberg.de * *----------------------------------------------------------------------* * ------------------------------ * * This directory contains: * * ------------------------------ * * * * - all FORTRAN tools needed : 1 convert_seq.f * * : 2 filter_hssp.f * * : 3 maxhom.f * * - FORTRAN include files : 1 maxhom.para * * : 2 maxhom.common * * - all FORTRAN libs : * * convert_seq : lib-maxhom.f + lib-sys-ARCH.f * * filter_hssp : lib-maxhom.f + lib-sys-ARCH.f * * maxhom : lib-maxhom.f + lib-sys-ARCH.f * * : + lib-maxhom-node-pvm3.f * * - general perl make script : (compile.pl) * * run this script with no argument for help! * * - make-files : (mat/make_*) * * * * ------------------------------ * * to DO for compile: * * ------------------------------ * * * * compile.pl -> gives help * * compile.pl auto -> DOES it.. * * * * * *----------------------------------------------------------------------* profphd-utils-1.0.10/ReadMe-linux0000644015075101507510000000067312012371464016112 0ustar lkajanlkajan#on oak /usr/pub/pgi/linux86/bin/pgf77 -o maxhom.LINUX -O3 -C maxhom.f lib-sys-LINUX.f lib-maxhom.f lib-maxhom-node-pvm3.f /usr/pub/pgi/linux86/bin/pgf77 -o filter_hssp.LINUX -O3 -C filter_hssp.f lib-sys-LINUX.f lib-maxhom.f /usr/pub/pgi/linux86/bin/pgf77 -o convert_seq.LINUX -O3 -C convert_seq.f lib-sys-LINUX.f lib-convert.f /usr/pub/pgi/linux86/bin/pgf77 -o profile_make.LINUX -O3 -C profile_make.f lib-sys-LINUX.f lib-profile-make.f profphd-utils-1.0.10/ReadMe-mac0000644015075101507510000000232112012371465015504 0ustar lkajanlkajan# gnu gfortran compiler - only on INTEL MAC /usr/local/bin/gfortran -o maxhom.MACINTEL -O3 -C maxhom.f lib-sys-MAC.f lib-maxhom.f lib-maxhom-node-pvm3.f /usr/local/bin/gfortran -o convert_seq.MACINTEL -O3 -C convert_seq.f lib-sys-MAC.f lib-convert.f /usr/local/bin/gfortran -o filter_hssp.MACINTEL -O3 -C filter_hssp.f lib-sys-MAC.f lib-maxhom.f /usr/local/bin/gfortran -o profile_make.MACINTEL -O3 -C profile_make.f lib-sys-MAC.f lib-profile-make.f # gnu g77 compiler g77 -o maxhom.MAC -O3 -C maxhom.f lib-sys-MAC.f lib-maxhom.f lib-maxhom-node-pvm3.f g77 -o convert_seq.MAC -O3 -C convert_seq.f lib-sys-MAC.f lib-convert.f g77 -o filter_hssp.MAC -O3 -C filter_hssp.f lib-sys-MAC.f lib-maxhom.f g77 -o profile_make.MAC -O3 -C profile_make.f lib-sys-MAC.f lib-profile-make.f # IBM compiler setenv PATH /opt/ibmcmp/xlf/8.1/bin:/usr/pub/bin:/usr/local/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/etc:/sbin:/usr/local/bin /opt/ibmcmp/xlf/8.1/bin/f77 -o maxhom.IBM -O3 -C maxhom.f lib-sys-MACIBM.f lib-maxhom.f lib-maxhom-node-pvm3.f /opt/ibmcmp/xlf/8.1/bin/f77 -o convert_seq.IBM -O3 -C convert_seq.f lib-sys-MACIBM.f lib-convert.f /opt/ibmcmp/xlf/8.1/bin/f77 -o filter_hssp.IBM -O3 -C filter_hssp.f lib-sys-MACIBM.f lib-maxhom.f profphd-utils-1.0.10/compile.pl0000755015075101507510000001574712012371464015675 0ustar lkajanlkajan#!/bin/env perl ##!/usr/local/bin/perl -w # # $scrName=$0;$scrName=~s/^.*\/|\.pl//g; $scrGoal="compile PHD FORTRAN tools programs (convert_seq, filter_hssp)"; #$scrGoal="compile PHD FORTRAN tools programs (convert_seq, filter_hssp, maxhom ?)"; # # $[ =1 ; $dirMake= "mat/"; # for PHD $fileMake{"convert"}="make_convert_seq.ARCH"; $fileMake{"filter"}= "make_filter_hssp.ARCH"; # for TOPITS $fileMake{"metr2st"}="make_metr2st_make.ARCH"; # for MaxHom $fileMake{"maxhom"}= "make_maxhom.ARCH"; $fileMake{"profile"}="make_profile_make.ARCH"; $exe{"convert"}= "convert_seq.". "ARCH"; $exe{"filter"}= "filter_hssp.". "ARCH"; $exe{"metr2st"}= "metr2st_make.". "ARCH"; $exe{"maxhom"}= "maxhom.". "ARCH"; $exe{"profile"}= "profile_make.". "ARCH"; $LnotMaxhome= 1; #$LnotMaxhome= 0; @all=("convert","filter","metr2st","maxhom","profile"); # ------------------------------ if ($#ARGV<1){ # help print "goal: $scrGoal\n"; print "use: '$scrName '\n"; print "opt: (pass the following arguments like '$scrName arg=val')\n"; # 'keyword' 'value' 'description' printf " %-12s=%-10s %-22s %-s\n","." x 10, "." x 10, "." x 15, "." x 20; printf " %-12s=%-10s %-22s %-s\n","keyword", "value", "default", "explanation"; printf " %-12s=%-10s %-22s %-s\n","." x 10, "." x 10, "." x 15, "." x 20; printf " %-12s=%-10s %-22s %-s\n","ARCH", "ALPHA", "", "system arch: SGI64|SGI32|SGI5|ALPHA|SUNMP"; printf " %-12s=%-10s %-22s %-s\n","dir", "mat",$dirMake, "directory with make file"; foreach $kwd (@all){ next if ($LnotMaxhom && $kwd eq "maxhom"); printf " %-12s=%-10s %-22s %-s\n","exe_".$kwd,"x", $exe{$kwd}, "name of executable"; } foreach $kwd (@all){ next if ($LnotMaxhom && $kwd eq "maxhom"); printf " %-12s=%-10s %-22s %-s\n","make_".$kwd,"x", $fileMake{$kwd},"make file";} foreach $kwd (@all){ next if ($LnotMaxhom && $kwd eq "maxhom"); printf " %-12s %-10s %-22s %-s\n",$kwd, "no value", "1","compiles $kwd";} exit; } foreach $kwd (@all) { $do{$kwd}=0;} $Lopted=0; # ------------------------------ # read command line foreach $arg (@ARGV){ next if ($arg eq $ARGV[1] && $arg =~ /^(auto|do)$/); if ($arg=~/^ARCH=(.*)$/i) { $ARCH=$1;} elsif ($arg=~/^make_(.*)=(.*)$/) { $fileMake{$1}=$2;} elsif ($arg=~/^exe_(.*)=(.*)$/) { $exe{$1}=$2;} elsif ($arg=~/^conv(ert)?$/) { $do{"convert"}=1; $Lopted=1;} elsif ($arg=~/^fil(ter)?$/) { $do{"filter"}= 1; $Lopted=1;} elsif ($arg=~/^metr(2st)?$/) { $do{"metr2st"}=1; $Lopted=1;} elsif ($arg=~/^max(hom)?$/) { $do{"maxhom"}= 1; $Lopted=1;} elsif ($arg=~/^prof(ile)?$/) { $do{"profile"}=1; $Lopted=1;} elsif ($arg=~/^dir=(.*)$/) { $dirMake=$1;} # elsif ($arg=~/^=(.*)$/){ $=$1;} else { print "*** ERROR $scrName: wrong command line arg '$arg'\n"; die;}} $ARCH= $ARCH || $ENV{'ARCH'}; $dirMake.="/" if ($dirMake !~/\// && length($dirMake)>=1); if (! defined $ARCH) { $ansr= &get_in_keyboardLoc("ARCH","",$scrName); $ARCH=$ansr; } if ($ARCH !~ /ALPHA|SGI(64|32|5)|SUNMP|SUN4SOL|LINUX/){ print "--- $scrName: ARCH must be either of the following :\n"; print "--- "." " x length($scrName)." ALPHA|SGI64|SGI32|SGI5|SUNMP|SUN4SOL\n"; $ansr= &get_in_keyboardLoc("ARCH","",$scrName); $ARCH=$ansr; } if ($ARCH !~ /ALPHA|SGI(64|32|5)|SUNMP|SUN4SOL|LINUX/){ print "*** ERROR $scrName: ARCH really must be either of the following :\n"; print "*** "." " x length($scrName)." ALPHA|SGI64|SGI32|SGI5|SUNMP|SUN4SOL\n"; die; } # default: do all 3 compilations if (! $Lopted){ foreach $kwd (@all){ next if ($LnotMaxhom && $kwd eq "maxhom"); $do{$kwd}=1;}} foreach $kwd (@all) { next if (! $do{$kwd}); $fileMake{$kwd}=~s/ARCH/$ARCH/ if ($fileMake{$kwd} =~ /ARCH/); $fileMake{$kwd}=$dirMake.$fileMake{$kwd} if ($fileMake{$kwd} !~ /$dirMake/); $fileMake=$fileMake{$kwd}; undef $exeOut; if (defined $exe{$kwd}) { $exeOut= $exe{$kwd}; $exeOut=~s/ARCH/$ARCH/ if ($exeOut=~/ARCH/);} $exeDef= "convert_seq.". $ARCH if ($kwd eq "convert"); $exeDef= "filter_hssp.". $ARCH if ($kwd eq "filter"); $exeDef= "maxhom." . $ARCH if ($kwd eq "maxhom"); $exeDef= "metr2st_make.".$ARCH if ($kwd eq "metr2st"); $exeDef= "profile_make.".$ARCH if ($kwd eq "profile"); # local copy of make file $fileMakeTmp="make_".$kwd."_tmp_".$$.".".$ARCH; print "--- $scrName: system '\\cp $fileMake $fileMakeTmp '\n"; system("\\cp $fileMake $fileMakeTmp"); # compile print "--- $scrName: system 'make -f $fileMakeTmp'\n"; system("make -f $fileMakeTmp"); if (-e $exeDef && defined $exeOut && ($exeDef ne $exeOut)) { print "--- $scrName: system '\\mv $exeDef $exeOut'\n"; system("\\mv $exeDef $exeOut"); $exeDef=$exeOut; } print "--- $scrName: expected executable: $exeDef\n" if (-e $exeDef); if (! -e $exeDef) { print "*** ERROR $scrName: never made '$exeDef'\n"; print "*** keyword assigned to executable?? check code!\n";} unlink($fileMakeTmp); } exit; #=============================================================================== sub get_in_keyboardLoc { local($des,$def,$pre,$Lmirror)=@_;local($txt); #-------------------------------------------------------------------------------- # get_in_keyboardLoc gets info from keyboard # in: $des : keyword to get # in: $def : default settings # in: $pre : text string beginning screen output # default '--- ' # in: $Lmirror: if true, the default is mirrored # out: $val : value obtained #-------------------------------------------------------------------------------- $pre= "---" if (! defined $pre); $Lmirror=0 if (! defined $Lmirror || ! $Lmirror); $txt=""; # ini printf "%-s %-s\n", $pre,"-" x (79 - length($pre)); printf "%-s %-15s:%-s\n", $pre,"type value for",$des; if (defined $def){ printf "%-s %-15s:%-s\n",$pre,"type RETURN to enter value, or to keep default"; printf "%-s %-15s>%-s\n",$pre,"default value",$def;} else { printf "%-s %-15s>%-s\n",$pre,"type RETURN to enter value"; } $txt=$def if ($Lmirror); # mirror it printf "%-s %-15s>%-s", $pre,"type",$txt; while(){ $txt.=$_; last if ($_=~/\n/);} $txt=~s/^\s+|\s+$//g; $txt=$def if (length($txt) < 1); printf "%-s %-15s>%-s\n", $pre,"--> you chose",$txt; return ($txt); } # end of get_in_keyboardLoc profphd-utils-1.0.10/compile_macintel.csh0000755015075101507510000000203612012371464017676 0ustar lkajanlkajan echo "--- now Maxhom: " echo "/usr/local/bin/gfortran -o maxhom.MACINTEL -O3 -C maxhom.f lib-sys-MAC.f lib-maxhom.f lib-maxhom-node-pvm3.f " /usr/local/bin/gfortran -o maxhom.MACINTEL -O3 -C maxhom.f lib-sys-MAC.f lib-maxhom.f lib-maxhom-node-pvm3.f echo "--- now convert_seq: " echo "/usr/local/bin/gfortran -o convert_seq.MACINTEL -O3 -C convert_seq.f lib-sys-MAC.f lib-convert.f" /usr/local/bin/gfortran -o convert_seq.MACINTEL -O3 -C convert_seq.f lib-sys-MAC.f lib-convert.f echo "--- now filter_hssp:" echo "/usr/local/bin/gfortran -o filter_hssp.MACINTEL -O3 -C filter_hssp.f lib-sys-MAC.f lib-maxhom.f" /usr/local/bin/gfortran -o filter_hssp.MACINTEL -O3 -C filter_hssp.f lib-sys-MAC.f lib-maxhom.f echo "--- exit before profile_make! " echo "--- " echo "--- stay cool IS ok!!" echo "--- " exit echo "--- now profile_make:" echo "/usr/local/bin/gfortran -o profile_make.MACINTEL -O3 -C profile_make.f lib-sys-MAC.f lib-profile-make.f" /usr/local/bin/gfortran -o profile_make.MACINTEL -O3 -C profile_make.f lib-sys-MAC.f lib-profile-make.f profphd-utils-1.0.10/convert_seq.f0000644015075101507510000025632312012371465016402 0ustar lkajanlkajan*----------------------------------------------------------------------* * * * FORTRAN code for program CONVERT_SEQ * * conversion of sequence and alignment formats * * * *----------------------------------------------------------------------* * * * Authors: * * * * Reinhard Schneider Mar, 1991 version 1.0 * * Ulrike Goebel Mar, 1997 version 1.1 * * Reinhard Schneider Mar, 1997 version 2.0 * * LION http://www.lion-ag/ * * D-69120 Heidelberg schneider@lion-ag.de * * * * Burkhard Rost May, 1998 version 2.1 * * Oct, 1998 version 2.2 * * * * EMBL/LION http://www.embl-heidelberg.de/~rost/ * * D-69012 Heidelberg rost@embl-heidelberg.de * * * *----------------------------------------------------------------------* * * * General note: - uses library lib-maxhom.f * * * * * *----------------------------------------------------------------------* PROGRAM CONVERT_SEQ IMPLICIT NONE C---- C---- parameters C---- INTEGER MAXRES,MAXCHAIN,MAXINS,MAXINSBUF,IN,OUT, + CODELEN,MAXALIGNS,MAXCORE,MAXAA,MAX_NAME_LEN PARAMETER (MAXRES= 9999) C PARAMETER (MAXRES= 30011) PARAMETER (MAXALIGNS= 8765) C PARAMETER (MAXALIGNS= 19999) C PARAMETER (MAXINS= 50000) C PARAMETER (MAXINSBUF= 200000) PARAMETER (MAXINS= 100000) PARAMETER (MAXINSBUF= 1000000) PARAMETER (MAXCORE= 30303030) C PARAMETER (MAXCORE= 3213213) C PARAMETER (MAXCORE= 2888888) PARAMETER (MAXCHAIN= 50) PARAMETER (CODELEN= 40) PARAMETER (MAXAA= 20) PARAMETER (MAX_NAME_LEN= 1024) C only used to get rid of INDEX command (CPU time) INTEGER NASCII PARAMETER (NASCII= 256) C files PARAMETER (IN= 12) PARAMETER (OUT= 14) C---- br 2003-08: switch for new (1999) HSSP curve LOGICAL LNEWCURVE PARAMETER (LNEWCURVE= .TRUE.) C PARAMETER (LNEWCURVE= .FALSE.) INTEGER ISTART,ISTOP,NBLOCKS,NSYMBOLS,NBREAKS,NREAD, + IALIGN, CHANGEPOS,NCHAIN,KCHAIN,NRES,CALLS, + NTRIALS,NRESTMP INTEGER RANGE(2),CHBPOS(MAXCHAIN-1) CHARACTER*1 C,CGAPCHAR,CGAPCHARPIR CHARACTER*(4*MAXCHAIN) CCHAIN CHARACTER*(MAXRES) SEQ, STRUC CHARACTER*(MAX_NAME_LEN) FILENAME,BASENAME,OUTNAME,TEMPNAME, + VARMETRIC CHARACTER*130 QUESTION CHARACTER ACCNUM*80,PDBREF*10,TRANS*26,INFORMAT*4, + OUTFORMAT*1 LOGICAL LFILTER,LINSERT,LDELETE,LDOEXP, LDOCLIP, + TRUNCATED, ERROR,MULTFORMAT, DO_INIT_INSBUF C MSFTOSEQ C pointer arrays: ifir(i) begin of PDBSEQ against alignment i INTEGER IFIR(MAXALIGNS),ILAS(MAXALIGNS), + ALIPOINTER(MAXALIGNS), + NALIGN, ALILEN,MSFCHECK,SEQCHECK(MAXALIGNS) CHARACTER*(CODELEN) PDBNAME CHARACTER ALISEQ(MAXCORE) CHARACTER*1 TYPE REAL WEIGHT(MAXALIGNS) C READHSSP LOGICAL LCONSERV,LOLDVERSION,LHSSP_LONG_ID C attributes of sequence with known structure CHARACTER PDBSEQ(MAXRES),CHAINID(MAXRES), + SECSTR_HSSP(MAXRES),SHEETLABEL(MAXRES) CHARACTER*7 COLS(MAXRES) CHARACTER*12 PDBID CHARACTER*40 HEADER CHARACTER*400 COMPND,SOURCE,AUTHOR,CHAINREMARK INTEGER PDBNO(MAXRES),BP1(MAXRES),BP2(MAXRES),ACC(MAXRES) C attributes of aligned sequences C flags (take ' ', else other) CHARACTER EXCLUDEFLAG(MAXALIGNS) CHARACTER*5 STRID(MAXALIGNS) CHARACTER*10 ACCNUM_HSSP(MAXALIGNS) CHARACTER*40 EMBLID(MAXALIGNS) CHARACTER*60 PROTNAME(MAXALIGNS) INTEGER JFIR(MAXALIGNS),JLAS(MAXALIGNS),LALI(MAXALIGNS), + NGAP(MAXALIGNS),LGAP(MAXALIGNS),LENSEQ(MAXALIGNS), + INSNUMBER,INSALI(MAXINS),INSPOINTER(MAXINS), + INSLEN(MAXINS),INSBEG_1(MAXINS),INSBEG_2(MAXINS) REAL IDE(MAXALIGNS),SIM(MAXALIGNS) CHARACTER INSBUFFER(MAXINSBUF) C attributes of profile INTEGER VAR(MAXRES),SEQPROF(MAXRES,MAXAA),NOCC(MAXRES), + NDEL(MAXRES),NINS(MAXRES),RELENT(MAXRES) REAL ENTROPY(MAXRES) REAL CONSWEIGHT(MAXRES),CONSWEIGHT_MIN C threshold INTEGER I C---- C---- C---- end of settings C---- ------------------------------------------------------------------ C---- C---- ------------------------------------------------------------------ C initialisation NTRIALS= 0 CALLS= 1 CGAPCHAR= '.' CGAPCHARPIR= '-' OUTNAME= ' ' LCONSERV= .FALSE. LOLDVERSION= .FALSE. LHSSP_LONG_ID= .FALSE. VARMETRIC= '/home/rost/pub/max/mat/Maxhom_GCG.metric' DO_INIT_INSBUF=.TRUE. TRANS= 'VLIMFWYGAPSTCHRKQENDBZX!-' TEMPNAME= ' ' INSNUMBER= 0 CONSWEIGHT_MIN=0.01 C----------------------------------------------------------------------- c tempname = '/data/hssp/1acx.hssp' 1 CALL GETCHAR(MAX_NAME_LEN, TEMPNAME, + ' Name of input sequence file :') C----------------------------------------------------------------------- CALL EXTRACT_CHAINS(TEMPNAME,FILENAME,MAXCHAIN,CCHAIN) CALL CHECKFORMAT(IN,FILENAME,INFORMAT,ERROR) IF ( ERROR ) THEN IF ( NTRIALS .LT. 3 ) THEN NTRIALS = NTRIALS + 1 GOTO 1 ELSE STOP 'NO VALID SEQUENCE FILE GIVEN ! ' ENDIF ENDIF CALL GETPIDCODE(FILENAME,BASENAME) C----------------------------------------------------------------------- c...read sequence(s) MULTFORMAT = .FALSE. CALL STRPOS(INFORMAT,ISTART,ISTOP) IF ( INFORMAT(ISTART:ISTOP) .EQ. 'HSSP' ) THEN MULTFORMAT = .TRUE. CALL READHSSP(IN,FILENAME,ERROR,MAXRES,MAXALIGNS, + MAXCORE,MAXINS,MAXINSBUF,PDBID,HEADER,COMPND, + SOURCE,AUTHOR,ALILEN,NCHAIN,KCHAIN,CHAINREMARK, + NALIGN,EXCLUDEFLAG,EMBLID,STRID,IDE,SIM,IFIR, + ILAS,JFIR,JLAS,LALI,NGAP,LGAP,LENSEQ,ACCNUM_HSSP, + PROTNAME,PDBNO,PDBSEQ,CHAINID,SECSTR_HSSP, + COLS,SHEETLABEL,BP1,BP2,ACC,NOCC,VAR,ALISEQ, + ALIPOINTER,SEQPROF,NDEL,NINS,ENTROPY,RELENT, + CONSWEIGHT,INSNUMBER,INSALI,INSPOINTER,INSLEN, + INSBEG_1,INSBEG_2,INSBUFFER,LCONSERV, + LHSSP_LONG_ID) IF (ERROR) THEN WRITE(6,*)' *** ERROR reading HSSP-file:',FILENAME STOP '*** ERROR reading HSSP-file' GOTO 1 ENDIF ALILEN = ALILEN + KCHAIN - 1 CALL MARK_DUPLICATES(EMBLID,NALIGN) ELSE IF ( INFORMAT(ISTART:ISTOP) .EQ. 'MSF' ) THEN MULTFORMAT = .TRUE. CALL READ_MSF(IN,FILENAME,MAXALIGNS,MAXCORE,ALISEQ, 1 ALIPOINTER,IFIR,ILAS,JFIR,JLAS,TYPE,EMBLID, 2 WEIGHT,SEQCHECK,MSFCHECK,ALILEN,NALIGN,ERROR) IF (ERROR) THEN WRITE(*,*)'*** ERROR READING MSF-FILE:',FILENAME GOTO 1 STOP '*** ERROR reading MSF-file' ENDIF ELSE SEQ = ' ' CALL GET_SEQ(IN,FILENAME,TRANS,CCHAIN, + COMPND,ACCNUM,PDBREF,PDBNO,NRES,SEQ, + STRUC,ACC,TRUNCATED,ERROR) IF ( ERROR ) THEN WRITE(6,*)'*** ERROR in GET_SEQ for file=',FILENAME STOP ENDIF ENDIF C-------------------------------------------------------------------- RANGE(1) = 1 IF ( MULTFORMAT ) THEN RANGE(2) = ALILEN ELSE RANGE(2) = NRES CALL CHAINBREAKPOS( SEQ,RANGE(1),RANGE(2),MAXCHAIN-1, 1 NBREAKS,CHBPOS, ERROR ) IF ( ERROR ) STOP ENDIF C-------------------select output format------------------------------ C return label for next choice of format 2 CONTINUE IF ( MULTFORMAT ) THEN OUTFORMAT = 'm' CALL GETCHAR (1,OUTFORMAT, + 'Output-format? [*m*]: m = MSF/n' // + ' h = HSSP/n' // + ' p = PIR/n' // + ' f = FASTA/Pearson/n' // + ' d = DAF/n' // + ' y = PHYLIP ' ) ELSE OUTFORMAT = 'g' CALL GETCHAR (1,OUTFORMAT, 1 'Output-format? [*g*]: g = GCG/n' // 2 ' a = Alb/n' // 3 ' k = Klein (old-NBRF-format)/n'// 4 ' p = PIR/n' // 5 ' e = EMBL/SWISS/n' // 6 ' s = star/n' // 7 ' f = FASTA/Pearson ' ) ENDIF CALL LOWTOUP(OUTFORMAT,1) C----------------------------------------------------------------------- IF ( OUTFORMAT .EQ. 'H' ) THEN CALL GETCHAR(MAX_NAME_LEN,VARMETRIC, 1 ' use which metric file for variability calculation ? ' ) CALL STRPOS(INFORMAT,ISTART,ISTOP) IF ( INFORMAT(ISTART:ISTOP) .EQ. 'MSF' ) THEN C = 'N' C LDOCLIP=.FALSE. CALL GETCHAR(1,C, 1 ' treat gaps in master sequence as insertions ? ') CALL LOWTOUP(C,1) C IF (C .EQ. 'Y') LDOCLIP=.TRUE. LDOCLIP = C .EQ. 'Y' ENDIF ELSE c...select residue range RANGE(1) = 1 IF ( MULTFORMAT ) THEN RANGE(2)= ALILEN QUESTION= 1 'Do you want to select a fragment of this alignment?[*N*]' ELSE RANGE(2)= NRES QUESTION= 1 'Do you want to select a fragment of this protein?[*N*]' ENDIF NRESTMP=RANGE(2) CALL STRPOS(INFORMAT,ISTART,ISTOP) C = 'N' CALL GETCHAR(1,C,QUESTION) CALL LOWTOUP(C,1) IF ( C .NE. 'N' .AND. C .NE. ' ') THEN CALL GETINT 1 (2,RANGE, 2 ' Enter residue numbers of fragment start and stop: ') C---- br 98.05: correct for too high IF (RANGE(2).GT.NRESTMP) THEN RANGE(2)= MIN(RANGE(2),NRESTMP) WRITE(6,'(A,I5)')' *** oops, 2nd number too large, '// + 'corrected to: ',RANGE(2) END IF IF (RANGE(1).GT.NRESTMP) THEN RANGE(1)= 1 WRITE(6,'(A,I5)')' *** oops, 1st number too large, '// + 'corrected to: ',RANGE(1) END IF IF (RANGE(1).EQ.RANGE(2)) THEN WRITE(6,'(A,I5,A3,I5)')' *** oops you wanted: ', + RANGE(1),' - ',RANGE(2) RANGE(1)=1 RANGE(2)=NRESTMP WRITE(6,'(A,I5,A3,I5)')' *** was deemed strange and'// + ' corrected to: ',RANGE(1),' - ',RANGE(2) END IF ENDIF ENDIF C----------------------------------------------------------------------- c...make up a default name for outfile CALL STRPOS(BASENAME,ISTART,ISTOP) IF ( OUTFORMAT .EQ. 'M' ) THEN OUTNAME = BASENAME(ISTART:ISTOP) // '.msf' ELSE IF ( OUTFORMAT .EQ. 'H' ) THEN OUTNAME = BASENAME(ISTART:ISTOP) // '.hssp' ELSE IF ( OUTFORMAT .EQ. 'D' ) THEN OUTNAME = BASENAME(ISTART:ISTOP) // '.daf' ELSE IF ( OUTFORMAT .EQ. 'P' ) THEN OUTNAME = BASENAME(ISTART:ISTOP) // '.pir' C else if ( outformat .eq. 'F' ) then C outname = basename(istart:istop) // '.pearson' ELSE IF ( OUTFORMAT .EQ. 'F' ) THEN OUTNAME = BASENAME(ISTART:ISTOP) // '.f' ELSE IF ( OUTFORMAT .EQ. 'S' ) THEN OUTNAME = BASENAME(ISTART:ISTOP) // '.star' ELSE IF ( OUTFORMAT .EQ. 'A' ) THEN OUTNAME = BASENAME(ISTART:ISTOP) // '.alb' ELSE IF ( OUTFORMAT .EQ. 'K' ) THEN OUTNAME = BASENAME(ISTART:ISTOP) // '.klein' ELSE IF ( OUTFORMAT .EQ. 'E' ) THEN OUTNAME = BASENAME(ISTART:ISTOP) // '.embl' ELSE IF ( OUTFORMAT .EQ. 'G' ) THEN OUTNAME = BASENAME(ISTART:ISTOP) // '.gcg' C else if ( outformat .eq. 'Y' ) then C outname = basename(istart:istop) // '.phylip' ELSE IF ( OUTFORMAT .EQ. 'Y' ) THEN OUTNAME = BASENAME(ISTART:ISTOP) // '.phy' ENDIF CALL GETCHAR(MAX_NAME_LEN,OUTNAME,' output file name ?' ) C----------------------------------------------------------------------- CALL OPEN_FILE(OUT,OUTNAME,'new',error) IF ( ERROR ) STOP C----------------------------------------------------------------------- C---- C---- -------------------------------------------------- C---- alignment formats C---- -------------------------------------------------- C---- IF ( MULTFORMAT ) THEN IF ( CALLS .EQ. 1 ) THEN CALL STRPOS(INFORMAT,ISTART,ISTOP) IF ( INFORMAT(ISTART:ISTOP) .EQ. 'HSSP' ) THEN C msftoseq does not care about lowercase letters ! DO I=1,ALILEN IF(PDBSEQ(I) .EQ. '!')THEN PDBSEQ(I)= CGAPCHAR ENDIF ENDDO c call CHARARRAYREPL(pdbseq,alilen,'!',cgapchar) CALL DSSP_NOTATION_TO_CYS(PDBSEQ,ALILEN) C pdbseq will be "changepos"th sequence in aliseq PDBNAME = PDBID CALL LEFTADJUST(PDBNAME,1,CODELEN) CHANGEPOS = 1 LINSERT = .TRUE. LDELETE = .FALSE. CALL CHANGE_ALISEQ(MAXALIGNS,MAXCORE,MAXINS,MAXRES, + LINSERT,LDELETE,CHANGEPOS,NALIGN, 1 PDBSEQ,1,ALILEN,PDBNAME, 2 ALILEN,ALISEQ,ALIPOINTER,IFIR,ILAS,JFIR,JLAS, 3 LALI,NGAP,LGAP,LENSEQ,STRID,IDE,SIM,EXCLUDEFLAG, 4 ACCNUM_HSSP,EMBLID,PROTNAME,INSALI,INSNUMBER) IF (OUTFORMAT .NE. 'D')THEN DO IALIGN = 1,NALIGN WEIGHT(IALIGN) = 1.0 ENDDO ENDIF ENDIF ENDIF IF ( OUTFORMAT .EQ. 'P' ) THEN NSYMBOLS = 30 DO IALIGN = 1,NALIGN CALL GET_SEQ_FROM_ALISEQ(ALISEQ,IFIR,ILAS, 1 ALIPOINTER,ALILEN,IALIGN,SEQ,NREAD, 2 ERROR ) IF ( ERROR ) STOP CALL STRREPLACE(SEQ,ALILEN,CGAPCHAR,CGAPCHARPIR) CALL WRITE_PIR(OUT,SEQ,FILENAME,OUTNAME,ACCNUM, 1 EMBLID(IALIGN),NSYMBOLS,RANGE(1), 2 RANGE(2),ERROR) IF ( ERROR ) STOP ENDDO C---- C---- out HSSP C---- ELSE IF ( OUTFORMAT .EQ. 'H' ) THEN C even if chains are not regularily counted, dont have "nchain=0" C .... in the header! NCHAIN = 1 CALL MAKE_HSSP(MAXRES,MAXALIGNS,MAXCORE, + MAXAA,MAXINS,MAXINSBUF, + OUT,OUTNAME,FILENAME,PDBID,HEADER, + COMPND,SOURCE,AUTHOR,ALILEN,NCHAIN,KCHAIN, + CHAINREMARK,NALIGN,EXCLUDEFLAG,EMBLID,STRID, + IDE,SIM,IFIR,ILAS,JFIR,JLAS,LALI,NGAP,LGAP, + LENSEQ,ACCNUM_HSSP,PROTNAME,PDBNO,PDBSEQ,CHAINID, + SECSTR_HSSP,COLS,SHEETLABEL,BP1,BP2,ACC,NOCC, + VAR,ALISEQ,ALIPOINTER,SEQPROF,NDEL,NINS,ENTROPY, + RELENT,CONSWEIGHT_MIN,CONSWEIGHT,LCONSERV,LOLDVERSION, + CGAPCHAR,LFILTER,VARMETRIC,LDOCLIP, + INSNUMBER,INSALI,INSPOINTER,INSLEN,INSBEG_1, + INSBEG_2,INSBUFFER,DO_INIT_INSBUF,ERROR, $ LHSSP_LONG_ID) IF ( ERROR ) STOP ELSE IF ( OUTFORMAT .EQ. 'Y' ) THEN NBLOCKS = 5 CALL WRITE_PHYLIP(OUT,MAXALIGNS,MAXCORE,RANGE(1), 1 RANGE(2),NBLOCKS,ALISEQ,ALIPOINTER, 2 IFIR,ILAS,EMBLID,NALIGN,ERROR) IF ( ERROR ) STOP ELSE IF ( OUTFORMAT .EQ. 'M' ) THEN TYPE = 'P' NBLOCKS = 5 CALL STRPOS(INFORMAT,ISTART,ISTOP) IF ( INFORMAT(ISTART:ISTOP) .EQ. 'HSSP' ) THEN C = 'N' CALL GETCHAR(1,C, 1 ' Expand sequences according to HSSP insertion list ? ') CALL LOWTOUP(C,1) LDOEXP = C .EQ. 'Y' ELSE LDOEXP = .FALSE. ENDIF C call write_msf(out,filename,outname,MAXALIGNS,MAXRES, CALL WRITE_MSF(OUT,FILENAME,OUTNAME,MAXALIGNS,MAXRES, 1 MAXCORE,MAXINS,MAXINSBUF,RANGE(1),RANGE(2),NBLOCKS, 2 ALISEQ,ALIPOINTER,IFIR,ILAS,TYPE,EMBLID,WEIGHT, 3 SEQCHECK,MSFCHECK,ALILEN,NALIGN,INSNUMBER,INSALI, 4 INSPOINTER,INSLEN,INSBEG_1,INSBUFFER, 5 LDOEXP,ERROR) IF ( ERROR ) STOP ELSE IF ( OUTFORMAT .EQ. 'D' ) THEN TYPE = 'P' NBLOCKS = 1 CALL STRPOS(INFORMAT,ISTART,ISTOP) IF ( INFORMAT(ISTART:ISTOP) .EQ. 'HSSP' ) THEN LDOEXP = .TRUE. ENDIF CALL WRITE_DAF(OUT,FILENAME,OUTNAME,MAXALIGNS,MAXRES, 1 MAXCORE,MAXINS,MAXINSBUF,RANGE(1),RANGE(2), 2 ALISEQ,ALIPOINTER,IDE,IFIR,ILAS,EMBLID,WEIGHT, 3 ALILEN,NALIGN,LALI,INSNUMBER,INSALI, 4 INSPOINTER,INSLEN,INSBEG_1,INSBUFFER, 5 ERROR) IF ( ERROR ) STOP ELSE IF ( OUTFORMAT .EQ. 'F' ) THEN NBLOCKS = 6 DO IALIGN = 1,NALIGN CALL GET_SEQ_FROM_ALISEQ(ALISEQ,IFIR,ILAS, 1 ALIPOINTER,ALILEN,IALIGN,SEQ,NREAD, 2 ERROR ) IF ( ERROR ) STOP CALL WRITE_PEARSON(OUT,OUTNAME,SEQ,NBLOCKS, 1 EMBLID(IALIGN),COMPND,RANGE(1), 2 RANGE(2),ERROR) IF ( ERROR ) STOP ENDDO ELSE WRITE(*,*)' wat is tipp wat richtiges, bloedmann' C GOTO 2 STOP ENDIF ELSE IF ( OUTFORMAT .EQ. 'S' ) THEN NBLOCKS = 7 CALL WRITE_STAR(OUT,SEQ,NBLOCKS,FILENAME,OUTNAME, 1 COMPND,RANGE(1),RANGE(2),ERROR) IF ( ERROR ) STOP ELSE IF ( OUTFORMAT .EQ. 'A' ) THEN NBLOCKS = 5 CALL WRITE_ALB(OUT,OUTNAME,SEQ,NBLOCKS,COMPND,FILENAME, 1 RANGE(1),RANGE(2),CHBPOS,NBREAKS,ERROR) IF ( ERROR ) STOP ELSE IF ( OUTFORMAT .EQ. 'K' ) THEN NBLOCKS = 6 CALL WRITE_KLEIN(OUT,SEQ,NBLOCKS,BASENAME,FILENAME, 1 OUTNAME,COMPND,RANGE(1),RANGE(2), 2 ERROR) IF ( ERROR ) STOP ELSE IF ( OUTFORMAT .EQ. 'P' ) THEN NSYMBOLS = 30 CALL STRREPLACE(SEQ,NRES,CGAPCHAR,CGAPCHARPIR) CALL WRITE_PIR(OUT,SEQ,FILENAME,OUTNAME,ACCNUM,COMPND, 1 NSYMBOLS,RANGE(1),RANGE(2),ERROR) IF ( ERROR ) STOP ELSE IF ( OUTFORMAT .EQ. 'E' ) THEN NBLOCKS = 6 CALL WRITE_EMBL(OUT,SEQ,NBLOCKS,FILENAME,OUTNAME,COMPND, 1 RANGE(1),RANGE(2),ERROR) IF ( ERROR ) STOP ELSE IF ( OUTFORMAT .EQ. 'G' ) THEN NBLOCKS = 5 CALL WRITE_GCG(OUT,SEQ,NBLOCKS,NBREAKS,FILENAME,OUTNAME, 1 COMPND,RANGE(1),RANGE(2),ERROR) IF ( ERROR ) STOP ELSE IF ( OUTFORMAT .EQ. 'F' ) THEN NBLOCKS = 6 CALL WRITE_PEARSON(OUT,OUTNAME,SEQ,NBLOCKS,ACCNUM, 1 COMPND,RANGE(1),RANGE(2),ERROR) IF ( ERROR ) STOP ELSE WRITE(*,*)' wat is tipp wat richtiges, bloedmann' C GOTO 2 STOP ENDIF ENDIF CLOSE(OUT) C = 'N' CALL GETCHAR(1,C,' write another format ?') CALL LOWTOUP(C,1) IF ( C .NE. 'N' ) THEN CALLS = CALLS + 1 DO_INIT_INSBUF = .FALSE. GOTO 2 ENDIF END C END PROGRAM CONVERT C...................................................................... C...................................................................... C SUB CHAINBREAKPOS SUBROUTINE CHAINBREAKPOS( SEQ,BEGIN,END,MAXBREAK,NBREAKS, 1 CHBPOS, ERROR ) IMPLICIT NONE C Import INTEGER MAXBREAK,BEGIN,END CHARACTER*(*) SEQ C Export INTEGER NBREAKS,CHBPOS(MAXBREAK) LOGICAL ERROR C Internal INTEGER IPOS *----------------------------------------------------------------------* NBREAKS = 0 DO IPOS = BEGIN,END IF ( SEQ(IPOS:IPOS) .EQ. '!' ) THEN NBREAKS = NBREAKS + 1 IF ( NBREAKS .GT. MAXBREAK ) THEN ERROR = .TRUE. WRITE(*,'(A)') 1 ' MAXBREAK overflow in chainbreakpos !' RETURN ENDIF CHBPOS(NBREAKS) = IPOS ENDIF ENDDO RETURN END C END CHAINBREAKPOS C...................................................................... C...................................................................... C SUB CHOOSE_NAME SUBROUTINE CHOOSE_NAME(EMBLID,NSEQS,NUMBER) C 21.10.92 IMPLICIT NONE C Import INTEGER NSEQS CHARACTER*(*) EMBLID(NSEQS) C Export INTEGER NUMBER C Internal INTEGER NAMELEN_LOC PARAMETER (NAMELEN_LOC= 13) INTEGER INAME CHARACTER*(NAMELEN_LOC) ANSWER CHARACTER*(NAMELEN_LOC) TEST LOGICAL LFOUND ANSWER = EMBLID(1) DO INAME = 1,NSEQS WRITE(*,'(1X,A)') EMBLID(INAME) ENDDO 1 CONTINUE CALL GETCHAR(NAMELEN_LOC,ANSWER, 1 'Choose name of HSSP master sequence :' ) CALL LOWTOUP(ANSWER,NAMELEN_LOC) C position of sequence in alignment = return information LFOUND = .FALSE. NUMBER = 1 INAME = 1 DO WHILE ( INAME .LE. NSEQS .AND. .NOT. LFOUND ) TEST = EMBLID(INAME) CALL LOWTOUP(TEST,NAMELEN_LOC) IF ( ANSWER .EQ. TEST ) THEN LFOUND = .TRUE. ELSE INAME = INAME + 1 NUMBER = NUMBER + 1 ENDIF ENDDO IF ( .NOT. LFOUND ) THEN WRITE(*,'(1X,A)') ' *** INVALID CHOICE ***' STOP C GOTO 1 ENDIF RETURN END C END CHOOSE_NAME C...................................................................... C...................................................................... C SUB CLIP_ALISEQ SUBROUTINE CLIP_ALISEQ(MAXALIGNS,MAXCORE,MAXINS, 1 MAXINSBUF,ALISEQ,ALIPOINTER,NALIGN,CGAPCHAR,NRES, 2 IFIR,ILAS,JFIR,JLAS,DELPOS,NDEL,INSNUMBER,INSALI, 3 INSPOINTER,INSLEN,INSBEG_1,INSBEG_2,INSBUFFER, 4 ERROR) C 5.7.93 C delete the regions indicated in "delpos" from all sequences of the C passed "aliseq" structure; C copy the regions to "insbuffer" IMPLICIT NONE C Import INTEGER MAXALIGNS, MAXCORE, MAXINS, MAXINSBUF,NALIGN,NRES INTEGER IFIR(MAXALIGNS),ILAS(MAXALIGNS), + JFIR(MAXALIGNS),JLAS(MAXALIGNS), + ALIPOINTER(MAXALIGNS) INTEGER DELPOS(2,*),NDEL CHARACTER*1 CGAPCHAR C Export INTEGER INSNUMBER,INSALI(MAXINS),INSPOINTER(MAXINS), + INSLEN(MAXINS),INSBEG_1(MAXINS),INSBEG_2(MAXINS) CHARACTER ALISEQ(MAXCORE) CHARACTER INSBUFFER(MAXINSBUF) LOGICAL ERROR C Internal INTEGER MAXTEMPBUF PARAMETER (MAXTEMPBUF= 1000) INTEGER I,IALIGN,IDEL,MAXLENDEL,LENDEL, + IPOS,JPOS,IAP,ISP,NTERMDEL, + NEXTPOS_INS,LGAP,NSYMBOLS,LASTPOS,POS,LEN CHARACTER*1 C CHARACTER*(MAXTEMPBUF) TEMPBUF *----------------------------------------------------------------------* INSNUMBER = 0 NEXTPOS_INS = 1 MAXLENDEL = 0 DO IALIGN = 1,NALIGN NTERMDEL = 0 LGAP = 0 LASTPOS = IFIR(IALIGN) LENDEL = 0 IPOS = ALIPOINTER(IALIGN)-1 ISP = JFIR(IALIGN) - 1 DO IDEL = 1,NDEL POS = DELPOS(1,IDEL) LEN = DELPOS(2,IDEL) IF ( POS .LE. IFIR(IALIGN) ) THEN NTERMDEL = NTERMDEL + LEN C IF ( POS+LEN-1 .LT. IFIR(IALIGN) ) THEN C NTERMDEL = NTERMDEL + LEN C ELSE IF ( POS .LT. IFIR(IALIGN) .AND. C 1 pos+len-1 .ge. ifir(ialign) ) then C ntermdel = ntermdel + len ELSE IF ( POS .GT. IFIR(IALIGN) .AND. 1 POS .LE. ILAS(IALIGN) ) THEN DO IAP = LASTPOS,POS-1 IPOS = IPOS + 1 IF ( ALISEQ(IPOS) .NE. CGAPCHAR ) THEN ISP = ISP + 1 ENDIF ENDDO IPOS = IPOS + 1 ISP = ISP + 1 NSYMBOLS = 0 TEMPBUF = ' ' DO JPOS = 1,LEN IF ( ALISEQ(IPOS+JPOS-1) .NE. CGAPCHAR ) THEN NSYMBOLS = NSYMBOLS + 1 TEMPBUF(NSYMBOLS:NSYMBOLS) = 1 ALISEQ(IPOS+JPOS-1) ELSE LGAP = LGAP + 1 ENDIF ENDDO DO JPOS = IPOS,ALIPOINTER(IALIGN)+ 1 (ILAS(IALIGN)-IFIR(IALIGN))-LEN ALISEQ(JPOS) = ALISEQ(JPOS+LEN) ENDDO IF ( NSYMBOLS .GT. 0 ) THEN INSNUMBER = INSNUMBER + 1 IF ( INSNUMBER .GT. MAXINS ) THEN WRITE(*,'(1X,A)') 1 'MAXINS overflow in clip_aliseq !' ERROR = .TRUE. RETURN ENDIF INSPOINTER(INSNUMBER) = NEXTPOS_INS INSBEG_1(INSNUMBER) = POS-LENDEL-1 INSBEG_2(INSNUMBER) = ISP-1 INSALI(INSNUMBER) = IALIGN INSLEN(INSNUMBER) = NSYMBOLS IF ( IPOS .GT. 1 ) THEN C = ALISEQ(IPOS-1) JPOS = IPOS-1 IF ( C .EQ. CGAPCHAR ) THEN DO WHILE ( JPOS .GT. 1 .AND. 1 ALISEQ(JPOS) .EQ. CGAPCHAR ) JPOS = JPOS - 1 C = ALISEQ(JPOS) ENDDO ENDIF CALL UPTOLOW(C,1) ALISEQ(JPOS) = C ENDIF INSBUFFER(NEXTPOS_INS) = C DO I = 1,NSYMBOLS INSBUFFER(NEXTPOS_INS+I) = TEMPBUF(I:I) ENDDO C = ALISEQ(IPOS) JPOS = IPOS IF ( C .EQ. CGAPCHAR ) THEN DO WHILE 1 ( JPOS .LT. 2 ALIPOINTER(IALIGN)+ILAS(IALIGN)-IFIR(IALIGN) 3 .AND. 4 ALISEQ(JPOS) .EQ. CGAPCHAR ) JPOS = JPOS + 1 C = ALISEQ(JPOS) ENDDO ENDIF CALL UPTOLOW(C,1) ALISEQ(JPOS) = C INSBUFFER(NEXTPOS_INS+NSYMBOLS+1) = C NEXTPOS_INS = 1 NEXTPOS_INS + NSYMBOLS + 3 IF ( NEXTPOS_INS .GT. MAXINSBUF ) THEN WRITE(*,'(1X,A)') 1 'MAXINSBUF overflow in clip_aliseq !' ERROR = .TRUE. RETURN ENDIF ENDIF LASTPOS = POS + LEN + 1 LENDEL = LENDEL + LEN ENDIF ENDDO MAXLENDEL = MAX(MAXLENDEL,LENDEL) IFIR(IALIGN) = IFIR(IALIGN)-NTERMDEL IF ( IFIR(IALIGN) .LT. 1 ) THEN C delete n-terminal gaps DO JPOS = ALIPOINTER(IALIGN), 1 ALIPOINTER(IALIGN)+ 2 (ILAS(IALIGN)-1)-(1-IFIR(IALIGN)) ALISEQ(JPOS) = ALISEQ(JPOS+(1-IFIR(IALIGN))) ENDDO IFIR(IALIGN) = 1 ENDIF ILAS(IALIGN) = ILAS(IALIGN)-LENDEL-NTERMDEL JLAS(IALIGN) = JLAS(IALIGN)-LENDEL-NTERMDEL +LGAP ENDDO C>>>>> NRES = NRES - MAXLENDEL RETURN END C END CLIP_ALISEQ C...................................................................... C...................................................................... C SUB CHANGE_ALISEQ SUBROUTINE CHANGE_ALISEQ(MAXALIGNS,MAXCORE,MAXINS,MAXRES, 1 LINSERT,LDELETE,CHANGEPOS,NALIGN, + NEWSEQ,NEWSEQSTART,NEWSEQSTOP,NEWSEQNAME, 2 FILLUP,ALISEQ,ALIPOINTER, 3 IFIR,ILAS,JFIR,JLAS,LALI,NGAP,LGAP,LENSEQ,STRID, 4 IDE,SIM,EXCLUDEFLAG,ACCNUM,EMBLID,PROTNAME,INSALI, 5 INSNUMBER) IMPLICIT NONE C Import INTEGER MAXALIGNS,MAXCORE,MAXINS,MAXRES,CHANGEPOS C mutually exclusive logicals: C ........ if ( linsert ) "newseq" is inserted into aliseq at position C ........ "changepos". Its name, stop/start values and so on are C ........ added to the appropriate arrays. C ........ if ( ldelete ), the "changepos"th sequence is deleted from C ........ aliseq ( so is its name, start/stop values .. from emblid, C ........ ifir, ilas .. ) and is returned in "newseq". Its name is C ........ returned in "newseqname". LOGICAL LINSERT, LDELETE C Import / Export INTEGER NALIGN,FILLUP,ALIPOINTER(MAXALIGNS) INTEGER NEWSEQSTART,NEWSEQSTOP CHARACTER ALISEQ(MAXCORE),NEWSEQ(MAXRES),NEWSEQNAME*(*) C attributes of aligned sequences C---- first and last residue of guide (ifir), and aligned (jfir) INTEGER IFIR(MAXALIGNS),ILAS(MAXALIGNS), + JFIR(MAXALIGNS),JLAS(MAXALIGNS) INTEGER LALI(MAXALIGNS),NGAP(MAXALIGNS), + LGAP(MAXALIGNS),LENSEQ(MAXALIGNS), + INSALI(MAXINS),INSNUMBER CHARACTER*(*) STRID(MAXALIGNS),ACCNUM(MAXALIGNS), + EMBLID(MAXALIGNS),PROTNAME(MAXALIGNS) C---- ' ' if no flag, '*' if to exclude CHARACTER EXCLUDEFLAG(MAXALIGNS) C---- percentage identity similarity REAL IDE(MAXALIGNS),SIM(MAXALIGNS) C Internal INTEGER IALIGN,IINS,IPOS,JPOS,LEN,IBEG,IEND,ITMP,I INTEGER NEWSEQLEN CHARACTER*1 CGAPCHAR CGAPCHAR = '.' C---------------------------------------------------------------------- IF ( LINSERT ) THEN C---------------------------------------------------------------------- NEWSEQLEN = NEWSEQSTOP-NEWSEQSTART+1 IF ( CHANGEPOS .GT. NALIGN+1 ) THEN WRITE(*,'(A,I4)') ' cannot add after position ', NALIGN+1 RETURN ELSE IF ( CHANGEPOS .EQ. NALIGN+1 ) THEN ALIPOINTER(CHANGEPOS) = 1 ALIPOINTER(NALIGN)+ILAS(NALIGN)-IFIR(NALIGN)+1 ELSE IF ( CHANGEPOS .LE. NALIGN ) THEN DO IPOS = ALIPOINTER(NALIGN)+ILAS(NALIGN)-IFIR(NALIGN)+1, 1 ALIPOINTER(CHANGEPOS),-1 IF ( IPOS+NEWSEQLEN+1 .LE. MAXCORE ) THEN C shift by newseqlen+1, because a '/' is to be inserted after newseq ALISEQ(IPOS+NEWSEQLEN+1 ) = ALISEQ(IPOS) ELSE STOP 'MAXCORE OVERFLOW IN ADD_SEQ_TO_ALISEQ !' ENDIF ENDDO C insert new member into arrays ifir .. at position changepos C and push following members by one DO IINS = 1,INSNUMBER IF ( INSALI(IINS) .GE. CHANGEPOS ) THEN INSALI(IINS)=INSALI(IINS)+1 ENDIF ENDDO DO IALIGN = NALIGN,CHANGEPOS,-1 IFIR(IALIGN+1)=IFIR(IALIGN) ILAS(IALIGN+1)=ILAS(IALIGN) JFIR(IALIGN+1)=JFIR(IALIGN) JLAS(IALIGN+1)=JLAS(IALIGN) LALI(IALIGN+1)=LALI(IALIGN) NGAP(IALIGN+1)=NGAP(IALIGN) LGAP(IALIGN+1)=LGAP(IALIGN) LENSEQ(IALIGN+1)=LENSEQ(IALIGN) STRID(IALIGN+1)=STRID(IALIGN) ACCNUM(IALIGN+1)=ACCNUM(IALIGN) EMBLID(IALIGN+1)=EMBLID(IALIGN) PROTNAME(IALIGN+1)=PROTNAME(IALIGN) EXCLUDEFLAG(IALIGN+1)=EXCLUDEFLAG(IALIGN) IDE(IALIGN+1)=IDE(IALIGN) SIM(IALIGN+1)=SIM(IALIGN) ALIPOINTER(IALIGN+1) = ALIPOINTER(IALIGN)+NEWSEQLEN+1 ENDDO ENDIF C insert newseq into aliseq C changepos............last res, '/' C = +1 LEN = NEWSEQSTART DO IPOS = ALIPOINTER(CHANGEPOS), 1 (ALIPOINTER(CHANGEPOS)+NEWSEQLEN-1) IF (IPOS.GT.0) THEN ALISEQ(IPOS)= NEWSEQ(LEN) LEN= LEN + 1 ENDIF ENDDO ALISEQ(IPOS) = '/' IFIR(CHANGEPOS)=1 ILAS(CHANGEPOS)=NEWSEQLEN JFIR(CHANGEPOS)=1 JLAS(CHANGEPOS)=NEWSEQLEN LALI(CHANGEPOS)=NEWSEQLEN NGAP(CHANGEPOS)=0 LGAP(CHANGEPOS)=0 LENSEQ(CHANGEPOS)=NEWSEQLEN STRID(CHANGEPOS )= ' ' ACCNUM(CHANGEPOS) = ' ' EMBLID(CHANGEPOS)= NEWSEQNAME PROTNAME(CHANGEPOS)= ' ' EXCLUDEFLAG(CHANGEPOS)= ' ' IDE(CHANGEPOS)=0.0 SIM(CHANGEPOS)=0.0 NALIGN = NALIGN + 1 C------- br 99-03: watch names IF (EMBLID(1).EQ.EMBLID(CHANGEPOS)) THEN CALL STRPOS(EMBLID(1),IBEG,IEND) EMBLID(1)((IEND+1):(IEND+1))="0" ENDIF C----------------------------------------------------------------------- ELSE IF ( LDELETE .AND. CHANGEPOS.NE.0) THEN C----------------------------------------------------------------------- C save sequence in "newseq" JPOS=0 C---- all before to gap (really what you want?) IF (IFIR(CHANGEPOS) .GE. 2) THEN DO JPOS=1,(IFIR(CHANGEPOS)-1) NEWSEQ(JPOS)= CGAPCHAR ENDDO JPOS= JPOS - 1 ENDIF NEWSEQLEN= 0 IBEG= ALIPOINTER(CHANGEPOS) IEND= IBEG+ILAS(CHANGEPOS)-IFIR(CHANGEPOS) C---- read real sequence for the respective range DO IPOS=IBEG,IEND Cxx write(6,*)'xx ipos=',ipos,' len=',lenseq(changepos) IF (IPOS.GT.0) THEN NEWSEQLEN= NEWSEQLEN+1 NEWSEQ(NEWSEQLEN+JPOS)=ALISEQ(IPOS) ENDIF ENDDO C---- fill up with GAP DO IPOS=(ILAS(CHANGEPOS)+1),FILLUP NEWSEQ(IPOS)= CGAPCHAR ENDDO NEWSEQSTART= IFIR(CHANGEPOS) NEWSEQSTOP= ILAS(CHANGEPOS) IF ( CHANGEPOS .LT. NALIGN ) THEN C if the last sequence is to be deleted, nothing has actually to be C ...done to "aliseq". Otherwise: pull all following sequences and C ...array entries by one, overwriting sequence No. "changepos" DO IPOS= ALIPOINTER(CHANGEPOS+1), 1 ALIPOINTER(NALIGN)+ILAS(NALIGN)-IFIR(NALIGN)+1 C shift by newseqlen+1, because a '/' is to be inserted after newseq ALISEQ(IPOS-NEWSEQLEN-1 ) = ALISEQ(IPOS) ENDDO NEWSEQNAME= EMBLID(CHANGEPOS) DO IINS = 1,INSNUMBER IF (INSALI(IINS) .GE. CHANGEPOS) THEN INSALI(IINS)=INSALI(IINS)-1 ENDIF ENDDO DO IALIGN=CHANGEPOS,(NALIGN-1) IFIR(IALIGN)= IFIR(IALIGN+1) ILAS(IALIGN)= ILAS(IALIGN+1) JFIR(IALIGN)= JFIR(IALIGN+1) JLAS(IALIGN)= JLAS(IALIGN+1) LALI(IALIGN)= LALI(IALIGN+1) NGAP(IALIGN)= NGAP(IALIGN+1) LGAP(IALIGN)= LGAP(IALIGN+1) LENSEQ(IALIGN)= LENSEQ(IALIGN+1) STRID(IALIGN)= STRID(IALIGN+1) ACCNUM(IALIGN)= ACCNUM(IALIGN+1) EMBLID(IALIGN)= EMBLID(IALIGN+1) PROTNAME(IALIGN)= PROTNAME(IALIGN+1) EXCLUDEFLAG(IALIGN)=EXCLUDEFLAG(IALIGN+1) IDE(IALIGN)= IDE(IALIGN+1) SIM(IALIGN)= SIM(IALIGN+1) ALIPOINTER(IALIGN)= ALIPOINTER(IALIGN+1)-NEWSEQLEN-1 ENDDO ENDIF C---- reduce number of alignments (since CHANGEPOSnth seq removed) NALIGN=NALIGN-1 C---------------------------------------------------------------------- ENDIF C---------------------------------------------------------------------- RETURN END C END CHANGE_ALISEQ C...................................................................... C...................................................................... C SUB DSSP_NOTATION_TO_CYS SUBROUTINE DSSP_NOTATION_TO_CYS(SEQIN,LEN) IMPLICIT NONE C 20.3.92 C Import: CHARACTER SEQIN(*) INTEGER LEN C Internal: C only used to get rid of INDEX command (CPU time) INTEGER NASCII PARAMETER (NASCII= 256) INTEGER LOWERPOS(NASCII),IRES, I CHARACTER*26 LOWER CHARACTER*1 C *----------------------------------------------------------------------* C used to convert lower case characters from the DSSP-seq to 'C' (Cys) LOWER='abcdefghijklmnopqrstuvwxyz' CALL GETPOS(LOWER,LOWERPOS,NASCII) IRES = 1 DO WHILE ( IRES .LE. LEN) C = SEQIN(IRES) CALL GETINDEX(C,LOWERPOS,I) IF(I.NE.0) C='C' SEQIN(IRES) = C IRES = IRES + 1 ENDDO RETURN END C END DSSP_NOTATION_TO_CYS C...................................................................... C...................................................................... C SUB EXTRACT_CHAINS SUBROUTINE EXTRACT_CHAINS(TEMPNAME,FILENAME,MAXCHAINS, 1 CCHAIN) IMPLICIT NONE C chain is given as an appendix to filename as follows: C filename_!_chain C C IF CHAINID IS A DIGIT, IT IS TAKEN TO BE THE "DIGIT"TH CHAIN ! C C Import INTEGER MAXCHAINS CHARACTER*(*) TEMPNAME C Export CHARACTER*(*) FILENAME CHARACTER*(*) CCHAIN C Internal INTEGER I,II,J CHARACTER*32 OUTFORMAT CCHAIN = ' ' I = INDEX(TEMPNAME,'!') IF ( I .NE. 0 ) THEN FILENAME = TEMPNAME(1:I-2) ELSE FILENAME = TEMPNAME WRITE(OUTFORMAT,'(A,I4,A)') '( ',MAXCHAINS, '(I3,1X))' WRITE(CCHAIN,OUTFORMAT) (J,J=1,MAXCHAINS) ENDIF J = 1 II = I DO WHILE ( II .NE. 0 ) IF ( J .EQ. 1 ) THEN CCHAIN(1:1) = TEMPNAME(I+2:I+2) ELSE CCHAIN(2*J-1:2*J-1) = TEMPNAME(I+1:I+1) ENDIF II = INDEX(TEMPNAME(I+1:),',') I = I + II J = J + 1 ENDDO RETURN END C END EXTRACT_CHAINS C...................................................................... C...................................................................... C SUB GETCONSWEIGHT_BR SUBROUTINE GETCONSWEIGHT_BR(MAXSTRSTATES,MAXIOSTATES, + MAXRES,MAXALIGNS,MAXCORE,NTRANS,NASCII,NALIGN,NRES, + IDE,IFIR,ILAS,LALI,ALIPOINTER,PDBSEQ,ALISEQ, + EXCLUDEFLAG,CTRANS,MATRIX,SIMCONSERV,MATPOS, + CONSWEIGHT_MIN,CONSWEIGHT) C---- C---- BR May 98 + Oct 98 C---- C---- Calculate the conservation weight, taking into account the C---- EXCLUDEFLAG(iali)='*' if the identity of alignment number C---- 'iali' to any other sequence is above IDEMAX (threshold input) C---- C---- C---- out: output is the conservation weight C---- C---- ------------------------------------------------------------------ C---- C---- formula: C---- nali C---- ----- C---- \ (1 - IDE(s,a,b)) * SIM(s,a,b) C---- CW(s)= > ----------------------------- C---- / DISTSUM C---- ----- C---- a,b C---- C---- with: CW(s): conservation weight for residue s C---- a,b: alignment between protein a and b C---- IDE(s,a,b): identity of residue s in a and s in b C---- = 0 || 1 C---- SIM(s,a,b): similarity of residue s in a and s in b C---- C---- DISTSUM(s): sum over all distances at position s C---- and the following definition: C---- C---- nali nres = overlap (a,b) C---- ----- ---- C---- \ \ delta(s,a,b) C---- DISTSUM(s)= > > ------------------------------ C---- / / number of overlapping residues C---- ----- ---- C---- a,b s C---- C---- with: delta(s,a,b) = 1 if residue s in a = residue s in b C---- 0 else C---- C---- normalise: C---- finally normalised weights to have an average of 1: C---- C---- nres C---- ----- C---- \ C---- NORM(s)= > CW(s) C---- / C---- ----- C---- s C---- C---- CW_NORM(s)= CW(s) / NORM C---- C---- C---- ------------------------------------------------------------------ C---- IMPLICIT NONE C---- ------------------------------ C---- import C---- ------------------------------ C---- C parameters for array INTEGER MAXSTRSTATES,MAXIOSTATES,MAXRES,MAXALIGNS,MAXCORE, + NTRANS,NASCII C actual values INTEGER NALIGN,NRES C pointer arrays INTEGER IFIR(MAXALIGNS),ILAS(MAXALIGNS), + ALIPOINTER(MAXALIGNS) C alignment length lali(i) length of ali between PDBSEQ and ali i INTEGER LALI(MAXALIGNS) C guide sequence CHARACTER PDBSEQ(MAXRES) C all aligned sequences CHARACTER ALISEQ(MAXCORE) C CHARACTER*(*) ALISEQ(MAXCORE) C flags (take ' ', else other) CHARACTER EXCLUDEFLAG(MAXALIGNS) C percentage sequence identity REAL IDE(MAXALIGNS),IDEMAX C comparison metric REAL MATRIX(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) C normalised to sum over 0 (by SETCONSERVATION) REAL SIMCONSERV(NTRANS,NTRANS) C allowed sequence symbols CHARACTER CTRANS*26 C minimal conservation weight REAL CONSWEIGHT_MIN C only used to get rid of INDEX command (CPU time) INTEGER MATPOS(NASCII) C---- C---- output from sbr C---- C conservation weight REAL CONSWEIGHT(MAXRES) C---- ------------------------------------------------------------------ C---- internal C---- ------------------------------------------------------------------ C---- C INTEGER NASCII,MAXLEN INTEGER MAXRES_LOC,MAXLEN PARAMETER (MAXRES_LOC= 9999) C PARAMETER (MAXRES_LOC= 10000) C PARAMETER (MAXRES_LOC= 30011) PARAMETER (MAXLEN= 20000) INTEGER IALIGN_SEQ(MAXRES_LOC),JALIGN_SEQ(MAXRES_LOC), + I,J,IALIGN,JALIGN,ILEN,IRES, + IPOS,JPOS,IBEG,IEND,KPOS,IAGR,ISTEP,ISAFERANGE, + IALIGN_SEQTMP,JALIGN_SEQTMP INTEGER NOCC(MAXRES_LOC) C---- big: seq ide distance for pair i,j REAL SUM,MEAN,XVAL INTEGER NPOS REAL SIMDIST_POS(MAXRES_LOC) REAL SEQDIST_POS(MAXRES_LOC) REAL SEQDIST CHARACTER LINE*(MAXLEN) LOGICAL LERROR C---- ------------------------------------------------------------------ C---- C---- defaults, ini C---- C---- FORMULA+ISAFERANGE -> include into averaging ISAFERANGE= 5 C---- br: make 'safer' for weights ISAFERANGE= 5 C---- ------------------------------------------------------------------ C---- now work on it C---- ------------------------------------------------------------------ WRITE(6,*) '--- GETCONSWEIGHT_BR begin' C---- C---- calculate variability only for the 22 (BZ) amino acids C---- DO I=1,NASCII MATPOS(I)=0 ENDDO CALL GETPOS(CTRANS(1:22),MATPOS,NASCII) IF (NRES .GT. MAXRES) THEN WRITE(6,*)'*** ERROR: NRES .GT. MAXRES IN GETCONSWEIGHT_BR' WRITE(6,*)'*** INCREASE MAXRES ****' STOP ENDIF C---- C---- ERROR! C---- IF (NRES .LE. 0) RETURN C---- C---- initialize C---- DO I=1,NRES IALIGN_SEQ(I)= 0 JALIGN_SEQ(I)= 0 SIMDIST_POS(I)=0.0 SEQDIST_POS(I)=0.0 NOCC(I)= 0 ENDDO C WRITE(6,*)'XX in getCONS PDB=' C WRITE(6,'(40A1)')(PDBSEQ(I),I=1,NRES) C---- ------------------------------------------------------------------ C---- guide against all C---- ------------------------------------------------------------------ C---- store PDB sequence in integer array DO IRES=1,NRES C write(6,*)'xx ires=',ires,' pdb=',pdbseq(ires),',' IALIGN_SEQ(IRES)=MATPOS(ICHAR(PDBSEQ(IRES))) IF ( IALIGN_SEQ(IRES) .EQ. 0) THEN IF ( PDBSEQ(IRES) .GE. 'a' .AND. + PDBSEQ(IRES) .LE. 'z') THEN IALIGN_SEQ(IRES)= + MATPOS(ICHAR(PDBSEQ(IRES))-32) ENDIF ENDIF ENDDO C---- C---- loop over all aligned sequences C---- DO JALIGN=1,NALIGN C---- hack br 1999-12: problem with 'treat gaps as insertions' C---- somehow the thing has IFIR(JALIGN) not defined!! IF ( EXCLUDEFLAG(JALIGN) .EQ. ' ') THEN IF (IFIR(JALIGN).EQ.0 .OR. ILAS(JALIGN).EQ.0) THEN EXCLUDEFLAG(JALIGN)="*" END IF END IF C---- end of hack br 1999-12 IF ( EXCLUDEFLAG(JALIGN) .EQ. ' ') THEN C---- store alignment sequence in integer array (for j) JPOS=ALIPOINTER(JALIGN)-IFIR(JALIGN) DO IRES=IFIR(JALIGN),ILAS(JALIGN) JALIGN_SEQ(IRES)=MATPOS(ICHAR(ALISEQ(JPOS+IRES))) IF ( JALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(JPOS+IRES) .GE. 'a' .AND. + ALISEQ(JPOS+IRES) .LE. 'z') THEN JALIGN_SEQ(IRES)= + MATPOS(ICHAR(ALISEQ(JPOS+IRES))-32) ENDIF ENDIF ENDDO C---- C---- passed into SBR: IDE(i) C---- SEQDIST= 1.0 - IDE(JALIGN) C---- C---- get overlap of aligned sequences C---- IBEG= MAX(1, IFIR(JALIGN)) IEND= MIN(NRES,ILAS(JALIGN)) DO IRES= IBEG,IEND IF ( (IALIGN_SEQ(IRES) .NE. 0) .AND. + (JALIGN_SEQ(IRES) .NE. 0) ) THEN C---- C---- count up number of pairs found for current residue C---- NOCC(IRES)=NOCC(IRES)+1 C---- C---- position specific distances C---- SEQDIST_POS(IRES)= + SEQDIST_POS(IRES) + SEQDIST SIMDIST_POS(IRES)= + SIMDIST_POS(IRES) + + SEQDIST * SIMCONSERV(IALIGN_SEQ(IRES), + JALIGN_SEQ(IRES)) END IF END DO ENDIF C---- end of: exclude? ENDDO C---- end loop over all sequences aligned to guide C---- ------------------------------------------------------------------ C---- all against all: get all pair distances C---- ------------------------------------------------------------------ DO IALIGN=1,NALIGN C---- to exclude? IF ( EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN C---- store alignment sequence in integer array (for i) IPOS=ALIPOINTER(IALIGN)-IFIR(IALIGN) DO IRES=IFIR(IALIGN),ILAS(IALIGN) IALIGN_SEQ(IRES)=MATPOS( ICHAR( ALISEQ(IPOS+IRES) ) ) IF ( IALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(IPOS+IRES) .GE. 'a' .AND. + ALISEQ(IPOS+IRES) .LE. 'z') THEN IALIGN_SEQ(IRES)= + MATPOS(ICHAR(ALISEQ(IPOS+IRES))-32) ENDIF ENDIF ENDDO C---- C---- loop over pair partners C---- DO JALIGN=IALIGN+1,NALIGN IF ( EXCLUDEFLAG(JALIGN) .EQ. ' ') THEN C---- store alignment sequence in integer array (for j) JPOS=ALIPOINTER(JALIGN)-IFIR(JALIGN) DO IRES=IFIR(JALIGN),ILAS(JALIGN) JALIGN_SEQ(IRES)=MATPOS(ICHAR(ALISEQ(JPOS+IRES))) IF ( JALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(JPOS+IRES) .GE. 'a' .AND. + ALISEQ(JPOS+IRES) .LE. 'z') THEN JALIGN_SEQ(IRES)= + MATPOS(ICHAR(ALISEQ(JPOS+IRES))-32) ENDIF ENDIF ENDDO C---- C---- get distance between overlap of aligned sequences C---- IAGR=0 ILEN=0 IBEG= MAX(IFIR(IALIGN),IFIR(JALIGN)) IEND= MIN(ILAS(IALIGN),ILAS(JALIGN)) DO IRES=IBEG,IEND IF ( IALIGN_SEQ(IRES) .NE. 0 .AND. + JALIGN_SEQ(IRES) .NE. 0) THEN C---- count up number of pairs found for current residue NOCC(IRES)=NOCC(IRES)+1 IF (IALIGN_SEQ(IRES).EQ.JALIGN_SEQ(IRES)) THEN IAGR=IAGR+1 ENDIF ILEN=ILEN+1 ENDIF ENDDO IF (ILEN .NE. 0) THEN SEQDIST= 1.0 - (FLOAT(IAGR)/ILEN) DO IRES=IBEG,IEND IF ( IALIGN_SEQ(IRES).NE.0 .AND. + JALIGN_SEQ(IRES).NE.0 ) THEN SEQDIST_POS(IRES)= + SEQDIST_POS(IRES) + SEQDIST SIMDIST_POS(IRES)= + SIMDIST_POS(IRES) + + SEQDIST * SIMCONSERV(IALIGN_SEQ(IRES), + JALIGN_SEQ(IRES)) ENDIF END DO END IF ENDIF C---- end of: exclude? ENDDO C---- end loop over pairs ENDIF ENDDO C---- end loop over all alignments C---- ------------------------------------------------------------------ C---- assign conservation weights C---- ------------------------------------------------------------------ DO IRES=1,NRES IF ((SEQDIST_POS(IRES) .GT. 0) .AND. + (NOCC(IRES) .GT. 0) ) THEN CONSWEIGHT(IRES)= + SIMDIST_POS(IRES)/SEQDIST_POS(IRES) ELSE CONSWEIGHT(IRES)=1.0 END IF C---- C---- no negative values for conservation weight C---- IF (CONSWEIGHT(IRES) .LT. CONSWEIGHT_MIN) THEN CONSWEIGHT(IRES)=CONSWEIGHT_MIN ENDIF END DO C---- ------------------------------------------------------------------ C---- weight conservation weights (average = 1) C---- ------------------------------------------------------------------ NPOS= 0 MEAN= 1.0 SUM= 0.0 DO IRES=1,NRES IF (NOCC(IRES).NE.0) THEN SUM = SUM + CONSWEIGHT(IRES) NPOS=NPOS+1 ENDIF ENDDO IF (NPOS .NE. 0) THEN MEAN=SUM/NPOS ENDIF WRITE(6,*)'GETCONSWEIGHT: SUM,MEAN ',SUM,MEAN IF (MEAN.GT. 0.99 .AND. MEAN .LT. 1.01) RETURN XVAL=1.0-MEAN DO IRES=1,NRES IF (NOCC(IRES).NE.0) CONSWEIGHT(IRES)=CONSWEIGHT(IRES)+XVAL ENDDO WRITE(6,*) '--- GETCONSWEIGHT_BR ended' RETURN END C END GETCONSWEIGHT_BR C...................................................................... C...................................................................... C SUB GETSEQSIM_BR SUBROUTINE GETSEQSIM_BR(MAXSTRSTATES,MAXIOSTATES, + MAXRES,MAXALIGNS,NTRANS,NASCII,NALIGN,NRES, + IFIR,ILAS,LALI,ALIPOINTER,PDBSEQ,ALISEQ, + EXCLUDEFLAG,CTRANS,MATRIX,SIMCONSERV,MATPOS,SIM,CONSWEIGHT) C---- C---- BR 99-03 C---- C---- Calculate the sequence similarity. C---- EXCLUDEFLAG(iali)='*' if the identity of alignment number C---- 'iali' to any other sequence is above IDEMAX (threshold input) C---- C---- C---- out: output is the conservation weight C---- C---- ------------------------------------------------------------------ C---- C---- C---- ------------------------------------------------------------------ C---- IMPLICIT NONE C---- C---- import C---- C parameters for array INTEGER MAXSTRSTATES,MAXIOSTATES,MAXRES,MAXALIGNS, + NTRANS,NASCII C actual values INTEGER NALIGN,NRES C pointer arrays INTEGER IFIR(MAXALIGNS),ILAS(MAXALIGNS), + ALIPOINTER(MAXALIGNS) C alignment length lali(i) length of ali between PDBSEQ and ali i INTEGER LALI(MAXALIGNS) C guide sequence CHARACTER PDBSEQ(MAXRES) C all aligned sequences CHARACTER*(*) ALISEQ(*) C flags (take ' ', else other) CHARACTER EXCLUDEFLAG(MAXALIGNS) C percentage sequence identity C REAL IDE(MAXALIGNS),IDEMAX C comparison metric REAL MATRIX(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) C normalised to sum over 0 (by SETCONSERVATION) REAL SIMCONSERV(NTRANS,NTRANS) C allowed sequence symbols CHARACTER CTRANS*26 C only used to get rid of INDEX command (CPU time) INTEGER MATPOS(NASCII) C conservation weight REAL CONSWEIGHT(MAXRES) C---- C---- output from sbr C---- C percentage sequence similarity REAL SIM(MAXRES) C---- ------------------------------------------------------------------ C---- internal C---- ------------------------------------------------------------------ C---- C INTEGER NASCII,MAXLEN INTEGER MAXRES_LOC,MAXLEN PARAMETER (MAXRES_LOC= 9999) C PARAMETER (MAXRES_LOC= 10000) C PARAMETER (MAXRES_LOC= 30011) PARAMETER (MAXLEN= 20000) INTEGER IALIGN_SEQ(MAXRES_LOC),JALIGN_SEQ(MAXRES_LOC), + I,J,IALIGN,JALIGN,ILEN,IRES, + IPOS,JPOS,IBEG,IEND,KPOS,IAGR,ISTEP,ISAFERANGE, + IALIGN_SEQTMP,JALIGN_SEQTMP C---- big: seq ide distance for pair i,j INTEGER NPOS,NRES_OVERLAP REAL SIM_XX(1000),SIM_XX2(1000) CHARACTER LINE*(MAXLEN) LOGICAL LERROR C---- ------------------------------------------------------------------ C---- C---- defaults, ini C---- ------------------------------------------------------------------ C---- now work on it C---- ------------------------------------------------------------------ WRITE(6,*) '--- GETSEQSIM_BR begin' C---- C---- calculate variability only for the 22 (BZ) amino acids C---- DO I=1,NASCII MATPOS(I)=0 ENDDO CALL GETPOS(CTRANS(1:22),MATPOS,NASCII) IF (NRES .GT. MAXRES) THEN WRITE(*,*)'*** ERROR: NRES .GT. MAXRES IN GETSEQSIM_BR' WRITE(*,*)'*** INCREASE MAXRES ****' STOP ENDIF C---- C---- ERROR! C---- IF (NRES .LE. 0) RETURN C---- ------------------------------------------------------------------ C---- guide against all C---- ------------------------------------------------------------------ C---- store PDB sequence in integer array DO IRES=1,NRES IALIGN_SEQ(IRES)=MATPOS(ICHAR(PDBSEQ(IRES))) IF ( IALIGN_SEQ(IRES) .EQ. 0) THEN IF ( PDBSEQ(IRES) .GE. 'a' .AND. + PDBSEQ(IRES) .LE. 'z') THEN IALIGN_SEQ(IRES)= + MATPOS(ICHAR(PDBSEQ(IRES))-32) ENDIF ENDIF ENDDO C---- C---- loop over all aligned sequences C---- DO JALIGN=1,NALIGN IF ( EXCLUDEFLAG(JALIGN) .EQ. ' ') THEN C---- store alignment sequence in integer array (for j) JPOS=ALIPOINTER(JALIGN)-IFIR(JALIGN) DO IRES=IFIR(JALIGN),ILAS(JALIGN) JALIGN_SEQ(IRES)=MATPOS(ICHAR(ALISEQ(JPOS+IRES))) IF ( JALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(JPOS+IRES) .GE. 'a' .AND. + ALISEQ(JPOS+IRES) .LE. 'z') THEN JALIGN_SEQ(IRES)= + MATPOS(ICHAR(ALISEQ(JPOS+IRES))-32) ENDIF ENDIF ENDDO C---- C---- get overlap of aligned sequences C---- IBEG= MAX(1, IFIR(JALIGN)) IEND= MIN(NRES,ILAS(JALIGN)) C---- ini new value for sequence similarity NRES_OVERLAP=0 SIM(JALIGN)= 0 SIM_XX(JALIGN)= 0 DO IRES=IBEG,IEND IF ( (JALIGN_SEQ(IRES) .NE. 0) .AND. + (JALIGN_SEQ(IRES) .NE. 0) ) THEN NRES_OVERLAP=NRES_OVERLAP+1 C---- C---- sequence similarity C---- C SIM(JALIGN)=SIM(JALIGN)+ C + MATRIX(IALIGN_SEQ(IRES),JALIGN_SEQ(IRES),1,1,1,1) C SIM(JALIGN)=SIM(JALIGN)+ C + CONSWEIGHT(IRES)* C + MATRIX(IALIGN_SEQ(IRES),JALIGN_SEQ(IRES),1,1,1,1) SIM(JALIGN)=SIM(JALIGN)+ + CONSWEIGHT(IRES)* + SIMCONSERV(IALIGN_SEQ(IRES),JALIGN_SEQ(IRES)) C SIM_XX(JALIGN)=SIM_XX(JALIGN)+ C + MATRIX(IALIGN_SEQ(IRES),JALIGN_SEQ(IRES),1,1,1,1) SIM_XX(JALIGN)=SIM_XX(JALIGN)+ + SIMCONSERV(IALIGN_SEQ(IRES),JALIGN_SEQ(IRES)) END IF END DO C---- C---- normalise sequence similarity to ratio C---- SIM_XX2(JALIGN)=SIM_XX(JALIGN) IF (NRES_OVERLAP.GT.0) + SIM_XX2(JALIGN)= + SIM_XX2(JALIGN)/REAL(NRES_OVERLAP) NRES_OVERLAP=1+ILAS(JALIGN)-IFIR(JALIGN) IF (NRES_OVERLAP.GT.0) + SIM(JALIGN)= + SIM(JALIGN)/REAL(NRES_OVERLAP) IF (NRES_OVERLAP.GT.0) + SIM_XX(JALIGN)= + SIM_XX(JALIGN)/REAL(NRES_OVERLAP) ENDIF C---- end of: exclude? ENDDO C---- end loop over all sequences aligned to guide write(6,*)'xx upon leaving got:' DO I=1,NALIGN write(6,'(i3,a,f5.2,a,f5.2,a,f5.2)') + i,'xx sim=',SIM(i),' xx=',SIM_XX(i), + ' nnt=',sim_xx2(i) END DO do i=1,22 C write(6,'(i2,22i3)')i,(int(100*MATRIX(i,j,1,1,1,1)),j=1,22) write(6,'(i2,22i3)')i,(int(100*SIMCONSERV(i,j)),j=1,22) enddo stop write(6,*)'xx dia' write(6,'(22i4)')(int(100*MATRIX(j,j,1,1,1,1)),j=1,22) stop Cxxx WRITE(6,*) '--- GETSEQSIM_BR ended' RETURN END C END GETSEQSIM_BR C...................................................................... C...................................................................... C SUB IDENTITY SUBROUTINE IDENTITY(BEGIN,END,CGAPCHAR,SEQ1,SEQ2,IDE1,NOVERLAP) C 26.10.92 IMPLICIT NONE C Import INTEGER BEGIN,END CHARACTER*1 CGAPCHAR CHARACTER*(*) SEQ1,SEQ2 C Export INTEGER NOVERLAP REAL IDE1 C Internal INTEGER I,NIDE CHARACTER*1 C1, C2 NOVERLAP= 0 NIDE= 0 DO I= BEGIN,END C1= SEQ1(I:I) C2= SEQ2(I:I) CALL LOWTOUP(C1,1) CALL LOWTOUP(C2,1) IF ( C1 .NE. CGAPCHAR .AND. C2 .NE. CGAPCHAR ) THEN NOVERLAP = NOVERLAP + 1 IF ( C1 .EQ. C2 ) NIDE = NIDE + 1 ENDIF ENDDO IF ( NOVERLAP .GT. 0 ) THEN IDE1 = FLOAT(NIDE) / FLOAT(NOVERLAP) ELSE WRITE(*,'(A)') 1 ' ** sequences with no overlap : %ide set to 0 **' IDE1 = 0.0 ENDIF RETURN END C END IDENTITY C...................................................................... C...................................................................... C SUB MAKE_HSSP SUBROUTINE MAKE_HSSP(MAXRES,MAXALIGNS, + MAXCORE,MAXAA,MAXINS,MAXINSBUF, + KOUT,OUTNAME,INFILE,PDBID,HEADER, + COMPOUND,SOURCE,AUTHOR,ALILEN,NCHAIN,KCHAIN, + CHAINREMARK,NALIGN,EXCLUDEFLAG,EMBLID,STRID, + IDE,SIM,IFIR,ILAS,JFIR,JLAS,LALI,NGAP,LGAP, + LENSEQ,ACCNUM_HSSP,PROTNAME,PDBNO,PDBSEQ,CHAINID, + SECSTR_HSSP,COLS,SHEETLABEL,BP1,BP2,ACC,NOCC, + VAR,ALISEQ,ALIPOINTER,SEQPROF,NDEL,NINS,ENTROPY, + RELENT,CONSWEIGHT_MIN,CONSWEIGHT,LCONSERV,LOLDVERSION, + CGAPCHAR,LFILTER,METRICFILE, LDOCLIP, + INSNUMBER,INSALI,INSPOINTER,INSLEN,INSBEG_1, + INSBEG_2,INSBUFFER,DO_INIT_INSBUF,ERROR, + LHSSP_LONG_ID) C 26.10.92 C 5.7.93 write insertion lists C---- ------------------------------------------------------------------ IMPLICIT NONE C Import INTEGER MAXRES,MAXALIGNS,MAXCORE,MAXAA,MAXINS,MAXINSBUF, + NALIGN C HSSP descricptiption variables LOGICAL LCONSERV,LOLDVERSION,LHSSP_LONG_ID C attributes of sequence with known structure CHARACTER PDBSEQ(MAXRES),CHAINID(MAXRES), + SECSTR_HSSP(MAXRES),SHEETLABEL(MAXRES) CHARACTER*(*) PDBID,COLS(MAXRES),HEADER, + COMPOUND,SOURCE,AUTHOR,CHAINREMARK INTEGER NCHAIN, KCHAIN, NREAD,ACC(MAXRES), + PDBNO(MAXRES),BP1(MAXRES),BP2(MAXRES) C attributes of aligned sequences INTEGER IFIR(MAXALIGNS), ILAS(MAXALIGNS), + ALIPOINTER(MAXALIGNS), + INSNUMBER,INSALI(MAXINS),INSPOINTER(MAXINS), + INSLEN(MAXINS),INSBEG_1(MAXINS),INSBEG_2(MAXINS) CHARACTER INSBUFFER(MAXINSBUF),ALISEQ(MAXCORE), + EXCLUDEFLAG(MAXALIGNS) CHARACTER*(*) STRID(MAXALIGNS),ACCNUM_HSSP(MAXALIGNS), + EMBLID(MAXALIGNS),PROTNAME(MAXALIGNS) INTEGER JFIR(MAXALIGNS),JLAS(MAXALIGNS), + LALI(MAXALIGNS),NGAP(MAXALIGNS), + LGAP(MAXALIGNS),LENSEQ(MAXALIGNS) REAL IDE(MAXALIGNS),SIM(MAXALIGNS) C attributes of profile INTEGER VAR(MAXRES),SEQPROF(MAXRES,MAXAA),RELENT(MAXRES), + NOCC(MAXRES),NDEL(MAXRES),NINS(MAXRES) REAL ENTROPY(MAXRES) C threshold INTEGER ISOLEN(100),NSTEP,ISAFE LOGICAL LFORMULA,LALL REAL CONSWEIGHT(MAXRES),ISOIDE(100),CONSWEIGHT_MIN C .. and others INTEGER KOUT,ALILEN CHARACTER CGAPCHAR CHARACTER*(*) OUTNAME,INFILE,METRICFILE LOGICAL LFILTER, LDOCLIP, DO_INIT_INSBUF C Export C .. output to unit kout LOGICAL ERROR C Internal INTEGER KSIM,CODELEN_LOC,NTRANS,MAXRES_LOC, + MAXINS_LOC,MAXSTRSTATES,MAXIOSTATES PARAMETER (KSIM= 98) PARAMETER (CODELEN_LOC= 40) PARAMETER (MAXINS_LOC= 50000) PARAMETER (MAXRES_LOC= 9999) C PARAMETER (MAXRES_LOC= 10000) C PARAMETER (MAXRES_LOC= 30011) PARAMETER (NTRANS= 26) PARAMETER (MAXSTRSTATES= 3) PARAMETER (MAXIOSTATES= 4) INTEGER I, J, ISTART, IFIN,NPARALINE, CHANGEPOS, + NSTRSTATES,NIOSTATES,NOVERLAP, + SEQSTART,SEQSTOP, NDELETED, + DELPOS(2,MAXINS_LOC),NDELPOS,NRES C comparison metric REAL MATRIX (NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) C normalised to sum over 0 (by SETCONSERVATION) REAL SIMCONSERV(NTRANS,NTRANS) C REAL IORANGE(MAXSTRSTATES,MAXIOSTATES,2), + SMIN, SMAX, MAPLOW, MAPHIGH, THISIDE CHARACTER*(MAXRES_LOC) SEQ1, SEQ2 CHARACTER*(NTRANS) TRANS CHARACTER*(CODELEN_LOC) PDBNAME CHARACTER*9 CDATE CHARACTER*80 ISOSIGFILE,DATABASE CHARACTER*132 CPARAMETER(5),HSSPLINE CHARACTER CSTRSTATES*(MAXSTRSTATES),CIOSTATES*(MAXIOSTATES) LOGICAL LINSERT, LDELETE C---- C---- only used to get rid of INDEX command (CPU time) C---- INTEGER NASCII PARAMETER (NASCII= 256) INTEGER MATPOS(NASCII) C---- ------------------------------------------------------------------ C---- C---- initialise C---- LFORMULA= .FALSE. LALL= .TRUE. LCONSERV= .FALSE. LOLDVERSION= .FALSE. C---- SMIN= 0.0 SMAX= 1.0 MAPLOW= 0.0 MAPHIGH= 0.0 CSTRSTATES= ' ' CIOSTATES= ' ' C---- HSSPLINE='HSSP HOMOLOGY DERIVED SECONDARY STRUCTURE ' // 1 'OF PROTEINS , VERSION 1.0 1991' CALL STRPOS(INFILE,ISTART,IFIN) DATABASE= 'SEQBASE ' // INFILE(ISTART:IFIN) HEADER= ' ' COMPOUND= ' ' SOURCE= ' ' AUTHOR= ' ' CHAINREMARK= ' ' ISOSIGFILE= 'ALL' NPARALINE= 1 LFILTER= .FALSE. C---- TRANS='VLIMFWYGAPSTCHRKQENDBZX!-.' C---- CALL CHOOSE_NAME(EMBLID,NALIGN,CHANGEPOS) LINSERT= .FALSE. LDELETE= .TRUE. SEQSTART= 0 SEQSTOP= 0 PDBSEQ(1)= ' ' PDBNAME= ' ' C---- C---- ------------------------------------------------------------------ C---- C---- C delete the master sequence from aliseq data structure C ( later restore it .. could be done more clever !! ) C---- also returns PDBSEQ= CALL CHANGE_ALISEQ(MAXALIGNS,MAXCORE,MAXINS,MAXRES, 1 LINSERT,LDELETE,CHANGEPOS,NALIGN, + PDBSEQ,SEQSTART,SEQSTOP,PDBNAME, 2 ALILEN,ALISEQ,ALIPOINTER, + IFIR,ILAS,JFIR,JLAS,LALI,NGAP,LGAP,LENSEQ,STRID, 3 IDE,SIM,EXCLUDEFLAG,ACCNUM_HSSP,EMBLID,PROTNAME,INSALI, 5 INSNUMBER) C write(6,'(a)')'xx after change_aliseq pdb=' C write(6,'(40a1)')(PDBSEQ(I),I=SEQSTART,SEQSTOP) C---- br 99-03: tried to be smart, failed C EXCLUDEFLAG(CHANGEPOS)='*' IF ( LDOCLIP ) THEN C remove gaps from master sequence LDELETE= .TRUE. C hack br: 99-12 DO I=1,ALILEN SEQ1(I:I)=PDBSEQ(I) END DO CALL MARK_LOC_RUNS(LDELETE,SEQ1,1,ALILEN,CGAPCHAR, 1 DELPOS,NDELETED,NDELPOS) C .. and delete gapped stretches in ALL alignments; C... save them in "insbuffer" C >>> THIS IS A IRREVERSIBLE CHANGE TO "ALISEQ" <<<<<<<<<<<<<<<<<<<<< IF (NDELPOS .GT. 0) THEN CALL CLIP_ALISEQ(MAXALIGNS,MAXCORE,MAXINS, 1 MAXINSBUF,ALISEQ,ALIPOINTER,NALIGN,CGAPCHAR, 2 ALILEN,IFIR,ILAS,JFIR,JLAS,DELPOS,NDELPOS, 3 INSNUMBER,INSALI,INSPOINTER,INSLEN,INSBEG_1, 4 INSBEG_2,INSBUFFER,ERROR) IF ( ERROR ) STOP DO J = 1,ALILEN PDBSEQ(J)= SEQ1(J:J) ENDDO ELSE C write(6,*)'xx clip 2 alilen=',alilen DO J = 1,ALILEN SEQ1(J:J)=PDBSEQ(J) C write(6,*)'xx pdbseq(j=',j,')=',pdbseq(j) ENDDO ENDIF C WRITE(6,*)'XX AFTER CLIP STRPOS PDB=' C WRITE(6,'(40A1)')(PDBSEQ(I),I=1,ALILEN) ENDIF CALL STRPOS(PDBNAME,ISTART,IFIN) PDBID = PDBNAME(ISTART:IFIN) CPARAMETER(1) = ' CONVERTSEQ of ' // pdbname(istart:IFIN) IF (LHSSP_LONG_ID) THEN NPARALINE=NPARALINE+1 CPARAMETER(NPARALINE)=' LONG-ID : YES' ENDIF LDELETE = .FALSE. DO I= 1,NALIGN SEQ2 = ' ' CALL GET_SEQ_FROM_ALISEQ(ALISEQ,IFIR,ILAS,ALIPOINTER, 1 ALILEN,I,SEQ2,NREAD,ERROR) IF ( ERROR ) STOP CALL IDENTITY(IFIR(I),ILAS(I),CGAPCHAR,SEQ1,SEQ2,THISIDE, 1 NOVERLAP) IDE(I)= THISIDE LALI(I)= ILAS(I)-IFIR(I)+1 CALL MARK_LOC_RUNS(LDELETE,SEQ2,IFIR(I),ILAS(I), 1 CGAPCHAR,DELPOS,LGAP(I),NGAP(I)) SEQ2 = ' ' JFIR(I)= 1 JLAS(I)= LALI(I)-LGAP(I) LENSEQ(I)= JLAS(I) PROTNAME(I)= ' ' EXCLUDEFLAG(I)= ' ' STRID(I)= ' ' ACCNUM_HSSP(I)= ' ' ENDDO DO I = 1,ALILEN SECSTR_HSSP(I)= ' ' SHEETLABEL(I)= ' ' CHAINID(I)= ' ' PDBNO(I)= I COLS(I)= ' ' ACC(I)= 0 BP1(I)= 0 BP2(I)= 0 CONSWEIGHT(I)= 1.0 ENDDO CALL GETDATE(CDATE) ISAFE= 0 NRES= ALILEN C---- C---- get similarity metric and scale according to smin/smax C---- CALL GETSIMMETRIC(NTRANS,TRANS,MAXSTRSTATES, + MAXIOSTATES,NSTRSTATES,NIOSTATES, + NSTRSTATES,NIOSTATES,CSTRSTATES, + CIOSTATES,IORANGE,KSIM,METRICFILE,MATRIX) C---- C---- 98-10: br & rs C---- normalise MATRIX -> SIMCONSERV C---- such that SIMCONSERV has an average of 0 C---- CALL SETCONSERVATION(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES,NIOSTATES,CSTRSTATES,CIOSTATES,IORANGE,KSIM, + METRICFILE,MATRIX,SIMCONSERV) C---- C---- 98-10: br C---- 98-10: br & rs C---- compile conservation weights C---- 99-03: br add sequence similarity C---- C WRITE(6,*)'XX BEFORE CONS PDB=' C WRITE(6,'(40A1)')(PDBSEQ(I),I=1,NRES) CALL GETCONSWEIGHT_BR(MAXSTRSTATES,MAXIOSTATES, + MAXRES,MAXALIGNS,MAXCORE,NTRANS,NASCII,NALIGN,NRES, + IDE,IFIR,ILAS,LALI,ALIPOINTER,PDBSEQ,ALISEQ, + EXCLUDEFLAG,TRANS,MATRIX,SIMCONSERV,MATPOS, + CONSWEIGHT_MIN,CONSWEIGHT) C STOP 'xx after get cons' C---- C---- rescale metric C---- CALL SCALEMETRIC(NTRANS,TRANS,MAXSTRSTATES, + MAXIOSTATES,MATRIX,SMIN,SMAX,MAPLOW,MAPHIGH) C---- xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx C---- C---- 99-03: br add sequence similarity Cxxxxxxxxx wanted to, failed... C---- xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx C---- C CALL GETSEQSIM_BR(MAXSTRSTATES,MAXIOSTATES, C + MAXRES,MAXALIGNS,NTRANS,NASCII,NALIGN,NRES, C + IFIR,ILAS,LALI,ALIPOINTER,PDBSEQ,ALISEQ, C + EXCLUDEFLAG,TRANS,MATRIX,SIMCONSERV,MATPOS,SIM, C + CONSWEIGHT) C---- xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx C---- C---- calculate variability C---- CALL CALC_VAR(NALIGN,ALILEN,PDBSEQ,IDE,IFIR, + ILAS,ALIPOINTER,ALISEQ,EXCLUDEFLAG, + MAXSTRSTATES,MAXIOSTATES,NTRANS,TRANS, + MATRIX,VAR) C---- C---- profiles C---- CALL CALC_PROF(MAXRES,MAXAA,ALILEN,PDBSEQ,NALIGN, + EXCLUDEFLAG,IDE,IFIR,ILAS, + ALISEQ,ALIPOINTER,TRANS, + SEQPROF,NOCC,NDEL,NINS,ENTROPY,RELENT) C---- C---- finally write the HSSP file C---- C---- write header CALL HSSPHEADER(KOUT,OUTNAME,HSSPLINE,PDBID,CDATE, + DATABASE,CPARAMETER,NPARALINE, + ISOSIGFILE,ISAFE,LFORMULA, + HEADER,COMPOUND,SOURCE,AUTHOR,ALILEN, + NCHAIN,KCHAIN,CHAINREMARK,NALIGN) C---- write data (alignments) CALL WRITE_HSSP(KOUT,MAXRES,NALIGN,ALILEN,EMBLID,STRID, + ACCNUM_HSSP,IDE,SIM,IFIR,ILAS,JFIR,JLAS, + LALI,NGAP,LGAP,LENSEQ,PROTNAME,ALIPOINTER, + ALISEQ,PDBNO,CHAINID,PDBSEQ,SECSTR_HSSP,COLS, + BP1,BP2,SHEETLABEL,ACC,NOCC,VAR,SEQPROF, + NDEL,NINS,ENTROPY,RELENT,CONSWEIGHT, + INSNUMBER,INSALI,INSPOINTER,INSLEN, + INSBEG_1,INSBEG_2,INSBUFFER,ISOLEN, + ISOIDE,NSTEP,LFORMULA,LALL,ISAFE, + EXCLUDEFLAG,LCONSERV,LHSSP_LONG_ID) C finally restore aliseq data structure (! THINK OF SOMETHING BETTER ! ) LINSERT = .TRUE. LDELETE = .FALSE. CALL CHANGE_ALISEQ(MAXALIGNS,MAXCORE,MAXINS,MAXRES,LINSERT, 1 LDELETE,CHANGEPOS,NALIGN,PDBSEQ,SEQSTART,SEQSTOP, 2 PDBNAME,ALILEN,ALISEQ,ALIPOINTER,IFIR,ILAS,JFIR, 3 JLAS,LALI,NGAP,LGAP,LENSEQ,STRID,IDE,SIM, 4 EXCLUDEFLAG,ACCNUM_HSSP,EMBLID,PROTNAME,INSALI, 5 INSNUMBER) RETURN END C END MAKE_HSSP C...................................................................... C...................................................................... C SUB MARK_DUPLICATES SUBROUTINE MARK_DUPLICATES(LIST,NENTRIES) C 12.8.93 IMPLICIT NONE C Import INTEGER NENTRIES C Import / Export CHARACTER*(*) LIST(*) C Internal INTEGER ENTRYLEN PARAMETER (ENTRYLEN= 41) INTEGER I,J,ISTART,ISTOP,JSTART,JSTOP,NTIMES CHARACTER*(ENTRYLEN) CSTRING *----------------------------------------------------------------------* DO J = 1,NENTRIES-1 NTIMES = 1 CALL STRPOS(LIST(J),JSTART,JSTOP) DO I = J+1,NENTRIES CALL STRPOS(LIST(I),ISTART,ISTOP) IF ( LIST(J)(JSTART:JSTOP) .EQ. 1 LIST(I)(ISTART:ISTOP) ) THEN NTIMES = NTIMES + 1 CALL CONCAT_STRING_INT(LIST(I),NTIMES,CSTRING) LIST(I)(1:40) = CSTRING(1:40) ENDIF ENDDO ENDDO RETURN END C END MARK_DUPLICATES C...................................................................... C...................................................................... C SUB MARK_LOC_RUNS ***** ------------------------------------------------------------------ ***** SUB MARK_LOC_RUNS ***** ------------------------------------------------------------------ C---- C---- NAME : MARK_LOC_RUNS C---- ARG : 1 LDELETE= logical if true: delete from sequence C---- ARG : 2 SEQUENCE= string C---- ARG : 3 IBEG= first residue position in SEQUENCE C---- ARG : 4 IEND= last residue position in SEQUENCE C---- ARG : --> SEQUENCE(IBEG:IEND) C---- ARG : 5 SYMBOL character to search (count and delete) C---- ARG : 6 RUNPOS C---- ARG : 7 RUNLEN C---- ARG : 8 NRUN number of occurrences of SYMBOL in C---- ARG : SEQUENCE C---- DES : Checks how often SYMBOL occurs in SEQUENCE, possibly C---- DES : deletes it (e.g. remove insertions!) C---- *----------------------------------------------------------------------* SUBROUTINE MARK_LOC_RUNS(LDELETE,SEQUENCE,IBEG,IEND, 1 SYMBOL,RUNPOS,RUNLEN,NRUN) C 5.7.93 C 20.10. only internal runs IMPLICIT NONE C Import C .. "begin", "end" is left unchanged, C even if some positions are actually deleted !! INTEGER IBEG, IEND INTEGER RUNLEN CHARACTER*1 SYMBOL LOGICAL LDELETE C Import/Export CHARACTER*(*) SEQUENCE C Export INTEGER NRUN INTEGER RUNPOS(2,*) C Internal INTEGER LAST_NONSYMBOL,IPOS,JPOS,RLEN,IRUN LOGICAL INSIDE, IS_LOC_RUN ******------------------------------*-----------------------------****** C---- ini NRUN= 0 RUNLEN= 0 INSIDE= .FALSE. IS_LOC_RUN= .FALSE. LAST_NONSYMBOL=IEND C---- C---- loop over all residues of sequence C---- C---- NRUN= number of occurrences of SYMBOL C---- RUNPOS(1,i)= [i=1,NRUN] the position of the ith occurrence C---- RUNPOS(2,i)= [i=1,NRUN] ?? C---- LAST_NONSYMBOL=position of the last NON-hit (counts up) C---- DO IPOS=IBEG,IEND C------- is insertion ('.') IF ( SEQUENCE(IPOS:IPOS) .EQ. SYMBOL .AND. + IS_LOC_RUN .EQV. .TRUE. ) THEN IF ( .NOT. INSIDE ) THEN INSIDE= .TRUE. NRUN= NRUN + 1 RUNPOS(1,NRUN)= IPOS RUNPOS(2,NRUN)= 0 ENDIF RUNPOS(2,NRUN)= RUNPOS(2,NRUN) + 1 RUNLEN= RUNLEN + 1 C------- is NOT insertion ELSE IF ( SEQUENCE(IPOS:IPOS) .NE. SYMBOL ) THEN C---------- reset flag IF ( INSIDE ) INSIDE = .FALSE. IS_LOC_RUN= .TRUE. LAST_NONSYMBOL= IPOS ENDIF ENDDO C---- trailing insertion - DO NOT COUNT C---- -> reduce by 1! IF (NRUN.GT.0) THEN IF ( RUNPOS(1,NRUN) .EQ. LAST_NONSYMBOL+1 ) THEN NRUN=NRUN - 1 ENDIF ENDIF IF ( LDELETE .AND. (NRUN.GT.0) ) THEN RLEN = 0 IRUN = 1 DO IPOS = IBEG,IEND IF ( IPOS .EQ. RUNPOS(1,IRUN) ) THEN DO JPOS = IPOS-RLEN,IEND-RUNPOS(2,IRUN) SEQUENCE(JPOS:JPOS) = 1 SEQUENCE(JPOS+RUNPOS(2,IRUN):JPOS+RUNPOS(2,IRUN)) ENDDO RLEN = RLEN + RUNPOS(2,IRUN) IRUN = IRUN + 1 ENDIF ENDDO ENDIF RETURN END C END MARK_LOC_RUNS C...................................................................... C...................................................................... C SUB SETCONSERVATION SUBROUTINE SETCONSERVATION(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES,NIOSTATES,CSTRSTATES,CIOSTATES,IORANGE,KSIM, + METRIC_FILENAME,MATRIX,SIMCONSERV) C 1. set conservation weights to 1.0 C 2. rescale matrix for the 22 amino residues such that the sum over C the matrix is 0.0 (or near) C this matrix is used to calculate the conservation weights (SIMCONSERV) c implicit none C import CHARACTER*(*) METRIC_FILENAME REAL MATRIX(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) REAL SIMCONSERV(1:26,1:26) INTEGER NSTRSTATES,NIOSTATES,KSIM CHARACTER*(*) CSTRSTATES,CIOSTATES REAL IORANGE(MAXSTRSTATES,MAXIOSTATES) C internal INTEGER NACID,MAXSQ PARAMETER (NACID= 22) PARAMETER (MAXSQ= 9999) C PARAMETER (MAXSQ= 30111) C PARAMETER (MAXSQ= 100000) CHARACTER TRANS*26 INTEGER I,J REAL XLOW,XHIGH,XMAX,XMIN,XFACTOR,SUMMAT REAL CONSWEIGHT_1(MAXSQ),CONSWEIGHT_2(MAXSQ),CONSMIN LOGICAL LFIRSTWEIGHT *----------------------------------------------------------------------* C DO I=1,MAXSQ CONSWEIGHT_1(I)=1.0 ENDDO LFIRSTWEIGHT=.TRUE. C get metric C 98-10 br: already done C CALL GETSIMMETRIC(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, C + NSTRSTATES,NIOSTATES,NSTRSTATES,NIOSTATES, C + CSTRSTATES,CIOSTATES,IORANGE,KSIM,METRIC_FILENAME,MATRIX) c rescale matrix that the sum over matrix is +- 0.0 XLOW= 0.0 XHIGH= 0.0 XMAX= 1.0 XMIN= -1.0 XFACTOR=100.0 C (re)store original values in simconserv() 20 DO J=1,NTRANS DO I=1,NTRANS SIMCONSERV(I,J)=MATRIX(I,J,1,1,1,1) ENDDO ENDDO c scale with xmin/xmax CALL SCALEINTERVAL(SIMCONSERV,NTRANS**2,XMIN,XMAX,XLOW,XHIGH) C RESEt the values for 'X' '!' and '-' I=INDEX(TRANS,'X') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO I=INDEX(TRANS,'!') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO I=INDEX(TRANS,'-') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO I=INDEX(TRANS,'.') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO C calculate sum over matrix (22 amino acids) after scaling SUMMAT=0.0 DO I=1,NACID DO J=1,NACID SUMMAT=SUMMAT+SIMCONSERV(I,J) ENDDO ENDDO cd write(*,*)' sum: ',summat,xmin c check sum=0.0 (+- 0.01) ; if not xmin=xmin/2 ; scale again IF (SUMMAT .GT. 0.01) THEN XMIN=XMIN+(XMIN/XFACTOR) ELSE IF (SUMMAT .LT. -0.01) THEN XMIN=XMIN-(XMIN/XFACTOR) ELSE WRITE(*,*)' SETCONSERVATION: sum over matrix: ',summat WRITE(*,*)' smin is : ',xmin RETURN ENDIF GOTO 20 END C END SETCONSERVATION C...................................................................... C...................................................................... C SUB STRREPLACE SUBROUTINE STRREPLACE(STRING,LENGTH,C1,C2) c Implicit None C replaces all occurences of c1 by c2 C Import INTEGER LENGTH CHARACTER*1 C1, C2 C Import/Export CHARACTER*(*) STRING C Internal INTEGER IPOS DO IPOS = 1,LENGTH IF ( STRING(IPOS:IPOS) .EQ. C1 ) STRING(IPOS:IPOS) = C2 ENDDO RETURN END C END STRREPLACE C...................................................................... C...................................................................... C SUB WRITE_DAF SUBROUTINE WRITE_DAF(KUNIT,INFILE,OUTFILE,MAXALIGNS,MAXRES, 1 MAXCORE,MAXINS,MAXINSBUF,BEGIN,END,ALISEQ, 2 ALIPOINTER,IDE,IFIR,ILAS,EMBLID,WEIGHT, 3 ALILEN,NSEQ,LALI,INSNUMBER,INSALI,INSPOINTER, 4 INSLEN,INSBEG_1,INSBUFFER,ERROR) IMPLICIT NONE C---- import INTEGER MAXALIGNS,MAXRES,MAXCORE,MAXINS,MAXINSBUF, + KUNIT,BEGIN,END,NSEQ,ALILEN,INSNUMBER INTEGER ALIPOINTER(MAXALIGNS),LALI(MAXALIGNS), + IFIR(MAXALIGNS),ILAS(MAXALIGNS), + INSALI(MAXINS),INSPOINTER(MAXINS), + INSLEN(MAXINS),INSBEG_1(MAXINS) CHARACTER*(*) INFILE, OUTFILE CHARACTER*(*) EMBLID(MAXALIGNS) CHARACTER ALISEQ(MAXCORE),INSBUFFER(MAXINSBUF) REAL WEIGHT(MAXALIGNS),IDE(MAXALIGNS) C---- export LOGICAL ERROR C---- internal INTEGER LINELEN,MAXRES_INTERN,MAXALIGNS_INTERN PARAMETER (MAXRES_INTERN= 9999) C PARAMETER (MAXRES_INTERN= 10000) C PARAMETER (MAXRES_INTERN= 30011) PARAMETER (LINELEN= 22000) PARAMETER (MAXALIGNS_INTERN= 8765) C PARAMETER (MAXALIGNS_INTERN= 10000) INTEGER CODELEN,I,J,ISTART,ISTOP,ISEQ,IINS,K, + IPOS,JPOS,KPOS,LENSEQ(MAXALIGNS_INTERN) CHARACTER*8 TIMESTRING CHARACTER*9 DATESTRING CHARACTER*(LINELEN) LINE CHARACTER*(MAXRES_INTERN) STRAND,SEQ1_STRING CHARACTER*50 TRANS C---- ------------------------------------------------------------------ C---- ini ERROR= .FALSE. TRANS= 'VLIMFWYGAPSTCHRKQENDBZX.' C---- try to open outfile; return if unsuccessful CALL OPEN_FILE(KUNIT,OUTFILE,'new,recl=22000',error) C---- error messages are alredy issued by OPEN_FILE IF (ERROR) RETURN IF (NSEQ.GT.MAXALIGNS) THEN WRITE(*,'(1X,A)') 'MAXALIGNS (global) overflow in write_daf !' ERROR = .TRUE. RETURN ENDIF IF (NSEQ.GT.MAXALIGNS_INTERN) THEN WRITE(*,'(1X,A)') 'MAXALIGNS (intern) overflow in write_daf !' ERROR= .TRUE. RETURN ENDIF IF (ALILEN.GT.MAXRES) THEN WRITE(*,'(1X,A)') 'MAXRES (global) overflow in write_daf !' ERROR= .TRUE. RETURN ENDIF IF (ALILEN.GT.MAXRES_INTERN ) THEN WRITE(*,'(1X,A)') 'MAXRES (intern) overflow in write_daf !' ERROR= .TRUE. RETURN ENDIF C---- ------------------------------------------------------------------ C---- start CODELEN = 1 DO I=1,NSEQ CALL STRPOS(EMBLID(I),ISTART,ISTOP) IF (ISTOP.GT.CODELEN) CODELEN=ISTOP+2 ENDDO IF (CODELEN.GT.LEN(EMBLID(1))) CODELEN=LEN(EMBLID(1)) C---- C---- header C---- WRITE(KUNIT,'(A)') '# DAF (dirty alignment format)' CALL STRPOS(INFILE,ISTART,ISTOP) WRITE(KUNIT,'(A,A,A,I4,A,I4)') + '# SOURCE: converted from ', + infile(istart:istop),' from: ',begin,' to: ',end C---- get current date and time CALL GETDATE(DATESTRING) CALL GETTIME(TIMESTRING) WRITE(KUNIT,'(A,A,A)') 1 '# DATE: ',DATESTRING,TIMESTRING WRITE(KUNIT,'(A,A)') '# ALISYM: ',TRANS WRITE(KUNIT,'(A,I6)') '# NPAIRS: ',NSEQ WRITE(KUNIT,'(A)') '#' WRITE(KUNIT,'(A)') '#' WRITE(KUNIT,'(A)') '# ALIGNMENTS' WRITE(KUNIT,'(A)') 'idSeq idStr lenSeq lenStr '// + 'lenAli pide seq str' C---- C---- sequence information C---- DO ISEQ=2,NSEQ SEQ1_STRING=' ' STRAND= ' ' J = 1 JPOS = ALIPOINTER(ISEQ) LENSEQ(ISEQ)=ILAS(ISEQ)-IFIR(ISEQ)+1 DO IPOS = IFIR(ISEQ),ILAS(ISEQ) IF ( ICHAR(ALISEQ(JPOS)) .GT. 96 .AND. + ICHAR(ALISEQ(JPOS)) .LT. 123) THEN STRAND(J:J) = ALISEQ(JPOS) CALL LOWTOUP(STRAND(J:J),1) SEQ1_STRING(J:J) = ALISEQ(IPOS) J = J + 1 JPOS=JPOS+1 DO IINS=1,INSNUMBER IF(INSALI(IINS) .EQ. ISEQ .AND. + INSBEG_1(IINS) .EQ. IPOS)THEN KPOS=INSPOINTER(IINS)+1 DO K=1,INSLEN(IINS) STRAND(J:J) = INSBUFFER(KPOS) SEQ1_STRING(J:J) = '.' J=J+1 KPOS=KPOS+1 ENDDO ENDIF ENDDO ELSE STRAND(J:J) = ALISEQ(JPOS) SEQ1_STRING(J:J) = ALISEQ(IPOS) J = J + 1 JPOS=JPOS+1 ENDIF ENDDO CALL STRPOS(STRAND,I,J) C WRITE(KUNIT,'(A,1X,A,1X,F7.1,1X,A,A,A)') WRITE(KUNIT,'(A,1X,A,1X,I6,I6,I6,F7.1,1X,A,A,A)') + EMBLID(1)(1:CODELEN), + EMBLID(ISEQ)(1:CODELEN), + ALILEN,LALI(ISEQ),LENSEQ(ISEQ), + (IDE(ISEQ)*100.0),SEQ1_STRING(I:J),' ',STRAND(I:J) ENDDO CLOSE(KUNIT) RETURN END C END WRITE_DAF................................................... profphd-utils-1.0.10/convert_seq.pod0000644015075101507510000000242512012371465016727 0ustar lkajanlkajan=pod =head1 NAME convert_seq - conversion of sequence and alignment formats =head1 SYNOPSIS convert_seq [OPTION] =head1 DESCRIPTION convert_seq is used for sequence and alignment format conversion. Instead of using command line arguments it starts a conversation with the user where the desired function can be invoked by answering questions. This program is used by the prof(1) secondary structure, accessibility and transmembrane helix predictor from Burkhard Rost. =head1 OPTIONS None. Answer questions printed on STDOUT by typing your choice into STDIN. =head1 AUTHORS =over =item Reinhard Schneider Mar, 1991 version 1.0 LION http://www.lion-ag/ Heidelberg =item Ulrike Goebel Mar, 1997 version 1.1 LION http://www.lion-ag/ Heidelberg =item Reinhard Schneider Mar, 1997 version 2.0 LION http://www.lion-ag/ Heidelberg =item Burkhard Rost May, 1998 version 2.1 EMBL/LION Heidelberg =item Burkhard Rost Oct, 1998 version 2.2 =item Laszlo Kajan TU Muenchen, Germany =item Guy Yachdav CUBIC (Columbia University, NY, USA), Technical University Munich (Munich, DE), BioSof LLC (USA) =back =head1 SEE ALSO prof(1), filter_hssp(1) =cut profphd-utils-1.0.10/dead.f0000644015075101507510000017445012012371464014746 0ustar lkajanlkajanC....................................................................... C SUB COPY_FIELD SUBROUTINE COPY_FIELD(CIN,COUT,IFIELD,MAXFIELD) c implicit none CHARACTER*(*) CIN(*),COUT INTEGER IFIELD,MAXFIELD C internal INTEGER IBEG,IEND IF (IFIELD+1 .GT. MAXFIELD) THEN CALL STRPOS(CIN(IFIELD),IBEG,IEND) WRITE(*,*)'**** NO VALUE GIVEN FOR: ',CIN(IFIELD)(IBEG:IEND) ELSE CALL STRPOS(CIN(IFIELD+1),IBEG,IEND) IF (IEND .GE. 1) THEN COUT=CIN(IFIELD+1)(IBEG:IEND) IFIELD=IFIELD+2 ENDIF ENDIF RETURN END C END COPY_FIELD C....................................................................... C....................................................................... C SUB CALC_PROFILE SUBROUTINE CALC_PROFILE(MAXSQ,MAXAA,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2, + NIOSTATES_2, + SCALE_FACTOR,LOG_BASE,SIGMA,BETA, + NRES,NALIGN, + AL_EXCLUDEFLAG,AL_IFIRST,AL_ILAST, + SEQBUFFER,ISEQPOINTER,NTRANS,TRANS, + SEQ_WEIGHT,OPEN_1,ELONG_1, + GAPOPEN_1, + GAPELONG_1,SIMORG,SIMMETRIC_1) INTEGER NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2,NIOSTATES_2 c common/strstates/ nstrstates_1,niostates_1,nstrstates_2, c + niostates_2 C import INTEGER MAXSQ,MAXAA,NRES,NALIGN,NTRANS REAL SCALE_FACTOR,LOG_BASE REAL SIGMA,BETA INTEGER AL_IFIRST(*),AL_ILAST(*),ISEQPOINTER(*) CHARACTER SEQBUFFER(*) CHARACTER TRANS*(*),AL_EXCLUDEFLAG(*) REAL SEQ_WEIGHT(*),OPEN_1,ELONG_1,GAPOPEN_1(*), + GAPELONG_1(*) REAL SIMORG(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) C export REAL SIMMETRIC_1(MAXSQ,NTRANS) C internal INTEGER MAXTRANS PARAMETER (MAXTRANS=26) INTEGER IRES,IALIGN,I,J REAL FREQUENCY(MAXTRANS) REAL BTOD,BTON,ZTOE,ZTOQ REAL XOCC,XINS,XDEL CHARACTER C1 REAL SIM_COPY(MAXTRANS,MAXTRANS) REAL INV(MAXTRANS,MAXTRANS) INTEGER INDX(MAXTRANS) REAL PROB_I(MAXTRANS) C================ CAUTION: pass the following values from outside CAUTION only for BLOSUM c scale_factor = 0.5 c log_base = 2.0 c scale_factor = 1.0 c log_base = 10.0 c beta=1.0 ; sigma=1.0 C 'B' and 'Z' are assigned as well to the acid as to the amide form C with respect to their frequency in EMBL/SWISSPROT 21.0 BTOD=0.524 BTON=0.445 ZTOE=0.626 ZTOQ=0.407 ILEN=LEN(TRANS) WRITE(*,*)'calc_profile' C check if MAXHOM tries do more than we implemented here :-) IF (NTRANS .GT. MAXTRANS) THEN WRITE(*,*)' WARNING: NTRANS GT MAXTRANS' WRITE(*,*)' update routine: calc_profile !!!' STOP ENDIF IF (NSTRSTATES_1 .GT. 1 .OR. NIOSTATES_1 .GT. 1 .OR. + NSTRSTATES_2 .GT. 1 .OR. NIOSTATES_2 .GT. 1) THEN WRITE(*,*)' WARNING: routine calc_profile not' WRITE(*,*)' working with STR and/or I/O dependent' WRITE(*,*)' metrices, update routine !!!' STOP ENDIF C copy "simorg" in "sim_copy" so "simorg" will be unchanged ! DO I=1,NTRANS DO J=1,NTRANS SIM_COPY(I,J)= SIMORG(I,J,1,1,1,1) ENDDO ENDDO C scale metric if necessary DO I=1,MAXTRANS DO J=1,MAXTRANS SIM_COPY(I,J)=SIM_COPY(I,J) * SCALE_FACTOR ENDDO ENDDO C de-log the matrix to get the ( P(i,j) / ( P(i) * P(j) ) ) DO I=1,MAXTRANS DO J=1,MAXTRANS SIM_COPY(I,J)= LOG_BASE ** SIM_COPY(I,J) ENDDO ENDDO C build diagonal matrix DO I=1,NTRANS DO J=1,NTRANS INV(I,J)=0.0 ENDDO INV(I,I) =1.0 ENDDO C invert matrix C NOTE: sim_copy gets changed CALL LUDCMP(SIM_COPY,MAXAA,MAXTRANS,INDX,D) DO I=1,MAXAA CALL LUBKSB(SIM_COPY,MAXAA,MAXTRANS,INDX,INV(1,I)) ENDDO C normalize to 1.0 to get the P(i) DO I=1,MAXAA SUM=0.0 DO J=1,MAXAA SUM= SUM + INV(I,J) ENDDO PROB_I(I)=SUM DO J=1,MAXAA INV(I,J)=INV(I,J) /SUM ENDDO C check SUM=0.0 DO J=1,MAXAA SUM= SUM + INV(I,J) ENDDO CALL CHECKREALEQUALITY(SUM,1.0,0.002,'sum','calc_profile') ENDDO C restore sim_copy (changed by matrix inverse) DO I=1,MAXTRANS DO J=1,MAXTRANS SIM_COPY(I,J)= SIMORG(I,J,1,1,1,1) ENDDO ENDDO C scale metric DO I=1,MAXTRANS DO J=1,MAXTRANS SIM_COPY(I,J)=SIM_COPY(I,J) * SCALE_FACTOR ENDDO ENDDO C de-log the matrix and multiply by P(i) to get the conditional probabilities: C ( P(i,j) | P(j) ) DO I=1,MAXTRANS DO J=1,MAXTRANS SIM_COPY(I,J)= ( LOG_BASE ** SIM_COPY(I,J) ) * PROB_I(I) ENDDO ENDDO C check sum rule DO J=1,MAXAA SIM=0.0 DO I=1,MAXAA SIM = SIM + SIM_COPY(I,J) ENDDO c write(*,*)'sum P(i,j) | P(j): ',j,sim CALL CHECKREALEQUALITY(SIM,1.0,0.002,'sim','calc_profile') ENDDO C calculate sequence profile DO IRES=1,NRES XOCC=0.0 XINS=0.0 XDEL=0.0 DO I=1,MAXTRANS FREQUENCY(I)=0.0 ENDDO DO IALIGN=1,NALIGN IF (IRES .GE. AL_IFIRST(IALIGN) .AND. + IRES .LE. AL_ILAST(IALIGN) + .AND. AL_EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN C1=SEQBUFFER( ISEQPOINTER(IALIGN) + + IRES-AL_IFIRST(IALIGN) ) C if lower case character: insertions IF (C1 .GE. 'a' .AND. C1 .LE. 'z') THEN C1=CHAR( ICHAR(C1)-32 ) XINS=XINS + SEQ_WEIGHT(IALIGN) ENDIF IF (C1 .NE. '.') THEN XOCC=XOCC + SEQ_WEIGHT(IALIGN) IF (INDEX('BZ',C1).EQ.0) THEN I=INDEX(TRANS,C1) IF (I .LE. 0 .OR. I .GT. ILEN) THEN WRITE(*,*)' GETFREQUENCY: UNKNOWN RES : ',C1 ELSE FREQUENCY(I)=FREQUENCY(I) + SEQ_WEIGHT(IALIGN) ENDIF ELSE IF (C1 .EQ. 'B') THEN WRITE(*,*)' GETFREQUENCY: convert B' I=INDEX(TRANS,'D') J=INDEX(TRANS,'N') FREQUENCY(I)=FREQUENCY(I)+(BTOD*SEQ_WEIGHT(IALIGN)) FREQUENCY(J)=FREQUENCY(J)+(BTON*SEQ_WEIGHT(IALIGN)) ELSE IF (C1 .EQ. 'Z') THEN WRITE(*,*)' GETFREQUENCY: convert Z' I=INDEX(TRANS,'E') J=INDEX(TRANS,'Q') FREQUENCY(I)=FREQUENCY(I)+(ZTOE*SEQ_WEIGHT(IALIGN)) FREQUENCY(J)=FREQUENCY(J)+(ZTOQ*SEQ_WEIGHT(IALIGN)) ENDIF ELSE C if '.' : deletion XDEL=XDEL+ SEQ_WEIGHT(IALIGN) ENDIF ENDIF ENDDO C====================== C profile SUM= 0.0 DO I=1,MAXAA SUM = SUM + FREQUENCY(I) ENDDO IF (SUM .NE. 0.0) THEN DO I=1,MAXAA FREQUENCY(I)= FREQUENCY(I) / SUM ENDDO C check sum rule for frequencies X=0.0 DO I=1,MAXAA X = X + FREQUENCY(I) ENDDO CALL CHECKREALEQUALITY(X,1.0,0.002,'freq','calc_profile') C smooth the profile C sigma: smooth dependent on the number of alignments C beta: mixing of the two models (expected <--> observed) SMOOTH= ( SUM / (SUM +SIGMA)) * BETA C do for each of the AA types in a row DO I=1,MAXAA SIM=0.0 C sum up the conditional probabilities DO J=1,MAXAA SIM = SIM + ( FREQUENCY(J) * SIM_COPY(I,J) ) ENDDO C add the observed frequencies and smooth SIMMETRIC_1(IRES,I)=( (1-SMOOTH) * SIM) + + (SMOOTH * FREQUENCY(I) ) C divide by the expected probability SIMMETRIC_1(IRES,I)=SIMMETRIC_1(IRES,I)/PROB_I(I) c simmetric_1(ires,i)= frequency(i) /prob_i(i) C log-odd IF (SIMMETRIC_1(IRES,I) .LE. 10E-3) THEN SIMMETRIC_1(IRES,I)=10E-3 ENDIF SIMMETRIC_1(IRES,I)=LOG10 ( SIMMETRIC_1(IRES,I) ) c write(*,*)ires,trans(i:i),sum,frequency(i), c + sim,smooth,simmetric_1(ires,i) C gap-weights GAPOPEN_1(IRES) =OPEN_1 / (1.0 + ((XINS+XDEL)/SUM)) GAPELONG_1(IRES)=ELONG_1 / (1.0 + ((XINS+XDEL)/SUM)) ENDDO ELSE WRITE(*,*)'CALC_PROFILE: position not occupied !' C1=SEQBUFFER( ISEQPOINTER(1)+IRES-AL_IFIRST(1) ) WRITE(*,*)' sequence symbol of first sequence: ',c1 WRITE(*,*)' set profile row to 0.0' DO I=1,MAXAA SIMMETRIC_1(IRES,I)=0.0 ENDDO GAPOPEN_1(IRES) = 0.0 GAPELONG_1(IRES)= 0.0 ENDIF ENDDO C set value for chain breaks etc... to 0.0 C later there are refilled in MAXHOM (like "!" = -200.0) IX=INDEX(TRANS,'X') IB=INDEX(TRANS,'B') IZ=INDEX(TRANS,'Z') I1=INDEX(TRANS,'!') I2=INDEX(TRANS,'-') DO IRES=1,NRES SIMMETRIC_1(IRES,IX)=0.0 SIMMETRIC_1(IRES,IB)=0.0 SIMMETRIC_1(IRES,IZ)=0.0 SIMMETRIC_1(IRES,I1)=0.0 SIMMETRIC_1(IRES,I2)=0.0 ENDDO RETURN END C END CALC_PROFILE C....................................................................... C....................................................................... C SUB HSSP SUBROUTINE HSSP(NALIGN,CHAINREMARK) C write XX.HSSP files using alignment-data RS 1988/89 IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C import INTEGER NALIGN CHARACTER*(*) CHAINREMARK C internal c character*1 csq_1_array(maxsq) CHARACTER CTEMP*128,HSSPFILE*200,CDATE*9 CHARACTER*132 HSSPLINE,DATABASE,CPARAMETER(10) REAL RMIN,RMAX REAL RELEASE INTEGER NPARALINE,NRES,LRES,NENTRIES,NRESIDUE INTEGER I,J,ISTART,ISTOP c initialize DO I=1,NALIGN AL_EXCLUDEFLAG(I)=' ' ENDDO DO J=1,MAXSQ DO I=1,MAXPROFAA AL_SEQPROF(J,I)=0 ENDDO ENDDO DO J=1,MAXSQ AL_VARIABILITY(J)=0 AL_ENTROPY(J)=0 NOCC_1(J)=0 AL_NDELETION(J)=0 AL_NINS(J)=0 ENDDO C HSSP release note WRITE(HSSPLINE,'(A)')'HSSP HOMOLOGY DERIVED SECONDARY'// + ' STRUCTURE OF PROTEINS , VERSION 1.0 1991' C get swissprot release CALL SWISSPROTRELEASE(KREL,RELNOTES,RELEASE,NENTRIES,NRESIDUE) WRITE(DATABASE,'(A,F4.1,A,I6,A)')'SEQBASE RELEASE ',release, + ' OF EMBL/SWISS-PROT WITH ',nentries,' SEQUENCES' c get actual date CDATE=' ' CALL GETDATE (CDATE) C get GCG-metric file and scale between 0.0 and 1.0 RMIN=0.0 RMAX=1.0 CALL GETSIMMETRIC(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2, + NIOSTATES_2,CSTRSTATES,CIOSTATES, + IORANGE,KSIM,METRIC_HSSP_VAR,SIMORG) IF (NSTRSTATES_1 .NE. 1 .OR. NIOSTATES_1 .NE. 1) THEN WRITE(*,*)'**** ERROR: NSTRSTATES_1 OR NIOSTATES_1 .GT. 1' WRITE(*,*)'CHANGE CALC_VAR ROUTINE' ENDIF CALL SCALEMETRIC(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + SIMORG,RMIN,RMAX,0.0,0.0) C write alignment parameter in CPARAMETER (passed to HSSPHEADER) NPARALINE=1 WRITE(CPARAMETER(NPARALINE),'(A,F4.1,A,F4.1)') + ' SMIN: ',SMIN,' SMAX: ',SMAX NPARALINE=NPARALINE+1 IF (OPENWEIGHT_ANSWER .EQ. 'PROFILE' ) THEN WRITE(CPARAMETER(NPARALINE),'(A,A)') + ' gap-open: profile',' gap-elongation: profile' ELSE WRITE(CPARAMETER(NPARALINE),'(A,F4.1,A,F4.1)') + ' gap-open: ',gapopen_1(1),' gap-elongation: ',gapelong_1(1) ENDIF IF (LPROFILE_1 .OR. LPROFILE_2) THEN NPARALINE=NPARALINE+1 WRITE(CPARAMETER(NPARALINE),'(A,A)') + ' profile from : ',name_1(1:100) ENDIF NPARALINE=NPARALINE+1 IF (LCONSERV_1 .OR. LCONSERV_2 .OR. LCONSIMPORT) THEN WRITE(CPARAMETER(NPARALINE),'(A)') + ' conservation weights: YES' ELSE WRITE(CPARAMETER(NPARALINE),'(A)') + ' conservation weights: NO' ENDIF NPARALINE=NPARALINE+1 IF (LINSERT_1) THEN WRITE(CPARAMETER(NPARALINE),'(A)') + 'InDels in secondary structure allowed: YES' ELSE WRITE(CPARAMETER(NPARALINE),'(A)') + 'InDels in secondary structure allowed: NO' ENDIF NPARALINE=NPARALINE+1 CALL CONCAT_STRINGS(' alignments sorted according to : ', + CSORTMODE,CPARAMETER(NPARALINE) ) IF (LHSSP_LONG_ID .EQV. .TRUE.) THEN NPARALINE=NPARALINE+1 CALL CONCAT_STRINGS(' LONG-ID : ', + HSSP_FORMAT_ANSWER,CPARAMETER(NPARALINE) ) ENDIF IF (HSSP_ANSWER .EQ. 'YES') THEN CALL CONCAT_STRINGS(HSSPID_1,'.hssp',HSSPFILE) ELSE HSSPFILE=HSSP_ANSWER ENDIF BRKID_1=HSSPID_1(1:4) C IF (.NOT. LDSSP_1 ) THEN NRES=N1 LRES=N1 HEADER_1=NAME_1(1:40) COMPND_1=' ' SOURCE_1=' ' AUTHOR_1=' ' ELSE NRES=N1 LRES=NRES-NCHAINUSED+1 ENDIF DO I=1,N1 CSQ_1_ARRAY(I)=CSQ_1(I:I) ENDDO CALL CALC_VAR(NALIGN,NRES,CSQ_1_ARRAY,AL_HOM, + AL_IFIRST,AL_ILAST,ISEQPOINTER, + SEQBUFFER,AL_EXCLUDEFLAG,MAXSTRSTATES,MAXIOSTATES, + NTRANS,TRANS,SIMORG,AL_VARIABILITY) CALL CALC_PROF(MAXSQ,MAXPROFAA,NRES,CSQ_1_ARRAY,NALIGN, + AL_EXCLUDEFLAG,AL_HOM,AL_IFIRST, + AL_ILAST,SEQBUFFER,ISEQPOINTER,TRANS,AL_SEQPROF, + NOCC_1,AL_NDELETION,AL_NINS,AL_ENTROPY,AL_RELENT) IF (CHAINREMARK .NE. ' ') THEN CTEMP=' ' I=INDEX(CHAINREMARK,'!') IF (I .NE. 0) THEN WRITE(CTEMP,'(A)')CHAINREMARK(I+2:) ENDIF CHAINREMARK=' ' CALL STRPOS(CTEMP,ISTART,ISTOP) WRITE(CHAINREMARK,'(A)')CTEMP(1:ISTOP) ENDIF CALL HSSPHEADER(KHSSP,HSSPFILE,HSSPLINE,HSSPID_1,CDATE,DATABASE, + CPARAMETER,NPARALINE,ISOSIGFILE,ISAFE,LFORMULA, + HEADER_1,COMPND_1,SOURCE_1,AUTHOR_1,LRES, + NCHAIN_1,NCHAINUSED,CHAINREMARK,NALIGN) CALL WRITE_HSSP(KHSSP,MAXSQ,NALIGN,NRES,AL_EMBLPID, + AL_PDB_POINTER,AL_ACCESSION,AL_HOM,AL_SIM, + AL_IFIRST,AL_ILAST,AL_JFIRST,AL_JLAST,AL_HOMLEN, + AL_NGAP,AL_LGAP,AL_LSEQ_2,AL_COMPOUND, + ISEQPOINTER,SEQBUFFER,PDBNO_1,CHAINID_1, + CSQ_1_ARRAY,STRUC_1,COLS_1,BP1_1,BP2_1, + SHEETLABEL_1,NSURF_1,NOCC_1,AL_VARIABILITY, + AL_SEQPROF,AL_NDELETION,AL_NINS,AL_ENTROPY, + AL_RELENT,CONSWEIGHT_1,INSNUMBER,INSALI, + INSPOINTER,INSLEN,INSBEG_1,INSBEG_2,INSBUFFER, + ISOLEN,ISOIDE,NSTEP,LFORMULA,LALL,ISAFE, + AL_EXCLUDEFLAG,LCONSERV_1,LHSSP_LONG_ID) RETURN END C END HSSP C....................................................................... C....................................................................... C SUB READ_FILENAME SUBROUTINE READ_FILENAME(KUNIT,FILENAME,LENDFILE,LERROR) CHARACTER*(*) FILENAME INTEGER KUNIT LOGICAL LENDFILE,LERROR LENDFILE=.FALSE. LERROR=.FALSE. FILENAME=' ' READ(KUNIT,'(A)',END=100,ERR=200)FILENAME RETURN 100 LENDFILE=.TRUE. RETURN 200 LERROR=.TRUE. RETURN END C END READ_FILENAME C....................................................................... C....................................................................... C SUB RECEIVE_RESULTS_FROM_NODE SUBROUTINE RECEIVE_RESULTS_FROM_NODE(NALIGN) C import INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C export INTEGER NALIGN C local for each node c integer irecpoi(maxaligns),ifilepoi(maxaligns) c real alisortkey(maxaligns),len2_orig(maxaligns) C internal INTEGER IWORKER,IALIGN,IBEG,IEND INTEGER IALIGN_GOOD_ALL C receive result from nodes and store in GLOBAL space ILINK=1 LOGSTRING=' ' IALIGN_GOOD_ALL = IALIGN_GOOD WRITE(*,*)' receive results: nalign nworker ', nalign, + NWORKER,IALIGN_GOOD CALL FLUSH_UNIT(6) c msgtype=idtop DO IWORKER=1,NWORKER MSGTYPE=4000 ILINK=IWORKER CALL MP_RECEIVE_DATA(MSGTYPE,LINK(ILINK)) CALL MP_GET_INT4(MSGTYPE,LINK(ILINK),IALIGN,N_ONE) CALL MP_GET_INT4(MSGTYPE,LINK(ILINK),IALIGN_GOOD, + N_ONE) IALIGN_GOOD_ALL = IALIGN_GOOD_ALL + IALIGN_GOOD WRITE(*,*)ILINK,' IALIGN/GOOD/GOOD_ALL', + IALIGN,IALIGN_GOOD,IALIGN_GOOD_ALL CALL FLUSH_UNIT(6) IF (IALIGN .GT. 0) THEN MSGTYPE=5000 IBEG=NALIGN IEND=NALIGN+IALIGN-1 IF (IEND .GT. MAXALIGNS) THEN WRITE(*,*)'FATAL ERROR: MAXALIGNS OVERFLOW, INCREASE !!' CALL FLUSH_UNIT(6) STOP ENDIF CALL MP_RECEIVE_DATA(MSGTYPE,LINK(ILINK)) CALL MP_GET_REAL4(MSGTYPE,LINK(ILINK), + ALISORTKEY(IBEG),IALIGN) CALL MP_GET_INT4(MSGTYPE,LINK(ILINK), + IRECPOI(IBEG),IALIGN) CALL MP_GET_INT4(MSGTYPE,LINK(ILINK), + IFILEPOI(IBEG),IALIGN) NALIGN=NALIGN+IALIGN WRITE(LOGSTRING,'(A,4(I6))')' pid / done : ', + IWORKER,IALIGN ELSE WRITE(LOGSTRING,'(A,I6,I6)')'nothing found: ', + IWORKER,IALIGN ENDIF CALL LOG_FILE(KLOG,LOGSTRING,0) CALL FLUSH_UNIT(6) ENDDO NALIGN=NALIGN-1 IALIGN_GOOD=IALIGN_GOOD_ALL WRITE(*,*)' total done : ',nalign,ialign_good CALL FLUSH_UNIT(6) RETURN END C END RECEIVE_RESULTS_FROM_NODE C....................................................................... C....................................................................... C SUB SEND_ALI_REQUEST SUBROUTINE SEND_ALI_REQUEST(IWORKER,IRECORD,IMSGTAG,CHECKVAL) IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' c input INTEGER IWORKER,IRECORD,IMSGTAG REAL CHECKVAL MSGTYPE=6000 CALL MP_INIT_SEND() CALL MP_PUT_INT4(MSGTYPE,IWORKER,IRECORD,N_ONE) CALL MP_PUT_INT4(MSGTYPE,IWORKER,IMSGTAG,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,IWORKER,CHECKVAL,N_ONE) CALL MP_SEND_DATA(MSGTYPE,LINK(IWORKER)) RETURN END C END SEND_ALI_REQUEST C....................................................................... C....................................................................... C SUB SEND_JOBS C get "ready" signal from node and send "nfile" jobs SUBROUTINE SEND_JOBS(LH1,LH2,NFILE,NALIGN,NENTRIES, + NAMINO_ACIDS) C import IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' INTEGER NFILE,NENTRIES,NAMINO_ACIDS c import REAL LH1(0:MAXMAT) INTEGER*2 LH2(0:MAXTRACE) c real lh(0:maxmat*2) C internal INTEGER IFLAG c integer inix INTEGER IFILE,JFILE,ISET,ILINK,I,NALIGN,IALIGN,IFIRST_ROUND INTEGER NRECORD,IPOINTER INTEGER IDONE(MAXPROC) c integer ipos,nsplit,isplit c logical lendbase,ldb_read_one LOGICAL LERROR,LENDFILE CHARACTER*80 FILENAME C init FILENAME=' ' JFILE=0 ILINK=1 MSGTYPE=0 DO I=1,MAXPROC IDONE(I)=0 ENDDO NALIGN=1 IALIGN=0 IFILE=1 IFIRST_ROUND=0 NRECORD=0 ISET=0 IPOINTER=1 IALIGN_GOOD=0 c ldb_read_one=.false. c$$$ if (ldb_read_one .eqv. .true.) THEN c$$$ nbuffer_len = 6 + len(name_2) + len(compnd_2) + c$$$ + len(ACCESSION_2) + len(pdbref_2) c$$$ lfirst_scan=.false. c$$$ nsplit=namino_acids / (nworker +1) c$$$ c$$$ iset=0 ; isplit=0 c$$$ c$$$ do while( ifile .le. nfile) c$$$ lendbase=.false. c$$$ call open_sw_data_file(kbase,ifile,split_db_data, c$$$ + split_db_path) c$$$c write(*,*)ifile,nseq_warm_start,isplit,nsplit c$$$c call flush_unit(6) c$$$ c$$$ do while(lendbase .eqv. .false.) c$$$ call get_swiss_entry(maxsq,kbase,lbinary,n2in, c$$$ + name_2,compnd_2, c$$$ + ACCESSION_2,pdbref_2,csq_2,lendbase) c$$$ c$$$ if (lendbase .eqv. .false.) THEN c$$$ IF ( (ipointer + nbuffer_len + n2in) .gt. c$$$ + maxdatabase_buffer) THEN c$$$ write(*,*)' **** FATAL ERROR ****' c$$$ write(*,*)' database_buffer overflow increase' c$$$ write(*,*)' dimension of MAXDATABASE_BUFFER' c$$$ STOP c$$$ endif c$$$ write(cbuffer_line(1:),'(i6,a,a,a,a)') c$$$ + n2in,name_2, c$$$ + compnd_2,ACCESSION_2,pdbref_2 c$$$ do ipos=1,nbuffer_len c$$$ cdatabase_buffer(ipointer)= c$$$ + cbuffer_line(ipos:ipos) c$$$ ipointer=ipointer+1 c$$$ enddo c$$$ do ipos=1,n2in c$$$ cdatabase_buffer(ipointer)=csq_2(ipos:ipos) c$$$ ipointer=ipointer+1 c$$$ enddo c$$$ isplit=isplit+n2in c$$$ nseq_warm_start=nseq_warm_start+1 c$$$ if ( (isplit .ge. nsplit) .and. c$$$ + (iset .le. nworker) ) then c$$$ iset=iset+1 ; ipointer=ipointer-1 c$$$ write(*,'(a,i6,i8,i10,i8)') c$$$ + 'internal buffer: ',iset,nseq_warm_start, c$$$ + ipointer,isplit c$$$ call flush_unit(6) c$$$ c$$$ msgtype=8000 ; ilink=iset c$$$ call mp_init_send() c$$$ call mp_put_int4(msgtype,ilink,ipointer,n_one) c$$$ call mp_put_int4(msgtype,ilink,nseq_warm_start, c$$$ + n_one) c$$$ call mp_send_data(msgtype,link(ilink)) c$$$ msgtype=9000 c$$$ call mp_init_send() c$$$ call mp_put_string_array(msgtype,ilink, c$$$ + cdatabase_buffer,ipointer) c$$$ call mp_send_data(msgtype,link(ilink)) c$$$ ipointer=1 ; nseq_warm_start=0 ; isplit=0 c$$$ endif c$$$ else c$$$ close(kbase) ; ifile=ifile+1 c$$$ endif c$$$ enddo c$$$ enddo c$$$ msgtype=10000 c$$$ call mp_init_send() c$$$ call mp_put_int4(msgtype,ilink,ipointer,n_one) c$$$ call mp_cast(nworker,msgtype,link(1)) c$$$ endif CALL GET_CPU_TIME('time init:',IDPROC, + ITIME_OLD,ITIME_NEW,TOTAL_TIME,LOGSTRING) CALL LOG_FILE(KLOG,LOGSTRING,2) IPOINTER=1 IF (LISTOFSEQ_2 .EQV. .FALSE.) THEN IF (LFIRST_SCAN .EQV. .TRUE.) THEN DO WHILE (IFILE .LE. NFILE ) MSGTYPE=2000 ILINK=-1 c call mp_receive_data(msgtype,ilink) c call mp_get_int4(msgtype,ilink,ilink,n_one) C first test for messages CALL MP_PROBE(MSGTYPE,IFLAG) C if no communication is necessary do some "real" work IF ( IFLAG.EQ.0 .AND. IFILE .GE. NWORKSET*MAXQUEUE) THEN WRITE(LOGSTRING,*)' file to host: ',ifile CALL LOG_FILE(KLOG,LOGSTRING,1) CALL HOST_INTERFACE(LH1,LH2,IFILE,FILENAME,IALIGN, + NRECORD,IPOINTER) IFILE=IFILE+1 IFIRST_ROUND=1 c we have to fill the work-queue ELSE MSGTYPE=2000 CALL MP_RECEIVE_DATA(MSGTYPE,ILINK) CALL MP_GET_INT4(MSGTYPE,ILINK,ILINK,N_ONE) CALL MP_INIT_SEND() C when we communicate the fist time, we fill the queue IF (IFIRST_ROUND .EQ. 0) THEN ISET=ISET+1 JFILE=ISET DO I=1,MAXQUEUE c write(*,'(a,i4,a,i4)')' file: ',jfile,' to: ',ilink c call flush_unit(6) MSGTYPE=3000 CALL MP_PUT_INT4(MSGTYPE,ILINK,JFILE,N_ONE) JFILE=JFILE+NWORKSET IFILE=IFILE+1 ENDDO CALL MP_SEND_DATA(MSGTYPE,LINK(ILINK)) C send one file-pointer to refill the work-queue ELSE c write(*,'(a,i4,a,i4)')' file: ',ifile,' to: ',ilink c call flush_unit(6) MSGTYPE=3000 CALL MP_PUT_INT4(MSGTYPE,ILINK,IFILE,N_ONE) IFILE=IFILE+1 CALL MP_SEND_DATA(MSGTYPE,LINK(ILINK)) ENDIF ENDIF ENDDO LFIRST_SCAN=.FALSE. C now tell everybody that the work is done ISET=0 DO WHILE (ISET .LT. NWORKSET) MSGTYPE=2000 ILINK=-1 CALL MP_RECEIVE_DATA(MSGTYPE,ILINK) CALL MP_GET_INT4(MSGTYPE,ILINK,ILINK,N_ONE) IF (IDONE(ILINK) .EQ. 0) THEN c write(*,'(a,i4)')' last from: ',ilink ; call flush_unit(6) ISET=ISET+1 IDONE(ILINK)=1 MSGTYPE=3000 IFILE=-1 CALL MP_INIT_SEND() CALL MP_PUT_INT4(MSGTYPE,ILINK,IFILE,N_ONE) CALL MP_SEND_DATA(MSGTYPE,LINK(ILINK)) c else c write(*,'(a,i4)')' collect dead message: ',ilink c call flush_unit(6) ENDIF ENDDO WRITE(LOGSTRING,'(a,i6,i8,i10)')'internal buffer: ', + IDPROC,NSEQ_WARM_START,IPOINTER CALL LOG_FILE(KLOG,LOGSTRING,1) ELSE CALL HOST_INTERFACE(LH1,LH2,IFILE,FILENAME,IALIGN, + NRECORD,IPOINTER) ENDIF ELSE C =================================================================== C list of filenames C =================================================================== IFILE=0 WRITE(*,*)' load work queue: ',ilink LENDFILE=.FALSE. LERROR=.FALSE. DO ILINK=1,NWORKSET DO I=1,MAXQUEUE_LIST IF ( (LENDFILE .EQV. .FALSE.) .AND. + (LERROR .EQV. .FALSE. ) ) THEN CALL READ_FILENAME(KLIS2,FILENAME,LENDFILE, + LERROR) ENDIF IF ( (LENDFILE .EQV. .TRUE.) .OR. + (LERROR .EQV. .TRUE. ) ) THEN FILENAME='STOP' ENDIF WRITE(*,'(A,A,A,I4)')'file: ',filename(1:50), + ' to: ',ILINK CALL FLUSH_UNIT(6) MSGTYPE=9000 CALL MP_INIT_SEND() CALL MP_PUT_STRING(MSGTYPE,ILINK,FILENAME, + LEN(FILENAME)) CALL MP_SEND_DATA(MSGTYPE,LINK(ILINK)) IF ( (LENDFILE .EQV. .TRUE.) .OR. + (LERROR .EQV. .TRUE. ) ) THEN GOTO 500 ENDIF IFILE=IFILE+1 ENDDO ENDDO DO WHILE (.TRUE. ) MSGTYPE=2000 ILINK=-1 C first test for messages CALL MP_PROBE(MSGTYPE,IFLAG) C if no communication is necessary do some "real" work c IF ( iflag .eq. 0 ) THEN c ifirst_round=1 c IF ( iflag .eq. 0 .and. c + ifile .ge. (nworkset * maxqueue_list) ) THEN c ifirst_round=1 c call read_filename(klis2,filename,lendfile,lerror) c IF (lendfile .eqv. .true. .or. lerror .eqv. .true.) THEN c filename='STOP' c goto 500 c endif c write(logstring,*)' host is working on file: ',ifile c call log_file(klog,logstring,1) c call host_interface(lh1,lh2,ifile,filename,ialign, c + nrecord,ipointer) c ifile=ifile+1 c we have to fill the work-queue c else MSGTYPE=2000 CALL MP_RECEIVE_DATA(MSGTYPE,ILINK) CALL MP_GET_INT4(MSGTYPE,ILINK,ILINK,N_ONE) CALL MP_INIT_SEND() C send one file-pointer to refill the work-queue c if (ifirst_round .ne. 0) then CALL READ_FILENAME(KLIS2,FILENAME,LENDFILE,LERROR) IF ( (LENDFILE .EQV. .TRUE.) .OR. + ( LERROR .EQV. .TRUE.) ) THEN FILENAME='STOP' GOTO 500 ENDIF WRITE(*,'(A,I4)')FILENAME(1:50),ILINK CALL FLUSH_UNIT(6) MSGTYPE=3000 CALL MP_PUT_STRING(MSGTYPE,ILINK,FILENAME, + LEN(FILENAME)) CALL MP_SEND_DATA(MSGTYPE,LINK(ILINK)) IFILE=IFILE+1 C when we communicate the fist time, we fill the queue c else c endif c endif ENDDO c lfirst_scan=.false. C now tell everybody that the work is done 500 ISET=0 DO WHILE (ISET .LT. NWORKSET) MSGTYPE=2000 ILINK=-1 CALL MP_RECEIVE_DATA(MSGTYPE,ILINK) CALL MP_GET_INT4(MSGTYPE,ILINK,ILINK,N_ONE) IF (IDONE(ILINK) .EQ. 0) THEN WRITE(*,'(a,i4)')' last from: ',ilink CALL FLUSH_UNIT(6) ISET=ISET+1 IDONE(ILINK)=1 MSGTYPE=3000 FILENAME='STOP' CALL MP_INIT_SEND() CALL MP_PUT_STRING(MSGTYPE,ILINK,FILENAME, + LEN(FILENAME)) CALL MP_SEND_DATA(MSGTYPE,LINK(ILINK)) c else c write(*,'(a,i4)')' collect dead message: ',ilink c call flush_unit(6) ENDIF ENDDO ENDIF NALIGN=NALIGN+IALIGN WRITE(LOGSTRING,*)' host processed: ',nseq_warm_start, + IALIGN_GOOD CALL LOG_FILE(KLOG,LOGSTRING,1) RETURN END C END SEND_JOBS C....................................................................... C....................................................................... C SUB SETCONSERVATION SUBROUTINE SETCONSERVATION(METRIC_FILENAME) C 1. set conservation weights to 1.0 C 2. rescale matrix for the 22 amino residues such that the sum over C the matrix is 0.0 (or near) C this matrix is used to calculate the conservation weights (SIMCONSERV) c implicit none INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C import CHARACTER*(*) METRIC_FILENAME C internal INTEGER NACID PARAMETER (NACID=22) INTEGER I,J REAL XLOW,XHIGH,XMAX,XMIN,XFACTOR,SUMMAT C DO I=1,MAXSQ CONSWEIGHT_1(I)=1.0 ENDDO LFIRSTWEIGHT=.TRUE. C get metric CALL GETSIMMETRIC(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2, + NIOSTATES_2,CSTRSTATES,CIOSTATES, + IORANGE,KSIM,METRIC_FILENAME,SIMORG) c rescale matrix that the sum over matrix is +- 0.0 XLOW=0.0 XHIGH=0.0 XMAX=1.0 XMIN=-1.0 XFACTOR=100.0 C (re)store original values in simconserv() 20 DO J=1,NTRANS DO I=1,NTRANS SIMCONSERV(I,J)=SIMORG(I,J,1,1,1,1) ENDDO ENDDO c scale with xmin/xmax CALL SCALEINTERVAL(SIMCONSERV,NTRANS**2,XMIN,XMAX,XLOW,XHIGH) C RESEt the values for 'X' '!' and '-' I=INDEX(TRANS,'X') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO I=INDEX(TRANS,'!') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO I=INDEX(TRANS,'-') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO I=INDEX(TRANS,'.') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO C calculate sum over matrix (22 amino acids) after scaling SUMMAT=0.0 DO I=1,NACID DO J=1,NACID SUMMAT=SUMMAT+SIMCONSERV(I,J) ENDDO ENDDO cd write(*,*)' sum: ',summat,xmin c check sum=0.0 (+- 0.01) ; if not xmin=xmin/2 ; scale again IF (SUMMAT .GT. 0.01) THEN XMIN=XMIN+(XMIN/XFACTOR) ELSE IF (SUMMAT .LT. -0.01) THEN XMIN=XMIN-(XMIN/XFACTOR) ELSE WRITE(*,*)' SETCONSERVATION: sum over matrix: ',summat WRITE(*,*)' smin is : ',xmin c kdeb=45 c call open_file(kdeb,'DEBUG.X','NEW',lerror) c do i=1,ntrans c write(kdeb,'(a,26(f5.2))')trans(i:i), c + (simconserv(i,j),j=1,ntrans) c enddo c write(kdeb,*)'sum over matrix: ',summat c write(kdeb,*)'min,max: ',xmin,xmax c close(kdeb) RETURN ENDIF GOTO 20 END C END SETCONSERVATION C....................................................................... C....................................................................... C SUB SINGLE_SEQ_WEIGHTS SUBROUTINE SINGLE_SEQ_WEIGHTS(NALIGN,SEQBUFFER, + ISEQPOINTER,AL_IFIRST,AL_ILAST,MODE,WEIGHTS) c c input: hssp alignments c w0 -- eigenvalue iteration weights x(i) c w1 -- squared eigenvectors x(i)**2 c w2 -- sum of distances w(i)=SUM(dist(i,j)) c w3 -- exponential weight w(i)=1/SUM(exp(-dist(i,j)/dmean)) c IMPLICIT NONE C import INTEGER NALIGN INTEGER AL_IFIRST(*),AL_ILAST(*),ISEQPOINTER(*) CHARACTER SEQBUFFER(*) CHARACTER MODE*(*) C export REAL WEIGHTS(*) c INTEGER MAXALIGNS,MAXSTEP PARAMETER (MAXALIGNS=300) PARAMETER(MAXSTEP=100) REAL TOLERANCE PARAMETER(TOLERANCE=0.00001) REAL DIST(MAXALIGNS,MAXALIGNS) c real sim_table(maxaligns,maxaligns) c integer maxaa c real sel_press,xpower,xtemp1,xtemp2 REAL WTEMP(MAXALIGNS) c real vtemp(maxaligns,maxaligns) c INTEGER STEP CHARACTER A1,A2 INTEGER LENGTH,NPOS INTEGER I,J,K,K0,K1,KPOS REAL X,S,DMEAN c maxaa=19 I=LEN(MODE) CALL LOWTOUP(MODE,I) IF (NALIGN .GT. MAXALIGNS) THEN WRITE(*,*)' maxaligns overflow in single_seq_weight' STOP ENDIF DO I=1,NALIGN WEIGHTS(I)=1.0 ENDDO IF (NALIGN .LE. 1) THEN WRITE(*,*)' SINGLE_SEQ_WEIGHT: no alignments !' RETURN ENDIF C calculate distance/identity table WRITE(*,*)' calculate distance table...' DO I=1,NALIGN DIST(I,I)=0.0 c sim_table(i,i)=1.0 DO J=I+1,NALIGN LENGTH=0 NPOS=0 K0=MAX(AL_IFIRST(I),AL_IFIRST(J)) K1=MIN(AL_ILAST(I),AL_ILAST(J)) KPOS=ISEQPOINTER(I) + K0 - AL_IFIRST(I) DO K=K0,K1 NPOS=NPOS+1 A1= SEQBUFFER(KPOS) A2= SEQBUFFER(KPOS) KPOS=KPOS+1 IF (A1.EQ.A2) LENGTH=LENGTH+1 IF (A1 .GE. 'a' .OR. A2 .GE. 'a') THEN IF (A1 .GE. 'a' ) THEN A1=CHAR( ICHAR(A1)-32 ) ENDIF IF (A2 .GE. 'a' ) THEN A2=CHAR( ICHAR(A2)-32 ) ENDIF IF (A1.EQ.A2) LENGTH=LENGTH+1 ENDIF c IF (a1 .ge. 'a' .and. a1 .le. 'z') THEN c a1=char( ichar(a1)-32 ) c endif c IF (a2 .ge. 'a' .and. a2 .le. 'z') THEN c a2=char( ichar(a2)-32 ) c endif END DO DIST(I,J)= 1- (FLOAT(LENGTH)/MAX(1.0,FLOAT(NPOS)) ) c sim_table(i,j)=float(length)/max(1.0,float(npos)) c dist(i,j)= 1.00 - sim_table(i,j) DIST(J,I)=DIST(I,J) c sim_table(j,i)=sim_table(i,j) END DO END DO c write(*,*) ' distances: ' c do i=1,nalign c write(*,'(26i3)') (nint(100*dist(j,i)),j=1,nalign) c end do c IF (INDEX(MODE,'MAT'). NE. 0 ) THEN WRITE(*,*)' weight mode MAT NOT active ' STOP c write(*,*)' preparing identity matrix...' c sel_press=0.5 c xpower= 1.0 / (1.0 - sel_press + (1.0/maxaa) ) c xtemp1= 1.0 + (1.0 / (maxaa * (1.0 - sel_press) ) ) c xtemp2= 1.0 / (maxaa * (1-sel_press) ) c do i=1,nalign ; do j=1,nalign c sim_table(i,j) = ( sim_table(i,j) * xtemp1 - xtemp2 ) c IF (sim_table(i,j) .le. tolerance) THEN c write(*,*)'set sim_table to tolerance ',i,j c sim_table(i,j) = tolerance c endif c sim_table(i,j) = sim_table(i,j) **xpower c enddo ; enddo c write(*,*)' calculate singular value decomposition...' c call svdcmp(sim_table,nalign,nalign,maxaligns,maxaligns,wtemp, c + vtemp) c write(*,*)' calculate matrix invers...' c do i=1,nalign c if (wtemp(i) .le. 0.0001) THEN c x=0.0 c else c x= 1/wtemp(i) c endif c do j=1,nalign c sim_table(i,j) = vtemp(i,j) * x * sim_table(i,j) c weights(i) = weights(i) + sim_table(i,j) c enddo c enddo c======================================================================= c calculate one-sequence weights from a distance matrix c step 0: w(k) = 1 / N * sum[dist(k,length)] c step i: w(k)(i) = 1 / NORM * sum[dist(k,l) * w(length)(i-1)] c iterate until sum[|w(k)(i)-w(k)(i-1)|] < tolerance c======================================================================= c eigenvector iteration c======================================================================= ELSE IF (INDEX(MODE,'EIGEN') .NE. 0 .OR. + INDEX(MODE,'SQUARE') .NE. 0) THEN DO I=1,NALIGN WTEMP(I)=1.0/NALIGN END DO STEP=0 10 STEP=STEP+1 X=0.0 DO I=1,NALIGN WEIGHTS(I)=0.0 DO J=1,NALIGN WEIGHTS(I) = WEIGHTS(I) + WTEMP(J) * DIST(I,J) END DO X=X+WEIGHTS(I) END DO S=0.0 DO I=1,NALIGN S = S +(WTEMP(I)-WEIGHTS(I)/X) * (WTEMP(I)-WEIGHTS(I)/X) WTEMP(I)=WEIGHTS(I)/X END DO S=SQRT(S/NALIGN) IF ((STEP .LT. MAXSTEP) .AND. (S .GT. TOLERANCE)) GOTO 10 WRITE(*,'(A,I5,A,F10.4)')' WEIGHTS AT STEP:', STEP, + ' DIFFERENCE: ',S WRITE(*,'(13F6.3)') (NALIGN*WTEMP(I),I=1,NALIGN) ENDIF c======================================================================= c weights(i)=wtemp(i)**2 c======================================================================= IF (INDEX(MODE,'SQUARE') .NE. 0) THEN S=0.0 DO I=1,NALIGN WEIGHTS(I)=WTEMP(I) * WTEMP(I) S=S+WEIGHTS(I) END DO DO I=1,NALIGN WEIGHTS(I)=WEIGHTS(I)/S END DO WRITE(*,*) ' squared weights ' WRITE(*,'(13F6.3)') (NALIGN*WEIGHTS(I),I=1,NALIGN) c======================================================================= c weights(i)=SUM(dist(i,j)) c======================================================================= ELSE IF (INDEX(MODE,'SUM') .NE. 0) THEN S=0.0 DO I=1,NALIGN WEIGHTS(I)=0.0 DO J=1,NALIGN WEIGHTS(I)=WEIGHTS(I) + DIST(I,J) END DO S=S+WEIGHTS(I) END DO DO I=1,NALIGN WEIGHTS(I)=WEIGHTS(I)/S END DO WRITE(*,*) ' summed distance weights ' WRITE(*,'(13F6.3)') (NALIGN*WEIGHTS(I),I=1,NALIGN) c======================================================================= c weights(i)=1/SUM(exp(-dist(i,j)/dmean)) c======================================================================= ELSE IF (INDEX(MODE,'EXP') .NE. 0) THEN S=0.0 DO I=1,NALIGN DO J=I+1,NALIGN S=S+DIST(I,J) END DO END DO DMEAN=S/NALIGN/(NALIGN-1)*2 DO I=1,NALIGN S=0.0 DO J=1,NALIGN S=S+EXP(-DIST(I,J)/DMEAN) END DO IF (S.GT.0.0) THEN WEIGHTS(I)=1/S ELSE WRITE(*,*) ' warning: s=0 in weights ' WEIGHTS(I)=1.0 END IF END DO c normalize to 1.0 S=0.0 DO I=1,NALIGN S=S+WEIGHTS(I) END DO DO I=1,NALIGN WEIGHTS(I)=WEIGHTS(I)/S END DO WRITE(*,*) ' exponential distance weights ' WRITE(*,'(13F6.3)') (NALIGN*WEIGHTS(I),I=1,NALIGN) ENDIF RETURN END C end single_seq_weight C....................................................................... C....................................................................... C SUB SVDCMP SUBROUTINE SVDCMP(A,M,N,MP,NP,W,V) PARAMETER (NMAX=2000) DIMENSION A(MP,NP),W(NP),V(NP,NP),RV1(NMAX) L=0 nm=0 G=0.0 SCALE=0.0 ANORM=0.0 IF (m .gt. nmax) THEN write(*,*)'***ERROR: dim. overflow for RV1 in SVDCMP' STOP endif DO 25 I=1,N L=I+1 RV1(I)=SCALE*G G=0.0 S=0.0 SCALE=0.0 IF (I.LE.M) THEN DO 11 K=I,M SCALE=SCALE+ABS(A(K,I)) 11 CONTINUE IF (SCALE.NE.0.0) THEN DO 12 K=I,M A(K,I)=A(K,I)/SCALE S=S+A(K,I)*A(K,I) 12 CONTINUE F=A(I,I) G=-SIGN(SQRT(S),F) H=F*G-S A(I,I)=F-G IF (I.NE.N) THEN DO 15 J=L,N S=0.0 DO 13 K=I,M S=S+A(K,I)*A(K,J) 13 CONTINUE F=S/H DO 14 K=I,M A(K,J)=A(K,J)+F*A(K,I) 14 CONTINUE 15 CONTINUE ENDIF DO 16 K= I,M A(K,I)=SCALE*A(K,I) 16 CONTINUE ENDIF ENDIF W(I)=SCALE *G G=0.0 S=0.0 SCALE=0.0 IF ((I.LE.M).AND.(I.NE.N)) THEN DO 17 K=L,N SCALE=SCALE+ABS(A(I,K)) 17 CONTINUE IF (SCALE.NE.0.0) THEN DO 18 K=L,N A(I,K)=A(I,K)/SCALE S=S+A(I,K)*A(I,K) 18 CONTINUE F=A(I,L) G=-SIGN(SQRT(S),F) H=F*G-S A(I,L)=F-G DO 19 K=L,N RV1(K)=A(I,K)/H 19 CONTINUE IF (I.NE.M) THEN DO 23 J=L,M S=0.0 DO 21 K=L,N S=S+A(J,K)*A(I,K) 21 CONTINUE DO 22 K=L,N A(J,K)=A(J,K)+S*RV1(K) 22 CONTINUE 23 CONTINUE ENDIF DO 24 K=L,N A(I,K)=SCALE*A(I,K) 24 CONTINUE ENDIF ENDIF ANORM=MAX(ANORM,(ABS(W(I))+ABS(RV1(I)))) 25 CONTINUE DO 32 I=N,1,-1 IF (I.LT.N) THEN IF (G.NE.0.0) THEN DO 26 J=L,N V(J,I)=(A(I,J)/A(I,L))/G 26 CONTINUE DO 29 J=L,N S=0.0 DO 27 K=L,N S=S+A(I,K)*V(K,J) 27 CONTINUE DO 28 K=L,N V(K,J)=V(K,J)+S*V(K,I) 28 CONTINUE 29 CONTINUE ENDIF DO 31 J=L,N V(I,J)=0.0 V(J,I)=0.0 31 CONTINUE ENDIF V(I,I)=1.0 G=RV1(I) L=I 32 CONTINUE DO 39 I=N,1,-1 L=I+1 G=W(I) IF (I.LT.N) THEN DO 33 J=L,N A(I,J)=0.0 33 CONTINUE ENDIF IF (G.NE.0.0) THEN G=1.0/G IF (I.NE.N) THEN DO 36 J=L,N S=0.0 DO 34 K=L,M S=S+A(K,I)*A(K,J) 34 CONTINUE F=(S/A(I,I))*G DO 35 K=I,M A(K,J)=A(K,J)+F*A(K,I) 35 CONTINUE 36 CONTINUE ENDIF DO 37 J=I,M A(J,I)=A(J,I)*G 37 CONTINUE ELSE DO 38 J= I,M A(J,I)=0.0 38 CONTINUE ENDIF A(I,I)=A(I,I)+1.0 39 CONTINUE DO 49 K=N,1,-1 DO 48 ITS=1,30 DO 41 L=K,1,-1 NM=L-1 IF ((ABS(RV1(L))+ANORM).EQ.ANORM) GOTO 2 IF ((ABS(W(NM))+ANORM).EQ.ANORM) GOTO 1 41 CONTINUE 1 C=0.0 S=1.0 DO 43 I=L,K F=S*RV1(I) IF ((ABS(F)+ANORM).NE.ANORM) THEN G=W(I) H=SQRT(F*F+G*G) W(I)=H H=1.0/H C= (G*H) S=-(F*H) DO 42 J=1,M Y=A(J,NM) Z=A(J,I) A(J,NM)=(Y*C)+(Z*S) A(J,I)=-(Y*S)+(Z*C) 42 CONTINUE ENDIF 43 CONTINUE 2 Z=W(K) IF (L.EQ.K) THEN IF (Z.LT.0.0) THEN W(K)=-Z DO 44 J=1,N V(J,K)=-V(J,K) 44 CONTINUE ENDIF GOTO 3 ENDIF IF (ITS.EQ.30) PAUSE 'No convergence in 30 iterations' X=W(L) NM=K-1 Y=W(NM) G=RV1(NM) H=RV1(K) F=((Y-Z)*(Y+Z)+(G-H)*(G+H))/(2.0*H*Y) G=SQRT(F*F+1.0) F=((X-Z)*(X+Z)+H*((Y/(F+SIGN(G,F)))-H))/X C=1.0 S=1.0 DO 47 J=L,NM I=J+1 G=RV1(I) Y=W(I) H=S*G G=C*G Z=SQRT(F*F+H*H) RV1(J)=Z C=F/Z S=H/Z F= (X*C)+(G*S) G=-(X*S)+(G*C) H=Y*S Y=Y*C DO 45 NM=1,N X=V(NM,J) Z=V(NM,I) V(NM,J)= (X*C)+(Z*S) V(NM,I)=-(X*S)+(Z*C) 45 CONTINUE Z=SQRT(F*F+H*H) W(J)=Z IF (Z.NE.0.0) THEN Z=1.0/Z C=F*Z S=H*Z ENDIF F= (C*G)+(S*Y) X=-(S*G)+(C*Y) DO 46 NM=1,M Y=A(NM,J) Z=A(NM,I) A(NM,J)= (Y*C)+(Z*S) A(NM,I)=-(Y*S)+(Z*C) 46 CONTINUE 47 CONTINUE RV1(L)=0.0 RV1(K)=F W(K)=X 48 CONTINUE 3 CONTINUE 49 CONTINUE RETURN END C END SVDCMP C....................................................................... C....................................................................... C SUB WRITE_HISTO SUBROUTINE WRITE_HISTO(KHISTO,HISTOFILE,NALIGN,SORTVAL) c implicit none C import INTEGER KHISTO,NALIGN REAL SORTVAL(*) CHARACTER*(*) HISTOFILE C internal c integer maxbin,maxlen c parameter (maxbin=100,maxlen=80) c integer i,ibin,nbin(maxbin),maxpop,minpop,iend c character line*(maxlen),mark LOGICAL LERROR c mark='*' c do i=1,maxlen ; line(i:i)=mark ; enddo c do i=1,maxbin ; nbin(i)=0 ; enddo CALL OPEN_FILE(KHISTO,HISTOFILE,'NEW',LERROR) c do i=1,nalign c ibin=nint( sortval(i) / sortval(nalign) * maxbin) c IF (ibin .le. 0) THEN c ibin=1 c else IF (ibin .gt. maxbin) THEN c ibin=maxbin c endif c nbin(ibin)=nbin(ibin)+1 c enddo c c maxpop=-1 ; minpop=1000000 c do i=1,maxbin c IF (nbin(i) .gt. maxpop)maxpop=nbin(i) c IF (nbin(i) .lt. minpop)minpop=nbin(i) c enddo WRITE(KHISTO,'(A,I6)') ' number of scores: ',nalign WRITE(KHISTO,'(A,F7.2)')' minimal scores: ',sortval(1) WRITE(KHISTO,'(A,F7.2)')' maximum scores: ',sortval(nalign) WRITE(KHISTO,'(A)')'_________________________________________'// + '_____________________________________________' c do i=1,maxbin c iend=nint( (float(nbin(i)) / float(maxpop)) * float(maxlen) ) c IF (iend .gt.0 ) THEN c write(khisto,'(i5,a,a)')nbin(i),' |',line(1:iend) c else c write(khisto,'(i5,a)')nbin(i),' |' c endif c enddo c write(khisto,*) WRITE(KHISTO,'(A)')' values: ' DO I=1,NALIGN WRITE(KHISTO,'(I5,2X,F7.2)')I,SORTVAL(I) ENDDO CLOSE(KHISTO) RETURN END C END WRITE_HISTO C....................................................................... C....................................................................... C SUB WRITE_MAXHOM_COM SUBROUTINE WRITE_MAXHOM_COM(CFILTER) c implicit none INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' CHARACTER*(*) CFILTER c internal INTEGER OPTCUT,LENFILENAME,IBEG,IEND,I,J,JBEG,JEND CHARACTER COMMANDFILE*80,COMEXT*4,OUTLINE*80,LINE*35 CHARACTER COMMENTLINE*80 LOGICAL LERROR C init OUTLINE=' ' OPTCUT=110 COMMENTLINE='$!==========================================='// + '============================' COMEXT='.csh' LENFILENAME=INDEX(NAME1_ANSWER,'!')-1 IF (LENFILENAME .LE. 0)LENFILENAME=LEN(NAME1_ANSWER) CALL GETPIDCODE(NAME1_ANSWER(1:LENFILENAME),HSSPID_1) CALL LOWTOUP(COMMANDFILE_ANSWER,80) CALL STRPOS(HSSPID_1,IBEG,IEND) IF (CFILTER .EQ. ' ') THEN COMMANDFILE(1:)=HSSPID_1(IBEG:IEND)//'_maxhom'//comext ELSE COMMANDFILE(1:)=HSSPID_1(IBEG:IEND)//'_hssp'//comext ENDIF CALL OPEN_FILE(KCOM,COMMANDFILE,'NEW',LERROR) C======================================================================= C UNIX c-shell script C======================================================================= COMMENTLINE='#==========================================='// + '============================' WRITE(KCOM,'(A)')'#! /bin/csh' WRITE(KCOM,'(A)')COMMENTLINE IF (CFILTER .EQ. ' ') THEN WRITE(KCOM,'(A)')'# command file to run MAXHOM' ELSE WRITE(KCOM,'(A)')'# command file to run a PRE-FILTERED MAXHOM' ENDIF WRITE(KCOM,'(A)')'goto set_enviroment' WRITE(KCOM,'(A)')'start:' WRITE(KCOM,'(A)')COMMENTLINE WRITE(KCOM,'(A)')'# This .csh file writes a temporary '// + 'command file ("MAXHOM_"process_id".temp")' WRITE(KCOM,'(A)')'# containing the answers to the MAXHOM '// + 'questions.' WRITE(KCOM,'(A)')COMMENTLINE C=================================================================== IF (CFILTER .NE.' ') THEN C get sequence 1 CALL STRPOS(NAME1_ANSWER,I,J) IF (LISTOFSEQ_1) THEN WRITE(KCOM,'(A)')'# LOOP OVER FILENAMES IN LIST' WRITE(KCOM,'(A)') + 'foreach filename ( "`cat '//NAME1_ANSWER(I:J)//'`" )' ELSE WRITE(KCOM,'(A)')'set filename = '//NAME1_ANSWER(I:J) ENDIF C get identifier WRITE(KCOM,'(A)')COMMENTLINE WRITE(KCOM,'(A)')'# GET IDENTIFIER' WRITE(KCOM,'(A)')COMMENTLINE WRITE(KCOM,'(A)')' set name1 = $filename:r' WRITE(KCOM,'(A)')' set name2 = $name1:t' C convertseq WRITE(KCOM,'(A)')COMMENTLINE WRITE(KCOM,'(A)')'# CONVERT ALL FORMATS TO FASTA' WRITE(KCOM,'(A)')' echo $filename > MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "F" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "N" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo $name2".y" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "N" >> MAXHOM_$$.temp' IF (CONVERTSEQ_EXE .NE. ' ') THEN CALL STRPOS(CONVERTSEQ_EXE,IBEG,IEND) WRITE(KCOM,'(A)')' echo "run convert_seq"' WRITE(KCOM,'(A,A,A)') + ' ',CONVERTSEQ_EXE(IBEG:IEND), + ' < MAXHOM_$$.temp >& /dev/null' ELSE STOP ' ERROR: CONVERTSEQ_EXE UNDEFINED ' ENDIF WRITE(KCOM,'(A)')' rm MAXHOM_$$.temp' C run FASTA IF (CFILTER .EQ. 'FASTA') THEN WRITE(KCOM,'(A)')' echo $name2".y" > MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "S" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "1" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "fasta.x_"$$ >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "2000" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo " " >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "yes" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo " " >> MAXHOM_$$.temp' IF (FASTA_EXE .NE. ' ') THEN CALL STRPOS(FASTA_EXE,IBEG,IEND) WRITE(KCOM,'(A,A,A)')' ',FASTA_EXE(IBEG:IEND), + ' -b 2000 -d 2000 -o < MAXHOM_$$.temp > fasta.x_$$' ELSE STOP ' ERROR: FASTA_EXE UNDEFINED ' ENDIF WRITE(KCOM,'(A)')' rm MAXHOM_$$.temp' WRITE(KCOM,'(A)')' rm $name2".y"' C get filter.list WRITE(KCOM,'(A)')COMMENTLINE WRITE(KCOM,'(A)')'# EXTRACT POSSIBL HITS FROM FASTA-OUTPUT' WRITE(KCOM,'(A)')' echo "fasta.x_"$$ > MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "filter.list_"$$ >> MAXHOM_$$.temp' I=ISAFE-5 IF (I.GT.0) THEN WRITE(OUTLINE,'(A,I2)')'FORMULA+',I ELSE IF (I.EQ.0) THEN WRITE(OUTLINE,'(A)')'FORMULA' ELSE WRITE(OUTLINE,'(A,I2)')'FORMULA-',ABS(I) ENDIF CALL STRPOS(OUTLINE,I,J) WRITE(KCOM,'(A)')' echo "'//OUTLINE(I:J)// + '" >> MAXHOM_$$.temp' WRITE(KCOM,'(A,I5,A)')' echo "',OPTCUT, + '" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "distance" >> MAXHOM_$$.temp' IF (FILTER_FASTA_EXE .NE. ' ') THEN CALL STRPOS(FILTER_FASTA_EXE,IBEG,IEND) WRITE(KCOM,'(A,A,A)')' ',FILTER_FASTA_EXE(IBEG:IEND), + ' < MAXHOM_$$.temp' ELSE STOP ' ERROR: FILTER_FASTA_EXE UNDEFINED ' ENDIF WRITE(KCOM,'(A)')' rm MAXHOM_$$.temp' WRITE(KCOM,'(A)')' rm fasta.x_$$' C rename output files if wanted IF (STRIPFILE_ANSWER.NE.'NO') THEN STRIPFILE_ANSWER='$name2"_strip.x"' ENDIF IF (long_output_ANSWER.NE.'NO') THEN LONG_OUTPUT_ANSWER='$name2"_long.x"' ENDIF IF (PLOTFILE_ANSWER.NE.'NO') THEN PLOTFILE_ANSWER='$name2"_trace.x"' ENDIF C run BLASTP ELSE IF (CFILTER .EQ. 'BLASTP') THEN IF (BLASTP_EXE .NE. ' ') THEN CALL STRPOS(BLASTP_EXE,IBEG,IEND) WRITE(KCOM,'(A)')' echo "run blastp"' WRITE(KCOM,'(A,A,A)')' ',BLASTP_EXE(IBEG:IEND), + ' swiss $name2".y" B=2000 > blast.x_$$' ELSE STOP ' ERROR: BLASTP_EXE UNDEFINED ' ENDIF WRITE(KCOM,'(A)')' rm MAXHOM_$$.temp' WRITE(KCOM,'(A)')' rm $name2".y"' write(kcom,'(a)')commentline WRITE(KCOM,'(A)')'# EXTRACT HITS FROM BLASTP-OUTPUT' IF (FILTER_BLASTP_EXE .NE. ' ') THEN CALL STRPOS(FILTER_BLASTP_EXE,IBEG,IEND) CALL STRPOS(sw_current,jBEG,jEND) WRITE(KCOM,'(A,A,A,A,A)')' ', + FILTER_BLASTP_EXE(IBEG:IEND),' ', + sw_current(jbeg:jend), + ' < blast.x_$$ > filter.list_$$' write(kcom,'(a)')' rm blast.x_$$' ELSE STOP ' ERROR: FILTER_BLASTP_EXE UNDEFINED ' ENDIF ENDIF ENDIF C call MAXHOM LINE='" >> MAXHOM_$$.temp ' write(kcom,'(a)')commentline WRITE(KCOM,'(A)')'# -------- finally call MAXHOM -------' write(kcom,'(a)')commentline WRITE(KCOM,'(A)')' echo "COMMAND NO" > MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "BATCH" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "PID: "$$ >> MAXHOM_$$.temp' IF (CFILTER .NE. ' ') THEN WRITE(KCOM,'(A)')' echo "SEQ_1 "$filename >> MAXHOM_$$.temp' WRITE(KCOM,'(A)') + ' echo "SEQ_2 filter.list_"$$ >> MAXHOM_$$.temp' ELSE CALL STRPOS(NAME1_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "SEQ_1 '//NAME1_ANSWER(I:J)//LINE CALL STRPOS(NAME2_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "SEQ_2 '//NAME2_ANSWER(I:J)//LINE ENDIF CALL STRPOS(PROFILE_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "2_PROFILES '// + PROFILE_ANSWER(I:J)//LINE CALL STRPOS(METRIC_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "METRIC '//METRIC_ANSWER(I:J)//LINE CALL STRPOS(NORM_PROFILE_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "NORM_PROFILE '// + NORM_PROFILE_ANSWER(I:J)//LINE CALL STRPOS(PROFILE_EPSILON_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "MEAN_PROFILE '// + PROFILE_EPSILON_ANSWER(I:J)//LINE CALL STRPOS(PROFILE_GAMMA_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "FACTOR_GAPS '// + PROFILE_GAMMA_ANSWER(I:J)//LINE CALL STRPOS(SMIN_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "SMIN '//SMIN_ANSWER(I:J)//LINE CALL STRPOS(SMAX_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "SMAX '//SMAX_ANSWER(I:J)//LINE CALL STRPOS(OPENWEIGHT_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "GAP_OPEN '// + OPENWEIGHT_ANSWER(I:J)//LINE CALL STRPOS(ELONGWEIGHT_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "GAP_ELONG '// + ELONGWEIGHT_ANSWER(I:J)//LINE CALL STRPOS(WEIGHT1_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "WEIGHT1 '//WEIGHT1_ANSWER(I:J)//LINE CALL STRPOS(WEIGHT2_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "WEIGHT2 '//WEIGHT2_ANSWER(I:J)//LINE CALL STRPOS(WAY3_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "WAY3_ALIGN '// + WAY3_ANSWER(I:J)//LINE CALL STRPOS(INDEL_ANSWER_1,I,J) WRITE(KCOM,'(A)')' echo "INDEL_1 '// + INDEL_ANSWER_1(I:J)//LINE CALL STRPOS(INDEL_ANSWER_2,I,J) WRITE(KCOM,'(A)')' echo "INDEL_2 '// + INDEL_ANSWER_2(I:J)//LINE CALL STRPOS(BACKWARD_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "RELIABILITY '// + BACKWARD_ANSWER(I:J)//LINE CALL STRPOS(FILTER_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "FILTER_RANGE '// + FILTER_ANSWER(I:J)//LINE CALL STRPOS(NBEST_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "NBEST '//NBEST_ANSWER(I:J)//LINE CALL STRPOS(NGLOBALHITS_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "MAXALIGN '// + NGLOBALHITS_ANSWER(I:J)//LINE CALL STRPOS(THRESHOLD_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "THRESHOLD '// + THRESHOLD_ANSWER(I:J)//LINE CALL STRPOS(SORTMODE_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "SORT '//SORTMODE_ANSWER(I:J)//LINE CALL STRPOS(HSSP_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "HSSP '//HSSP_ANSWER(I:J)//LINE CALL STRPOS(SAMESEQ_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "SAME_SEQ_SHOW '// + SAMESEQ_ANSWER(I:J)//LINE CALL STRPOS(COMPARE_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "SUPERPOS '//COMPARE_ANSWER(I:J)//LINE CALL STRPOS(PDBPATH_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "PDB_PATH '//PDBPATH_ANSWER(I:J)//LINE CALL STRPOS(PROFILEOUT_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "PROFILE_OUT '// + PROFILEOUT_ANSWER(I:J)//LINE CALL STRPOS(STRIPFILE_ANSWER,I,J) IF (INDEX(STRIPFILE_ANSWER,'$name2').NE.0) THEN WRITE(KCOM,'(A)')' echo "STRIP_OUT "'// + STRIPFILE_ANSWER(I:J)// + ' >> MAXHOM_$$.temp' ELSE WRITE(KCOM,'(A)')' echo "STRIP_OUT '// + STRIPFILE_ANSWER(I:J)//LINE ENDIF CALL STRPOS(long_output_ANSWER,I,J) IF (INDEX(long_output_ANSWER,'$name2').ne.0) THEN WRITE(KCOM,'(A)')' echo "LONG_OUT "'// + long_output_ANSWER(I:J)// + ' >> MAXHOM_$$.temp' ELSE WRITE(KCOM,'(A)')' echo "LONG_OUT '// + long_output_ANSWER(I:J)//LINE ENDIF CALL STRPOS(PLOTFILE_ANSWER,I,J) IF (INDEX(PLOTFILE_ANSWER,'$name2').ne.0) THEN WRITE(KCOM,'(A)')' echo "DOT_PLOT "'//PLOTFILE_ANSWER(I:J)// + ' >> MAXHOM_$$.temp' ELSE WRITE(KCOM,'(A)')' echo "DOT_PLOT '// + PLOTFILE_ANSWER(I:J)//LINE WRITE(KCOM,'(A)')' echo "RUN" >> MAXHOM_$$.temp' ENDIF WRITE(KCOM,'(A)')' maxhom -nopar < MAXHOM_$$.temp' CALT WRITE(KCOM,'(A)')' $snice maxhom < MAXHOM_$$.temp' WRITE(KCOM,'(A)')' rm MAXHOM_$$.temp' CALL STRPOS(COREFILE,IBEG,IEND) WRITE(KCOM,'(A,A,A)')' rm ',COREFILE(IBEG:IEND),'$$' WRITE(KCOM,'(A)')' rm filter.list_$$' IF (CFILTER .NE. ' ' .AND. LISTOFSEQ_1) THEN WRITE(KCOM,'(A)')'end' ENDIF WRITE(KCOM,'(A)')'exit' WRITE(KCOM,'(A)')COMMENTLINE WRITE(KCOM,'(A)')'set_enviroment:' WRITE(KCOM,'(A)')'nohup' WRITE(KCOM,'(A)')'alias rm ''rm -f''' WRITE(KCOM,'(A)')'goto start' WRITE(KCOM,'(A)')COMMENTLINE CLOSE(KCOM) C====================================================================== WRITE(*,*)'****************************************************' CALL STRPOS(COMMANDFILE,IBEG,IEND) WRITE(*,*)' wrote command file to: ',commandfile(ibeg:iend) IF (CMACHINE .EQ. 'VMS' ) THEN WRITE(*,*)'now submit this command file to a batch queue' ELSE CALL CHANGE_MODE(COMMANDFILE,'+x',i) WRITE(*,*)'to execute this file type: ' IF (I .NE. 0) THEN WRITE(*,'(A,A)')'chmod +x ',COMMANDFILE(IBEG:IEND) ENDIF WRITE(*,'(2X,A,A)')COMMANDFILE(IBEG:IEND),' > /dev/null &' ENDIF WRITE(*,*)'****************************************************' RETURN END C END WRITE_MAXHOM_COM C....................................................................... profphd-utils-1.0.10/filter_hssp.f0000755015075101507510000016615012012371464016374 0ustar lkajanlkajan*----------------------------------------------------------------------* * * * FORTRAN code for program FILTER_HSSP * * filters an HSSP file * * * *----------------------------------------------------------------------* * * * Author: * * * * Reinhard Schneider Mar, 1990 version 1.0 * * -> Mar, 1997 version 2.0 * * LION http://www.lion-ag/ * * D-69120 Heidelberg schneider@lion-ag.de * * * * * * Changes: * * * * Burkhard Rost May, 1998 version 2.1 * * Oct, 1998 version 2.2 * * * * EMBL/LION http://www.embl-heidelberg.de/~rost/ * * D-69012 Heidelberg rost@embl-heidelberg.de * * * *----------------------------------------------------------------------* * * * General note: - uses library lib-maxhom.f * * * *----------------------------------------------------------------------* * * * Description (from RS): * * * C======================================================================= C C ------------------------------ C INSTALLATION: C ------------------------------ C C 1. change the character variable "metricfile" to your enviroment path C 2. compile it C C======================================================================= C INCREASE THE NUMBER OF FOLLOWING THREE PARAMETER IF NECESSARY C======================================================================= C C maxaligns = maximal number of alignments in a HSSP-file C maxres = maximal number of residues in a PDB-protein C maxcore = maximal space for storing the alignments C maxins = maximal number of insertions in alignend sequences C maxinsbuffer = total size for buffer of all insertions C C======================================================================= C C ------------------------------ C Explanation of variables: C ------------------------------ C C maxaa= 20 amino acids C pdbid= Brookhaven Data Bank identifier C header,compound,source,author= informations about the PDB-protein C pdbseq= amino acid sequence of the PDB-protein C chainid= chain identifier (chain A etc.) C secstr= DSSP secondary structure summary C bp1,bp2= beta-bridge partner C cols= DSSP hydrogen bonding patterns for turns and helices, C geometrical bend, chirality, one character name of beta-ladder C and of beta-sheet C sheetlabel= chain identifier of beta bridge partner C seqlength= number of amino acids in the PDB-protein C pdbno= residue number as in PDB file C nchain= number of different chains in pdbid.DSSP data set C kchain= number of chains used in HSSP data set C nalign= number of alignments C acc= solvated residue surface area in A**2 C emblid= EMBL/SWISSPROT identifier of the alignend protein C strid= if the 3-D structure of this protein is known, then strid C (structure ID) is the Protein Data Bank identifier as taken C from the EMBL/SWISSPROT entry C protname= one line description of alignend protein C aliseq= sequential storage for the alignments C alipointer= points to the beginning of alignment X ( 1>= X <=nalign ) C ifir,ilas= first and last position of the alignment in the test C protein C jfir,jlas= first and last position of the alignment in the alignend C protein C lali= length of the alignment excluding insertions and deletions C ngap= number of insertions and deletions in the alignment C lgap= total length of all insertions and deletions C lenseq= length of the entire sequence of the alignend protein C ide= percentage of residue identity of the alignment C var= sequence variability as derived from the nalign alignments C seqprof= relative frequency for each of the 20 amino acids C nocc= number of alignend sequences spanning this position (including C the test sequence C ndel= number of sequences with a deletion in the test protein at this C position C nins= number of sequences with an insertion in the test protein at C this position C entropy= entropy measure of sequence variability at this position C relent= relative entropy (entropy normalized to the range 0-100) C====================================================================== C C ------------------------------ C CHANGES: C ------------------------------ C IDEMAX_ANSWER = maximal pairwise sequence identity to not get C excluded *----------------------------------------------------------------------* PROGRAM FILTER_HSSP IMPLICIT NONE C---- C---- overall memory limiting parameters C---- INTEGER MAXALIGNS,MAXRES,MAXCORE,MAXAA,MAXINS INTEGER MAXINSBUFFER PARAMETER (MAXALIGNS= 8765) C PARAMETER (MAXALIGNS= 19999) PARAMETER (MAXRES= 9999) C PARAMETER (MAXRES= 10000) PARAMETER (MAXINS= 50000) PARAMETER (MAXINSBUFFER= 4321432) C---- lkajan: 16M to comfortably store all the characters of an alignment C---- of size SEQLENGTH 3685 and NALIGN 2794 PARAMETER (MAXCORE= 16777216) PARAMETER (MAXAA= 20) C---- C---- other parameters C---- INTEGER MAXSTEP C---- used for variability INTEGER NTRANS,MAXSTRSTATES,MAXIOSTATES REAL SMIN,SMAX PARAMETER (MAXSTEP= 100) PARAMETER (NTRANS= 26) PARAMETER (MAXSTRSTATES= 3) PARAMETER (MAXIOSTATES= 4) PARAMETER (SMIN= 0.0) PARAMETER (SMAX= 1.0) C only used to get rid of INDEX command (CPU time) INTEGER NASCII PARAMETER (NASCII= 256) C files INTEGER KIN,KOUT,KMAT,KISO PARAMETER (KIN= 10) PARAMETER (KOUT= 11) PARAMETER (KMAT= 12) PARAMETER (KISO= 14) C---- br 2003-08: switch for new (1999) HSSP curve LOGICAL LNEWCURVE PARAMETER (LNEWCURVE= .TRUE.) C PARAMETER (LNEWCURVE= .FALSE.) C---- C---- attributes of sequence with known structure C---- CHARACTER*40 PDBID,HEADER CHARACTER*80 COMPOUND,SOURCE,AUTHOR,CHAINREMARK CHARACTER PDBSEQ(MAXRES),CHAINID(MAXRES),SECSTR(MAXRES), + COLS(MAXRES)*7,SHEETLABEL(MAXRES) INTEGER SEQLENGTH,PDBNO(MAXRES),NCHAIN,KCHAIN,NALIGN, + NALIGN_FILTER,BP1(MAXRES),BP2(MAXRES),ACC(MAXRES) C---- C---- attributes of aligned sequences C---- CHARACTER*40 EMBLID(MAXALIGNS) CHARACTER*6 STRID(MAXALIGNS) CHARACTER*10 ACCNUM(MAXALIGNS) CHARACTER*60 PROTNAME(MAXALIGNS) CHARACTER ALISEQ(MAXCORE),EXCLUDEFLAG(MAXALIGNS) INTEGER ALIPOINTER(MAXALIGNS), + IFIR(MAXALIGNS),ILAS(MAXALIGNS),JFIR(MAXALIGNS), + JLAS(MAXALIGNS),LALI(MAXALIGNS),NGAP(MAXALIGNS), + LGAP(MAXALIGNS),LENSEQ(MAXALIGNS), + INSNUMBER,INSALI(MAXINS),INSPOINTER(MAXINS), + INSLEN(MAXINS),INSBEG_1(MAXINS),INSBEG_2(MAXINS) C---- C---- buffer to store insertions of alignments C---- CHARACTER INSBUFFER(MAXINSBUFFER) REAL IDE(MAXALIGNS),SIM(MAXALIGNS) C---- C---- attributes of profile C---- INTEGER VAR(MAXRES),SEQPROF(MAXRES,MAXAA),NOCC(MAXRES), + NDEL(MAXRES),NINS(MAXRES),RELENT(MAXRES) REAL ENTROPY(MAXRES),CONSWEIGHT(MAXRES),CONSWEIGHT_MIN C---- C---- miscellaneous C---- LOGICAL LERROR,LCONSERV,LHSSP_LONG_ID,LIDE_100,LSAME CHARACTER*9 CDATE C---- C---- threshold C---- LOGICAL LCONSIDER,LFORMULA,LALL INTEGER ISOLEN(MAXSTEP),NSTEP,ISAFE REAL ISOIDE(MAXSTEP),IDEMAX CHARACTER*80 IDEMAX_ANSWER C---- C---- value of best match C---- INTEGER NSTRSTATES,NIOSTATES C---- C---- comparison metric C---- REAL MATRIX(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) C---- C---- normalised to sum over 0 (by SETCONSERVATION) C---- REAL SIMCONSERV(NTRANS,NTRANS) REAL IORANGE(MAXSTRSTATES,MAXIOSTATES) CHARACTER CSTRSTATES*(MAXSTRSTATES),CIOSTATES*(MAXIOSTATES) C---- C---- C...... CHARACTER*80 HSSPLINE,DATABASE CHARACTER*132 CPARAMETER(10),LINE CHARACTER TRANS*(NTRANS) C..... CHARACTER*132 HSSPFILE,OUTFILE,METRICFILE,OUTFILE_TMP, + DISTANCE_TABLE,DISTANCE_TABLE_TMP CHARACTER*80 THRESHOLD,EXCLUDE_IDENTICAL,TEMPNAME CHARACTER PID*40,LOWER*26 C---- C---- only used to get rid of INDEX command (CPU time) C---- INTEGER LOWERPOS(NASCII) INTEGER MATPOS(NASCII) C---- C---- internal C---- INTEGER ISTART,ISTOP,I,J,ILINE,NPARALINE,IALIGN,JALIGN, + NRES,ILEN REAL DISTANCE CHARACTER*80 CTEMP LOGICAL LDIST_TABLE C---- end of variables C---- ------------------------------------------------------------------ C order of amino acids in HSSP sequence profile TRANS= 'VLIMFWYGAPSTCHRKQENDBZX!-.' C used to convert lower case char from DSSP-seq to 'C' (Cys) LOWER= 'abcdefghijklmnopqrstuvwxyz' CALL GETPOS(LOWER,LOWERPOS,NASCII) INSNUMBER=0 DO I=1,MAXINS INSPOINTER(I)=0 INSLEN(I)= 0 INSBEG_1(I)= 0 INSBEG_2(I)= 0 ENDDO C---- ------------------------------------------------------------------ C defaults: CHANGE TO LOCAL ENVIROMENT C---- ------------------------------------------------------------------ METRICFILE= '/home/rost/pub/max/mat/Maxhom_GCG.metric' THRESHOLD= 'formula+5' IDEMAX_ANSWER= '1.00' HSSPFILE= ' ' LFORMULA= .FALSE. LALL= .FALSE. DATABASE= ' ' HSSPLINE= ' ' EXCLUDE_IDENTICAL='NO' DISTANCE_TABLE= 'NO' LIDE_100= .FALSE. LINE= ' ' LDIST_TABLE= .FALSE. LHSSP_LONG_ID= .FALSE. CONSWEIGHT_MIN= 0.01 C---- end of settings C---- ------------------------------------------------------------------ C---- ------------------------------------------------------------------ C---- blabla for user C---- ------------------------------------------------------------------ WRITE(*,*)' ' WRITE(*,*)'**************************** HSSP-FILE FILTER ***'// + '*****************************' WRITE(*,*)' R. Schneider, 1990 and later, EMBL-Heidelberg' WRITE(*,*)'**************************************************'// + '****************************' WRITE(*,*)' ' WRITE(*,*)' If you want to exclude an alignment independent'// + ' from the threshold, mark the' WRITE(*,*)' alignment in your "personal" copy of the HSSP-'// + 'file in the following way:' WRITE(*,*)' ' WRITE(*,*)' type a character (non blank, overstrike mode)'// + ' after the alignment number' WRITE(*,*)' in the PROTEINS-block, like the "*" in:' WRITE(*,*)' ' WRITE(*,*)' 15*: REV1_YEAST' WRITE(*,*)' ^' C---- ------------------------------------------------------------------ C---- prompt for input C---- ------------------------------------------------------------------ C---- query name of input file CALL GETCHAR(132,HSSPFILE, ' HSSP input file ?') CALL GETPIDCODE(HSSPFILE,PID) CALL STRPOS(PID,ISTART,ISTOP) C---- query name of output file OUTFILE=PID(1:ISTOP)//'.hssp_fil' CALL GETCHAR(132,OUTFILE_TMP, ' HSSP output file ?') CALL STRPOS(OUTFILE_TMP,ISTART,ISTOP) OUTFILE(1:)=OUTFILE_TMP(ISTART:ISTOP) C---- query metric used to convert identity to similarity CALL GETCHAR(132,METRICFILE,' metric file (HSSP-VARIABILITY) ?') C---- query cut-off threshold IF (LNEWCURVE) THEN CALL GETCHAR(80,THRESHOLD, ' BR 1999 threshold ? /n '// + ' /n '// + ' formula+x : formula value plus x percent /n '// + ' ALL : no threshold or /n '// + ' "file name" : with threshold specification ') ELSE CALL GETCHAR(80,THRESHOLD, ' RS 1989 threshold ? /n '// + ' /n '// + ' formula+x : formula value plus x percent /n '// + ' ALL : no threshold or /n '// + ' "file name" : with threshold specification ') ENDIF C---- maximal pairwise sequence identity CALL GETCHAR(80,IDEMAX_ANSWER,' exclude too similar pairs?/n'// + ' positive real number < 1 (like: 0.8)') CALL LOWTOUP(IDEMAX_ANSWER,80) CALL STRPOS(IDEMAX_ANSWER,ISTART,ISTOP) CALL READ_REAL_FROM_STRING(IDEMAX_ANSWER(ISTART:ISTOP),IDEMAX) C---- end of new stuff C---- query 'you want to clean up identical pairs?' CALL GETCHAR(80,EXCLUDE_IDENTICAL,' clean up 100% '// + 'identical pairs ?') C---- query name of distance table (or NO for no table written) CALL GETCHAR(132,DISTANCE_TABLE,' write distance table ?/n'// + ' NO : default no file written/n'// + ' YES : default file = PDBid_distance.table/n'// + ' FILE_NAME : written into this file written/n') DISTANCE_TABLE_TMP=DISTANCE_TABLE CALL LOWTOUP(DISTANCE_TABLE_TMP,80) IF (INDEX(DISTANCE_TABLE_TMP,'NO') .NE. 0) THEN LDIST_TABLE=.FALSE. ELSE IF (INDEX(DISTANCE_TABLE_TMP,'YES') .NE. 0) THEN LDIST_TABLE=.TRUE. DISTANCE_TABLE="DEFAULT" ELSE LDIST_TABLE=.TRUE. ENDIF TEMPNAME(1:)=THRESHOLD CALL LOWTOUP(TEMPNAME,80) IF (INDEX(TEMPNAME,'FORMULA') .NE. 0) THEN LFORMULA=.TRUE. I=INDEX(THRESHOLD,'+') J=INDEX(THRESHOLD,'-') C---- extract 'safe' range IF (I.NE.0) THEN CALL STRPOS(THRESHOLD,ISTART,ISTOP) READ(THRESHOLD(I+1:ISTOP),'(I2)')ISAFE WRITE(*,'(A,I2,A)')' use formula value +',isafe,' %' ELSE IF (J.NE.0) THEN CALL STRPOS(THRESHOLD,ISTART,ISTOP) READ(THRESHOLD(J:ISTOP),'(I2)')ISAFE WRITE(*,'(A,I2,A)')' use formula value ',isafe,' %' ELSE ISAFE=0 ENDIF ELSE IF (INDEX(TEMPNAME,'ALL') .NE. 0) THEN LALL=.TRUE. ELSE C---- read threshold from file (for details look in routine gethsspcut) CALL GETHSSPCUT(KISO,MAXSTEP,THRESHOLD,ISOLEN,ISOIDE,NSTEP) ENDIF WRITE(*,*)'=================================================='// + '============================' CALL LOWTOUP(EXCLUDE_IDENTICAL,80) IF ( INDEX(EXCLUDE_IDENTICAL,'Y') .NE. 0) LIDE_100=.TRUE. C---- ------------------------------------------------------------------ C---- read HSSP-file C---- ------------------------------------------------------------------ CALL READHSSP(KIN,HSSPFILE,LERROR, + MAXRES,MAXALIGNS,MAXCORE,MAXINS,MAXINSBUFFER, + PDBID,HEADER,COMPOUND,SOURCE,AUTHOR,SEQLENGTH, + NCHAIN,KCHAIN,CHAINREMARK,NALIGN, + EXCLUDEFLAG,EMBLID,STRID,IDE,SIM,IFIR,ILAS, + JFIR,JLAS,LALI,NGAP,LGAP,LENSEQ,ACCNUM,PROTNAME, + PDBNO,PDBSEQ,CHAINID,SECSTR,COLS,SHEETLABEL,BP1, + BP2,ACC,NOCC,VAR,ALISEQ,ALIPOINTER, + SEQPROF,NDEL,NINS,ENTROPY,RELENT,CONSWEIGHT, + INSNUMBER,INSALI,INSPOINTER,INSLEN,INSBEG_1, + INSBEG_2,INSBUFFER,LCONSERV,LHSSP_LONG_ID) IF (LERROR) THEN WRITE(*,*)'*** ERROR reading HSSP-file (after READHSSP)' STOP ENDIF C---- C---- read 'old' database specification and parameter C---- CALL OPEN_FILE(KIN,HSSPFILE,'old,readonly',lerror) DO WHILE(INDEX(LINE,'HSSP').EQ.0) READ(KIN,'(A)')LINE ENDDO I=LEN(HSSPLINE) HSSPLINE(1:I)=LINE(1:I) DO WHILE (INDEX(LINE,'SEQBASE').EQ.0 .AND. + INDEX(LINE,'DATABASE').EQ.0) READ(KIN,'(A)')LINE ENDDO I=LEN(DATABASE) DATABASE(1:I)=LINE(1:I) LINE=' ' ILINE=1 DO WHILE (INDEX(LINE,'THRESHOLD').EQ.0) READ(KIN,'(A)')LINE IF (INDEX(LINE,'PARAMETER').NE.0) THEN CPARAMETER(ILINE)=LINE(11:) ILINE=ILINE+1 ENDIF ENDDO NPARALINE=ILINE-1 CLOSE(KIN) NRES=SEQLENGTH+KCHAIN-1 C---- C---- read SIMILARITY MATRIX C---- CALL GETSIMMETRIC(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES,NIOSTATES,NSTRSTATES,NIOSTATES, + CSTRSTATES,CIOSTATES,IORANGE,KMAT,METRICFILE,MATRIX) C---- CALL SCALEMETRIC(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + MATRIX,SMIN,SMAX,0.0,0.0) C---- C---- GET ACTUAL DATE C---- CDATE=' ' CALL GETDATE (CDATE) C---- C---- ------------------------------------------------------------------ C---- check all pairwise identities, and exclude all pairs i,j with: C---- j > i, and identity(i,j) > IDEMAX C---- ------------------------------------------------------------------ IF (IDEMAX.LT.1) THEN CALL DIST_TABLE_EXCLUDE(IDEMAX,NALIGN,NRES,IDE,IFIR,ILAS, + LALI,ALIPOINTER,ALISEQ,EXCLUDEFLAG,NTRANS,TRANS) END IF C---- C---- ------------------------------------------------------------------ C---- get number of alignments after 'clean up' C---- ------------------------------------------------------------------ WRITE(*,*)'=================================================='// + '============================' NALIGN_FILTER=0 DO IALIGN=1,NALIGN IF ( EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN IF (LNEWCURVE) THEN CALL CHECKHSSPCUT99(LALI(IALIGN),IDE(IALIGN)*100,ISOLEN, + ISOIDE,NSTEP,LFORMULA,LALL,ISAFE,LCONSIDER,DISTANCE) ELSE CALL CHECKHSSPCUT(LALI(IALIGN),IDE(IALIGN)*100,ISOLEN, + ISOIDE,NSTEP,LFORMULA,LALL,ISAFE,LCONSIDER,DISTANCE) ENDIF IF (LCONSIDER) THEN NALIGN_FILTER=NALIGN_FILTER+1 C---- C---- for exclusion of identical alignments C---- IF (LIDE_100) THEN IF (IDE(IALIGN).EQ.1.0 .AND. IFIR(IALIGN).EQ.1 .AND. + ILAS(IALIGN) .EQ. NRES ) THEN EXCLUDEFLAG(IALIGN)='*' WRITE(*,'(i6,a)')ialign,'. ALIGNMENT excluded: '// + 'identical to master' NALIGN_FILTER=NALIGN_FILTER-1 ELSE C---- C---- loop over other alignments and check for identity C---- DO JALIGN=IALIGN+1,NALIGN IF (EXCLUDEFLAG(JALIGN) .EQ. ' ') THEN IF (IFIR(IALIGN) .EQ. IFIR(JALIGN) .AND. + ILAS(IALIGN) .EQ. ILAS(JALIGN) ) THEN LSAME=.TRUE. I=0 ILEN=ILAS(IALIGN)-IFIR(IALIGN)+1 DO WHILE( LSAME .AND. I .LE. ILEN) IF (ALISEQ(ALIPOINTER(IALIGN)+I).NE. + ALISEQ(ALIPOINTER(JALIGN)+I)) THEN LSAME=.FALSE. ENDIF I=I+1 ENDDO IF (LSAME) THEN EXCLUDEFLAG(JALIGN)='*' WRITE(*,'(I6,A,I6)')JALIGN, + ' ALIGNMENT excluded: identical to'// + ' alignment: ',ialign ENDIF ENDIF ENDIF ENDDO ENDIF ENDIF ELSE EXCLUDEFLAG(IALIGN)='*' WRITE(*,'(I6,A)')IALIGN,'. ALIGNMENT excluded: threshold' ENDIF ELSE WRITE(*,'(I6,A)')IALIGN,'. ALIGNMENT excluded: marked' ENDIF ENDDO C---- -------------------------------------------------- C---- write header of HSSP file C---- -------------------------------------------------- IF (CHAINREMARK .NE. ' ') THEN CTEMP(1:)=CHAINREMARK(1:) I=INDEX(CTEMP,':') CHAINREMARK(1:)=CTEMP(I+1:) ENDIF C---- C---- write header C---- CALL HSSPHEADER(KOUT,OUTFILE,HSSPLINE,PDBID,CDATE, + DATABASE,CPARAMETER,NPARALINE, + THRESHOLD,ISAFE,LFORMULA, + HEADER,COMPOUND,SOURCE,AUTHOR,SEQLENGTH, + NCHAIN,KCHAIN,CHAINREMARK,NALIGN_FILTER) C---- END IF empty! IF (NALIGN_FILTER .EQ. 0) THEN WRITE(6,*)'-*- WARNING FILTER_HSSP file empty (no ali found)!' WRITE(KOUT,'(A)')'//' CLOSE(KOUT) STOP ENDIF C---- -------------------------------------------------- C---- conservation weights, profile, and variance C---- -------------------------------------------------- C---- C---- 98-10: br & rs C---- normalise MATRIX -> SIMCONSERV C---- such that SIMCONSERV has an average of 0 C---- CALL SETCONSERVATION(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES,NIOSTATES,CSTRSTATES,CIOSTATES,IORANGE,KMAT, + METRICFILE,MATRIX,SIMCONSERV) C---- C---- 98-10: br C---- 98-10: br & rs C---- compile conservation weights C---- CALL GETCONSWEIGHT_BR(MAXSTRSTATES,MAXIOSTATES, + MAXRES,MAXALIGNS,NTRANS,NASCII,NALIGN,NRES, + IDE,IFIR,ILAS,LALI,ALIPOINTER,PDBSEQ,ALISEQ, + EXCLUDEFLAG,TRANS,MATRIX,SIMCONSERV,MATPOS, + CONSWEIGHT_MIN,CONSWEIGHT) C---- C---- rescale metric C---- CALL SCALEMETRIC(NTRANS,TRANS,MAXSTRSTATES, + MAXIOSTATES,MATRIX,SMIN,SMAX,0.0,0.0) C---- C---- calculate variability C---- CALL CALC_VAR(NALIGN,NRES,PDBSEQ,IDE,IFIR,ILAS, + ALIPOINTER,ALISEQ,EXCLUDEFLAG,MAXSTRSTATES, + MAXIOSTATES,NTRANS,TRANS,MATRIX,VAR) C---- C---- profiles C---- CALL CALC_PROF(MAXRES,MAXAA,NRES,PDBSEQ,NALIGN,EXCLUDEFLAG,IDE, + IFIR,ILAS,ALISEQ,ALIPOINTER,TRANS,SEQPROF, + NOCC,NDEL,NINS,ENTROPY,RELENT) C---- C---- finally write new HSSP file C---- CALL WRITE_HSSP(KOUT,MAXRES,NALIGN,NRES,EMBLID,STRID,ACCNUM,IDE, + SIM,IFIR,ILAS,JFIR,JLAS,LALI,NGAP,LGAP,LENSEQ, + PROTNAME,ALIPOINTER,ALISEQ,PDBNO,CHAINID,PDBSEQ, + SECSTR,COLS,BP1,BP2,SHEETLABEL,ACC,NOCC,VAR, + SEQPROF,NDEL,NINS,ENTROPY,RELENT,CONSWEIGHT, + INSNUMBER,INSALI,INSPOINTER,INSLEN,INSBEG_1, + INSBEG_2,INSBUFFER,ISOLEN,ISOIDE,NSTEP, + LFORMULA,LALL,ISAFE,EXCLUDEFLAG,LCONSERV, + LHSSP_LONG_ID) C---- C---- additionally write table with sequence identity ij ? C---- IF (LDIST_TABLE) THEN IF (DISTANCE_TABLE .EQ. 'DEFAULT') THEN CALL STRPOS(PDBID,ISTART,ISTOP) DISTANCE_TABLE=PDBID(ISTART:ISTOP)//'_distance.table' ENDIF CALL DIST_TABLE2(KOUT,DISTANCE_TABLE,NALIGN,NRES,PDBID, + EMBLID,IDE,IFIR,ILAS,LALI,ALIPOINTER,ALISEQ, + EXCLUDEFLAG,NTRANS,TRANS) WRITE(6,'(A,A)')'--- distance table into file:',DISTANCE_TABLE ENDIF END C end of FILTER_HSSP C...................................................................... C...................................................................... C SUB DIST_TABLE SUBROUTINE DIST_TABLE(KOUT,FILENAME,NALIGN,NRES, + PDB_ID,EMBL_ID,IDE,IFIR,ILAS,LALI, + ALIPOINTER,ALISEQ,EXCLUDEFLAG,NTRANS,CTRANS) CPLAN speed up this routine C RS May 93 C calculate and write distance table for all pairs of alignend C sequences in one HSSP-file C C---- import IMPLICIT NONE INTEGER KOUT,NALIGN,NRES,NTRANS, + IFIR(*),ILAS(*),LALI(*),ALIPOINTER(*) REAL IDE(*) CHARACTER FILENAME*(*) CHARACTER*(*) ALISEQ(*),EXCLUDEFLAG(*),PDB_ID*(*),EMBL_ID(*) C---- allowed sequence symbols CHARACTER*(*) CTRANS C---- C---- internal C---- INTEGER MAXRES,NASCII,MAXLEN PARAMETER (NASCII= 256) PARAMETER (MAXRES= 9999) C PARAMETER (MAXRES= 10000) PARAMETER (MAXLEN= 20000) INTEGER IALIGN_SEQ(MAXRES),JALIGN_SEQ(MAXRES), + I,IALIGN,JALIGN,ILEN,IRES, + IPOS,JPOS,IBEG,IEND,KPOS,IAGR,ISTEP REAL SEQDIST CHARACTER LINE*(MAXLEN) LOGICAL LERROR C---- only used to get rid of INDEX command (CPU TIME) INTEGER MATPOS(NASCII) C---- ------------------------------------------------------------------ C---- now work on it C---- ------------------------------------------------------------------ WRITE(6,*)' dist_table' C---- C---- calculate variability only for the 22 (BZ) amino acids C---- DO I=1,NASCII MATPOS(I)=0 ENDDO CALL GETPOS(CTRANS(1:22),MATPOS,NASCII) IF (NRES .GT. MAXRES) THEN WRITE(6,*)'ERROR: NRES .GT. MAXRES IN DIST_TABLE' WRITE(6,*)'**** INCREASE MAXRES ****' STOP ENDIF C---- C---- initialize C---- DO I=1,NRES IALIGN_SEQ(I)=0 JALIGN_SEQ(I)=0 ENDDO LINE=' ' ISTEP=14 KPOS=13 C---- C---- output file C---- CALL OPEN_FILE(KOUT,FILENAME,'NEW,RECL=50000',LERROR) CALL STRPOS(PDB_ID,IBEG,IEND) WRITE(KOUT,'(A,A)')PDB_ID(IBEG:IEND),' distance table' WRITE(KOUT,'(A,I6)')'number of alignments: ',nalign C---- C---- first line gives identities and alignment lengths of the HSSP C---- WRITE(LINE(KPOS:),'(A,A)')'|',PDB_ID(1:12) KPOS=KPOS+ISTEP DO IALIGN=1,NALIGN-1 IF ( EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN IF (KPOS .GT. MAXLEN) THEN WRITE(6,*)'*** ERROR: MAXLEN overflow in DIST_TABLE'// + ' (lib-maxhom)' WRITE(6,*)'*** INCREASE DIMENSION to > ',KPOS STOP ENDIF WRITE(LINE(KPOS:),'(A,A)')'|',EMBL_ID(IALIGN) KPOS=KPOS+ISTEP ENDIF ENDDO CALL STRPOS(LINE,IBEG,IEND) WRITE(KOUT,*)LINE(1:IEND) DO I=1,IEND LINE(I:I)='=' ENDDO WRITE(KOUT,*)LINE(1:IEND) C---- ------------------------------------------------------------------ C---- loop over all alignments C---- ------------------------------------------------------------------ DO IALIGN=1,NALIGN IF ( EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN LINE=' ' KPOS=27 C---- C---- first distance to pdb-seq C---- SEQDIST=1.0-IDE(IALIGN) WRITE(LINE(1:),'(A,A,2X,F4.2,1X,I5)') + EMBL_ID(IALIGN),'|',SEQDIST,LALI(IALIGN) IPOS=ALIPOINTER(IALIGN)-IFIR(IALIGN) C---- C---- store alignment sequence in integer array C---- DO IRES=IFIR(IALIGN),ILAS(IALIGN) IALIGN_SEQ(IRES)=MATPOS( ICHAR( ALISEQ(IPOS+IRES) ) ) IF ( IALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(IPOS+IRES) .GE. 'a' .AND. + ALISEQ(IPOS+IRES) .LE. 'z') THEN IALIGN_SEQ(IRES)= + MATPOS(ICHAR(ALISEQ(IPOS+IRES))-32) ENDIF ENDIF ENDDO C---- C---- loop over pair partners C---- DO JALIGN=1,IALIGN-1 IF ( EXCLUDEFLAG(JALIGN) .EQ. ' ') THEN JPOS=ALIPOINTER(JALIGN)-IFIR(JALIGN) DO IRES=IFIR(JALIGN),ILAS(JALIGN) JALIGN_SEQ(IRES)=MATPOS(ICHAR(ALISEQ(JPOS+IRES))) IF ( JALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(JPOS+IRES) .GE. 'a' .AND. + ALISEQ(JPOS+IRES) .LE. 'z') THEN JALIGN_SEQ(IRES)= + MATPOS(ICHAR(ALISEQ(JPOS+IRES))-32) ENDIF ENDIF ENDDO SEQDIST=0.0 IAGR=0 ILEN=0 C---- C---- get distance between overlap of alignend sequences C---- IBEG= MAX(IFIR(IALIGN),IFIR(JALIGN)) IEND= MIN(ILAS(IALIGN),ILAS(JALIGN)) DO IRES= IBEG,IEND IF ( IALIGN_SEQ(IRES) .NE. 0 .AND. + JALIGN_SEQ(IRES) .NE. 0) THEN IF (IALIGN_SEQ(IRES) .EQ. JALIGN_SEQ(IRES)) THEN IAGR=IAGR+1 ENDIF ILEN=ILEN+1 ENDIF ENDDO IF (ILEN .NE. 0) THEN SEQDIST=1.0-(FLOAT(IAGR)/ILEN) ENDIF C---- C---- build up output line C---- WRITE(LINE(KPOS:),'(A,2X,F4.2,1X,I5)')'|',SEQDIST,ILEN KPOS=KPOS+14 IF (KPOS .GT. MAXLEN) THEN WRITE(6,*)'*** ERROR: MAXLEN overflow in'// + ' DIST_TABLE (lib-maxhom)' WRITE(6,*)'*** INCREASE DIMENSION > ',KPOS STOP ENDIF ENDIF ENDDO C---- end loop over pairs CALL STRPOS(LINE,IBEG,IEND) WRITE(KOUT,'(A)')LINE(IBEG:IEND) ENDIF ENDDO C---- end loop over all alignments CLOSE(KOUT) RETURN END C END DIST_TABLE C...................................................................... C...................................................................... C SUB DIST_TABLE2 SUBROUTINE DIST_TABLE2(KOUT,FILENAME,NALIGN,NRES, + PDB_ID,EMBL_ID,IDE,IFIR,ILAS,LALI, + ALIPOINTER,ALISEQ,EXCLUDEFLAG,NTRANS,CTRANS) CPLAN speed up this routine C RS May 93 C calculate and write distance table for all pairs of alignend C sequences in one HSSP-file C C---- import IMPLICIT NONE INTEGER KOUT,NALIGN,NRES,NTRANS, + IFIR(*),ILAS(*),LALI(*),ALIPOINTER(*) REAL IDE(*) CHARACTER FILENAME*(*) CHARACTER*(*) ALISEQ(*),EXCLUDEFLAG(*),PDB_ID*(*),EMBL_ID(*) C---- allowed sequence symbols CHARACTER*(*) CTRANS C---- C---- internal C---- INTEGER MAXRES,NASCII,MAXLEN PARAMETER (NASCII= 256) PARAMETER (MAXRES= 9999) C PARAMETER (MAXRES= 10000) PARAMETER (MAXLEN= 20000) INTEGER IALIGN_SEQ(MAXRES),JALIGN_SEQ(MAXRES), + I,IALIGN,JALIGN,ILEN,IRES, + IPOS,JPOS,IBEG,IEND,KPOS,IAGR,ISTEP REAL SEQDIST CHARACTER LINE*(MAXLEN) LOGICAL LERROR CHARACTER XC C---- only used to get rid of INDEX command (CPU TIME) INTEGER MATPOS(NASCII) C---- ------------------------------------------------------------------ C---- now work on it C---- ------------------------------------------------------------------ C---- spacer XC=CHAR(9) XC=' ' XC='|' WRITE(6,*)' dist_table' C---- C---- calculate variability only for the 22 (BZ) amino acids C---- DO I=1,NASCII MATPOS(I)=0 ENDDO CALL GETPOS(CTRANS(1:22),MATPOS,NASCII) IF (NRES .GT. MAXRES) THEN WRITE(6,*)'ERROR: NRES .GT. MAXRES IN DIST_TABLE' WRITE(6,*)'**** INCREASE MAXRES ****' STOP ENDIF C---- C---- initialize C---- DO I=1,NRES IALIGN_SEQ(I)=0 JALIGN_SEQ(I)=0 ENDDO LINE=' ' ISTEP=12 KPOS=12 C---- ------------------------------------------------------------------ C---- output file C---- CALL OPEN_FILE(KOUT,FILENAME,'NEW,RECL=50000',LERROR) C---- C---- header C---- CALL STRPOS(PDB_ID,IBEG,IEND) WRITE(KOUT,'(A)') '# HSSP_FILTER DISTANCE TABLE' WRITE(KOUT,'(A,A)') '# PDBID ',PDB_ID(IBEG:IEND) WRITE(KOUT,'(A,I6)')'# NALIGN ',NALIGN WRITE(KOUT,'(A,A)') '# NOTATION ','ROWS: i = 1 .. nali' WRITE(KOUT,'(A,A)') '# NOTATION ','EACH ROW : j = i+1 .. nali' WRITE(KOUT,'(A,A)') '# NOTATION ','EACH CELL: PIDE,LALI' C---- C---- how long will lines be? C---- KPOS=ISTEP DO IALIGN=2,NALIGN IF ( EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN IF (KPOS .GT. MAXLEN) THEN WRITE(6,*)'*** ERROR: MAXLEN overflow in DIST_TABLE'// + ' (lib-maxhom)' WRITE(6,*)'*** INCREASE DIMENSION!!' STOP ENDIF KPOS=KPOS+ISTEP ENDIF ENDDO IEND=KPOS C---- beautiful FORTRAN ASCII DO I=1,IEND LINE(I:I)='=' ENDDO WRITE(KOUT,'(A,A)')"# ",LINE(1:IEND) C---- C---- first line gives identities and alignment lengths of the HSSP C---- WRITE(LINE(1:),'(A)')PDB_ID(1:12) KPOS=ISTEP DO IALIGN=2,NALIGN IF ( EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN WRITE(LINE(KPOS:),'(A,A)')XC,EMBL_ID(IALIGN) KPOS=KPOS+ISTEP ENDIF ENDDO CALL STRPOS(LINE,IBEG,IEND) WRITE(KOUT,*)LINE(1:IEND) C---- ------------------------------------------------------------------ C---- loop over all alignments C---- ------------------------------------------------------------------ DO IALIGN=1,NALIGN IF ( EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN LINE=' ' KPOS=1+ISTEP C---- C---- first distance to pdb-seq C---- SEQDIST=1.0-IDE(IALIGN) WRITE(LINE(1:),'(A,A,I4,A1,I5)') + EMBL_ID(IALIGN)(1:ISTEP),XC, + INT(100*SEQDIST),',',LALI(IALIGN) IPOS=ALIPOINTER(IALIGN)-IFIR(IALIGN) C---- C---- store alignment sequence in integer array (for i) C---- DO IRES=IFIR(IALIGN),ILAS(IALIGN) IALIGN_SEQ(IRES)=MATPOS( ICHAR( ALISEQ(IPOS+IRES) ) ) IF ( IALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(IPOS+IRES) .GE. 'a' .AND. + ALISEQ(IPOS+IRES) .LE. 'z') THEN IALIGN_SEQ(IRES)= + MATPOS(ICHAR(ALISEQ(IPOS+IRES))-32) ENDIF ENDIF ENDDO C---- C---- loop over pair partners C---- DO JALIGN=(IALIGN+1),NALIGN IF ( EXCLUDEFLAG(JALIGN) .EQ. ' ') THEN JPOS=ALIPOINTER(JALIGN)-IFIR(JALIGN) C---- C---- store alignment sequence in integer array (for j) C---- DO IRES=IFIR(JALIGN),ILAS(JALIGN) JALIGN_SEQ(IRES)=MATPOS(ICHAR(ALISEQ(JPOS+IRES))) IF ( JALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(JPOS+IRES) .GE. 'a' .AND. + ALISEQ(JPOS+IRES) .LE. 'z') THEN JALIGN_SEQ(IRES)= + MATPOS(ICHAR(ALISEQ(JPOS+IRES))-32) ENDIF ENDIF ENDDO SEQDIST=0.0 IAGR=0 ILEN=0 C---- C---- get distance between overlap of alignend sequences C---- IBEG= MAX(IFIR(IALIGN),IFIR(JALIGN)) IEND= MIN(ILAS(IALIGN),ILAS(JALIGN)) DO IRES= IBEG,IEND IF ( IALIGN_SEQ(IRES) .NE. 0 .AND. + JALIGN_SEQ(IRES) .NE. 0) THEN IF (IALIGN_SEQ(IRES) .EQ. JALIGN_SEQ(IRES)) THEN IAGR=IAGR+1 ENDIF ILEN=ILEN+1 ENDIF ENDDO IF (ILEN .NE. 0) THEN SEQDIST=1.0-(FLOAT(IAGR)/ILEN) ENDIF C---- C---- build up output line C---- C WRITE(LINE(KPOS:),'(A,2X,F4.2,1X,I5)')'|',SEQDIST,ILEN WRITE(LINE(KPOS:),'(A,I4,A1,I5)') + XC, INT(100*SEQDIST),',',ILEN KPOS=KPOS+ISTEP IF (KPOS .GT. MAXLEN) THEN WRITE(6,*)'*** ERROR: MAXLEN overflow in'// + ' DIST_TABLE (lib-maxhom)' WRITE(6,*)'*** INCREASE DIMENSION!!' STOP ENDIF ENDIF ENDDO C---- end loop over pairs CALL STRPOS(LINE,IBEG,IEND) WRITE(KOUT,'(A)')LINE(IBEG:IEND) ENDIF ENDDO C---- end loop over all alignments CLOSE(KOUT) RETURN END C END DIST_TABLE2 C...................................................................... C...................................................................... C SUB DIST_TABLE_EXCLUDE SUBROUTINE DIST_TABLE_EXCLUDE(IDEMAX,NALIGN,NRES,IDE, + IFIR,ILAS,LALI,ALIPOINTER,ALISEQ,EXCLUDEFLAG,NTRANS,CTRANS) C---- C---- BR May 98 C---- C---- Calculate pairwise levels of sequence identity, and returns C---- EXCLUDEFLAG(iali)='*' if the identity of alignment number C---- 'iali' to any other sequence is above IDEMAX (threshold input) C---- C---- out: output is the changed vectore EXCLUDEFLAG C---- C---- import IMPLICIT NONE INTEGER NALIGN,NRES,NTRANS, + IFIR(*),ILAS(*),LALI(*),ALIPOINTER(*) REAL IDE(*),IDEMAX CHARACTER*(*) ALISEQ(*),EXCLUDEFLAG(*) C---- allowed sequence symbols CHARACTER*(*) CTRANS C---- C---- internal C---- INTEGER MAXRES,NASCII,MAXLEN PARAMETER (NASCII= 256) PARAMETER (MAXRES= 9999) C PARAMETER (MAXRES= 10000) PARAMETER (MAXLEN= 100000) C br 2000-04: was 20000! C PARAMETER (MAXLEN= 20000) INTEGER IALIGN_SEQ(MAXRES),JALIGN_SEQ(MAXRES), + I,IALIGN,JALIGN,ILEN,IRES, + IPOS,JPOS,IBEG,IEND,KPOS,IAGR,ISTEP REAL SEQIDE CHARACTER LINE*(MAXLEN) LOGICAL LERROR C---- only used to get rid of INDEX command (CPU TIME) INTEGER MATPOS(NASCII) C---- ------------------------------------------------------------------ C---- now work on it C---- ------------------------------------------------------------------ WRITE(6,*) '--- lib-maxhom:DIST_TABLE_EXCLUDE' C---- C---- calculate variability only for the 22 (BZ) amino acids C---- DO I=1,NASCII MATPOS(I)=0 ENDDO CALL GETPOS(CTRANS(1:22),MATPOS,NASCII) IF (NRES .GT. MAXRES) THEN WRITE(6,*)'*** ERROR: NRES .GT. MAXRES IN DIST_TABLE_EXCLUDE' WRITE(6,*)'*** INCREASE MAXRES to > ',NRES STOP ENDIF C---- C---- initialize C---- DO I=1,NRES IALIGN_SEQ(I)=0 JALIGN_SEQ(I)=0 ENDDO ISTEP=14 KPOS=13 C---- C---- first line gives identities and alignment lengths of the HSSP C---- KPOS=KPOS+ISTEP DO IALIGN=1,NALIGN-1 IF ( EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN IF (KPOS .GT. MAXLEN) THEN WRITE(6,*)'*** ERROR: MAXLEN overflow in'// + ' DIST_TABLE_EXCLUDE (lib-maxhom)' WRITE(6,*)'*** INCREASE DIMENSION to > ',KPOS STOP ENDIF KPOS=KPOS+ISTEP ENDIF ENDDO C---- ------------------------------------------------------------------ C---- loop over all alignments C---- ------------------------------------------------------------------ DO IALIGN=1,NALIGN IF ( EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN KPOS=27 C---- C---- first distance to pdb-seq C---- SEQIDE=IDE(IALIGN) IPOS=ALIPOINTER(IALIGN)-IFIR(IALIGN) C---- C---- store alignment sequence in integer array C---- DO IRES=IFIR(IALIGN),ILAS(IALIGN) IALIGN_SEQ(IRES)=MATPOS( ICHAR( ALISEQ(IPOS+IRES) ) ) IF ( IALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(IPOS+IRES) .GE. 'a' .AND. + ALISEQ(IPOS+IRES) .LE. 'z') THEN IALIGN_SEQ(IRES)= + MATPOS(ICHAR(ALISEQ(IPOS+IRES))-32) ENDIF ENDIF ENDDO C---- C---- loop over pair partners C---- C DO JALIGN=1,IALIGN-1 DO JALIGN=IALIGN+1,NALIGN IF ( EXCLUDEFLAG(JALIGN) .EQ. ' ') THEN JPOS=ALIPOINTER(JALIGN)-IFIR(JALIGN) DO IRES=IFIR(JALIGN),ILAS(JALIGN) JALIGN_SEQ(IRES)=MATPOS(ICHAR(ALISEQ(JPOS+IRES))) IF ( JALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(JPOS+IRES) .GE. 'a' .AND. + ALISEQ(JPOS+IRES) .LE. 'z') THEN JALIGN_SEQ(IRES)= + MATPOS(ICHAR(ALISEQ(JPOS+IRES))-32) ENDIF ENDIF ENDDO SEQIDE=0.0 IAGR=0 ILEN=0 C---- C---- get distance between overlap of alignend sequences C---- IBEG= MAX(IFIR(IALIGN),IFIR(JALIGN)) IEND= MIN(ILAS(IALIGN),ILAS(JALIGN)) DO IRES= IBEG,IEND IF ( IALIGN_SEQ(IRES) .NE. 0 .AND. + JALIGN_SEQ(IRES) .NE. 0) THEN IF (IALIGN_SEQ(IRES) .EQ. JALIGN_SEQ(IRES)) THEN IAGR=IAGR+1 ENDIF ILEN=ILEN+1 ENDIF ENDDO IF (ILEN .NE. 0) THEN SEQIDE=(FLOAT(IAGR)/ILEN) ENDIF C---- exclude if too similar IF (SEQIDE.GE.IDEMAX) THEN EXCLUDEFLAG(JALIGN)='*' ENDIF C---- C---- build up output line C---- KPOS=KPOS+14 IF (KPOS .GT. MAXLEN) THEN WRITE(6,*)'*** ERROR: MAXLEN overflow in'// + ' DIST_TABLE_EXCLUDE (lib-maxhom)' WRITE(6,*)'*** INCREASE DIMENSION > ',KPOS STOP ENDIF ENDIF ENDDO C---- end loop over pairs ENDIF ENDDO C---- end loop over all alignments RETURN END C END DIST_TABLE_EXCLUDE C...................................................................... C...................................................................... C SUB GETCONSWEIGHT_BR SUBROUTINE GETCONSWEIGHT_BR(MAXSTRSTATES,MAXIOSTATES, + MAXRES,MAXALIGNS,NTRANS,NASCII, + NALIGN,NRES,IDE,IFIR,ILAS,LALI,ALIPOINTER,PDBSEQ,ALISEQ, + EXCLUDEFLAG,CTRANS,MATRIX,SIMCONSERV,MATPOS, + CONSWEIGHT_MIN,CONSWEIGHT) C---- C---- BR May 98 + Oct 98 C---- C---- Calculate the conservation weight, taking into account the C---- EXCLUDEFLAG(iali)='*' if the identity of alignment number C---- 'iali' to any other sequence is above IDEMAX (threshold input) C---- C---- C---- out: output is the conservation weight C---- C---- ------------------------------------------------------------------ C---- C---- formula: C---- nali C---- ----- C---- \ (1 - IDE(s,a,b)) * SIM(s,a,b) C---- CW(s)= > ----------------------------- C---- / DISTSUM C---- ----- C---- a,b C---- C---- with: CW(s): conservation weight for residue s C---- a,b: alignment between protein a and b C---- IDE(s,a,b): identity of residue s in a and s in b C---- = 0 || 1 C---- SIM(s,a,b): similarity of residue s in a and s in b C---- C---- DISTSUM(s): sum over all distances at position s C---- and the following definition: C---- C---- nali nres = overlap (a,b) C---- ----- ---- C---- \ \ delta(s,a,b) C---- DISTSUM(s)= > > ------------------------------ C---- / / number of overlapping residues C---- ----- ---- C---- a,b s C---- C---- with: delta(s,a,b) = 1 if residue s in a = residue s in b C---- 0 else C---- C---- normalise: C---- finally normalised weights to have an average of 1: C---- C---- nres C---- ----- C---- \ C---- NORM(s)= > CW(s) C---- / C---- ----- C---- s C---- C---- CW_NORM(s)= CW(s) / NORM C---- C---- C---- ------------------------------------------------------------------ C---- IMPLICIT NONE C---- C---- import C---- C parameters for array INTEGER MAXSTRSTATES,MAXIOSTATES,MAXRES,MAXALIGNS, + NTRANS,NASCII C actual values INTEGER NALIGN,NRES C pointer arrays INTEGER IFIR(MAXALIGNS),ILAS(MAXALIGNS), + ALIPOINTER(MAXALIGNS) C alignment length lali(i) length of ali between PDBSEQ and ali i INTEGER LALI(MAXALIGNS) C guide sequence CHARACTER PDBSEQ(MAXRES) C all aligned sequences CHARACTER*(*) ALISEQ(*) C flags (take ' ', else other) CHARACTER*(*) EXCLUDEFLAG(MAXALIGNS) C percentage sequence identity REAL IDE(MAXALIGNS),IDEMAX C comparison metric REAL MATRIX(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) C normalised to sum over 0 (by SETCONSERVATION) REAL SIMCONSERV(NTRANS,NTRANS) C allowed sequence symbols CHARACTER*(*) CTRANS C minimal conservation weight REAL CONSWEIGHT_MIN C only used to get rid of INDEX command (CPU time) INTEGER MATPOS(NASCII) C---- C---- output from sbr C---- C conservation weight REAL CONSWEIGHT(MAXRES) C---- ------------------------------------------------------------------ C---- internal C---- ------------------------------------------------------------------ C---- C INTEGER NASCII,MAXLEN INTEGER MAXRES_LOC,MAXLEN PARAMETER (MAXRES_LOC= 9999) C PARAMETER (MAXRES_LOC= 10000) PARAMETER (MAXLEN= 20000) INTEGER IALIGN_SEQ(MAXRES_LOC),JALIGN_SEQ(MAXRES_LOC), + I,J,IALIGN,JALIGN,ILEN,IRES, + IPOS,JPOS,IBEG,IEND,KPOS,IAGR,ISTEP,ISAFERANGE, + IALIGN_SEQTMP,JALIGN_SEQTMP INTEGER NOCC(MAXRES_LOC) C---- big: seq ide distance for pair i,j REAL SUM,MEAN,XVAL INTEGER NPOS REAL SIMDIST_POS(MAXRES_LOC) REAL SEQDIST_POS(MAXRES_LOC) REAL SEQDIST CHARACTER LINE*(MAXLEN) LOGICAL LERROR C---- ------------------------------------------------------------------ C---- C---- defaults, ini C---- C---- FORMULA+ISAFERANGE -> include into averaging ISAFERANGE= 5 C---- br: make 'safer' for weights ISAFERANGE= 5 C---- ------------------------------------------------------------------ C---- now work on it C---- ------------------------------------------------------------------ WRITE(6,*) '--- GETCONSWEIGHT_BR begin' C---- C---- calculate variability only for the 22 (BZ) amino acids C---- DO I=1,NASCII MATPOS(I)=0 ENDDO CALL GETPOS(CTRANS(1:22),MATPOS,NASCII) IF (NRES .GT. MAXRES) THEN WRITE(*,*)'*** ERROR: NRES .GT. MAXRES IN GETCONSWEIGHT_BR' WRITE(*,*)'*** INCREASE MAXRES ****' STOP ENDIF C---- C---- ERROR! C---- IF (NRES .LE. 0) RETURN C---- C---- initialize C---- DO I=1,NRES IALIGN_SEQ(I)= 0 JALIGN_SEQ(I)= 0 SIMDIST_POS(I)=0.0 SEQDIST_POS(I)=0.0 NOCC(I)= 0 ENDDO C---- ------------------------------------------------------------------ C---- guide against all C---- ------------------------------------------------------------------ C---- store PDB sequence in integer array DO IRES=1,NRES IALIGN_SEQ(IRES)=MATPOS(ICHAR(PDBSEQ(IRES))) IF ( IALIGN_SEQ(IRES) .EQ. 0) THEN IF ( PDBSEQ(IRES) .GE. 'a' .AND. + PDBSEQ(IRES) .LE. 'z') THEN IALIGN_SEQ(IRES)= + MATPOS(ICHAR(PDBSEQ(IRES))-32) ENDIF ENDIF ENDDO C---- C---- loop over all aligned sequences C---- DO JALIGN=1,NALIGN IF ( EXCLUDEFLAG(JALIGN) .EQ. ' ') THEN C---- store alignment sequence in integer array (for j) JPOS=ALIPOINTER(JALIGN)-IFIR(JALIGN) DO IRES=IFIR(JALIGN),ILAS(JALIGN) JALIGN_SEQ(IRES)=MATPOS(ICHAR(ALISEQ(JPOS+IRES))) IF ( JALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(JPOS+IRES) .GE. 'a' .AND. + ALISEQ(JPOS+IRES) .LE. 'z') THEN JALIGN_SEQ(IRES)= + MATPOS(ICHAR(ALISEQ(JPOS+IRES))-32) ENDIF ENDIF ENDDO C---- C---- passed into SBR: IDE(i) C---- SEQDIST= 1.0 - IDE(JALIGN) C---- C---- get overlap of aligned sequences C---- IBEG= MAX(1, IFIR(JALIGN)) IEND= MIN(NRES,ILAS(JALIGN)) DO IRES= IBEG,IEND IF ( (IALIGN_SEQ(IRES) .NE. 0) .AND. + (JALIGN_SEQ(IRES) .NE. 0) ) THEN C---- C---- count up number of pairs found for current residue C---- NOCC(IRES)=NOCC(IRES)+1 C---- C---- position specific distances C---- SEQDIST_POS(IRES)= + SEQDIST_POS(IRES) + SEQDIST SIMDIST_POS(IRES)= + SIMDIST_POS(IRES) + + SEQDIST * SIMCONSERV(IALIGN_SEQ(IRES), + JALIGN_SEQ(IRES)) END IF END DO ENDIF C---- end of: exclude? ENDDO C---- end loop over all sequences aligned to guide C---- ------------------------------------------------------------------ C---- all against all: get all pair distances C---- ------------------------------------------------------------------ DO IALIGN=1,NALIGN C---- to exclude? IF ( EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN C---- store alignment sequence in integer array (for i) IPOS=ALIPOINTER(IALIGN)-IFIR(IALIGN) DO IRES=IFIR(IALIGN),ILAS(IALIGN) IALIGN_SEQ(IRES)=MATPOS( ICHAR( ALISEQ(IPOS+IRES) ) ) IF ( IALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(IPOS+IRES) .GE. 'a' .AND. + ALISEQ(IPOS+IRES) .LE. 'z') THEN IALIGN_SEQ(IRES)= + MATPOS(ICHAR(ALISEQ(IPOS+IRES))-32) ENDIF ENDIF ENDDO C---- C---- loop over pair partners C---- DO JALIGN=IALIGN+1,NALIGN IF ( EXCLUDEFLAG(JALIGN) .EQ. ' ') THEN C---- store alignment sequence in integer array (for j) JPOS=ALIPOINTER(JALIGN)-IFIR(JALIGN) DO IRES=IFIR(JALIGN),ILAS(JALIGN) JALIGN_SEQ(IRES)=MATPOS(ICHAR(ALISEQ(JPOS+IRES))) IF ( JALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(JPOS+IRES) .GE. 'a' .AND. + ALISEQ(JPOS+IRES) .LE. 'z') THEN JALIGN_SEQ(IRES)= + MATPOS(ICHAR(ALISEQ(JPOS+IRES))-32) ENDIF ENDIF ENDDO C---- C---- get distance between overlap of aligned sequences C---- IAGR=0 ILEN=0 IBEG= MAX(IFIR(IALIGN),IFIR(JALIGN)) IEND= MIN(ILAS(IALIGN),ILAS(JALIGN)) DO IRES=IBEG,IEND IF ( IALIGN_SEQ(IRES) .NE. 0 .AND. + JALIGN_SEQ(IRES) .NE. 0) THEN C---- count up number of pairs found for current residue NOCC(IRES)=NOCC(IRES)+1 IF (IALIGN_SEQ(IRES).EQ.JALIGN_SEQ(IRES)) THEN IAGR=IAGR+1 ENDIF ILEN=ILEN+1 ENDIF ENDDO IF (ILEN .NE. 0) THEN SEQDIST= 1.0 - (FLOAT(IAGR)/ILEN) DO IRES=IBEG,IEND IF ( IALIGN_SEQ(IRES).NE.0 .AND. + JALIGN_SEQ(IRES).NE.0 ) THEN SEQDIST_POS(IRES)= + SEQDIST_POS(IRES) + SEQDIST SIMDIST_POS(IRES)= + SIMDIST_POS(IRES) + + SEQDIST * SIMCONSERV(IALIGN_SEQ(IRES), + JALIGN_SEQ(IRES)) ENDIF END DO END IF ENDIF C---- end of: exclude? ENDDO C---- end loop over pairs ENDIF ENDDO C---- end loop over all alignments C---- ------------------------------------------------------------------ C---- assign conservation weights C---- ------------------------------------------------------------------ DO IRES=1,NRES IF ((SEQDIST_POS(IRES) .GT. 0) .AND. + (NOCC(IRES) .GT. 0) ) THEN CONSWEIGHT(IRES)= + SIMDIST_POS(IRES)/SEQDIST_POS(IRES) C write(6,'(a,i4,a,f8.3,a,f8.3,a,f8.3)')'xx before norm i=', C + ires,' cw=',consweight(ires), C + ' sim=',simdist_pos(ires), C + ' dis=',seqdist_pos(ires) ELSE CONSWEIGHT(IRES)=1.0 END IF C---- C---- no negative values for conservation weight C---- IF (CONSWEIGHT(IRES) .LT. CONSWEIGHT_MIN) THEN CONSWEIGHT(IRES)=CONSWEIGHT_MIN ENDIF END DO C---- ------------------------------------------------------------------ C---- weight conservation weights (average = 1) C---- ------------------------------------------------------------------ NPOS= 0 MEAN= 1.0 SUM= 0.0 DO IRES=1,NRES IF (NOCC(IRES).NE.0) THEN SUM = SUM + CONSWEIGHT(IRES) NPOS=NPOS+1 ENDIF ENDDO IF (NPOS .NE. 0) THEN MEAN=SUM/NPOS ENDIF WRITE(6,*)'GETCONSWEIGHT: SUM,MEAN ',SUM,MEAN IF (MEAN.GT. 0.99 .AND. MEAN .LT. 1.01) RETURN XVAL=1.0-MEAN DO IRES=1,NRES IF (NOCC(IRES).NE.0) CONSWEIGHT(IRES)=CONSWEIGHT(IRES)+XVAL ENDDO WRITE(6,*) '--- GETCONSWEIGHT_BR ended' RETURN END C END GETCONSWEIGHT_BR C...................................................................... C...................................................................... C SUB SETCONSERVATION SUBROUTINE SETCONSERVATION(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES,NIOSTATES,CSTRSTATES,CIOSTATES,IORANGE,KMAT, + METRIC_FILENAME,MATRIX,SIMCONSERV) C 1. set conservation weights to 1.0 C 2. rescale matrix for the 22 amino residues such that the sum over C the matrix is 0.0 (or near) C this matrix is used to calculate the conservation weights (SIMCONSERV) c implicit none C import CHARACTER*(*) METRIC_FILENAME REAL MATRIX(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) REAL SIMCONSERV(1:26,1:26) INTEGER NSTRSTATES,NIOSTATES,KMAT CHARACTER*(*) CSTRSTATES,CIOSTATES REAL IORANGE(MAXSTRSTATES,MAXIOSTATES) C internal INTEGER NACID,MAXRES_LOC PARAMETER (NACID= 22) PARAMETER (MAXRES_LOC= 9999) C PARAMETER (MAXRES_LOC= 10000) CHARACTER*(26) TRANS INTEGER I,J LOGICAL LFIRSTWEIGHT REAL XLOW,XHIGH,XMAX,XMIN,XFACTOR,SUMMAT REAL CONSWEIGHT_1(MAXRES_LOC), + CONSWEIGHT_2(MAXRES_LOC),CONSMIN *----------------------------------------------------------------------* C DO I=1,MAXRES_LOC CONSWEIGHT_1(I)=1.0 ENDDO LFIRSTWEIGHT=.TRUE. C get metric C 98-10 br: already done C CALL GETSIMMETRIC(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, C + NSTRSTATES,NIOSTATES,NSTRSTATES,NIOSTATES, C + CSTRSTATES,CIOSTATES,IORANGE,KMAT,METRIC_FILENAME,MATRIX) c rescale matrix that the sum over matrix is +- 0.0 XLOW= 0.0 XHIGH= 0.0 XMAX= 1.0 XMIN= -1.0 XFACTOR=100.0 C (re)store original values in simconserv() 20 DO J=1,NTRANS DO I=1,NTRANS SIMCONSERV(I,J)=MATRIX(I,J,1,1,1,1) ENDDO ENDDO c scale with xmin/xmax CALL SCALEINTERVAL(SIMCONSERV,NTRANS**2,XMIN,XMAX,XLOW,XHIGH) C RESEt the values for 'X' '!' and '-' I=INDEX(TRANS,'X') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO I=INDEX(TRANS,'!') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO I=INDEX(TRANS,'-') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO I=INDEX(TRANS,'.') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO C calculate sum over matrix (22 amino acids) after scaling SUMMAT=0.0 DO I=1,NACID DO J=1,NACID SUMMAT=SUMMAT+SIMCONSERV(I,J) ENDDO ENDDO cd write(*,*)' sum: ',summat,xmin c check sum=0.0 (+- 0.01) ; if not xmin=xmin/2 ; scale again IF (SUMMAT .GT. 0.01) THEN XMIN=XMIN+(XMIN/XFACTOR) ELSE IF (SUMMAT .LT. -0.01) THEN XMIN=XMIN-(XMIN/XFACTOR) ELSE WRITE(*,*)' SETCONSERVATION: sum over matrix: ',summat WRITE(*,*)' smin is : ',xmin RETURN ENDIF GOTO 20 END C END SETCONSERVATION C...................................................................... profphd-utils-1.0.10/filter_hssp.pod0000644015075101507510000000217612012371464016723 0ustar lkajanlkajan=pod =head1 NAME filter_hssp - filters an HSSP file =head1 SYNOPSIS filter_hssp [OPTION] =head1 DESCRIPTION filter_hssp is used to filters an HSSP file. Instead of using command line arguments it starts a conversation with the user where the desired function can be invoked by answering questions. This program is used by the prof(1) secondary structure, accessibility and transmembrane helix predictor from Burkhard Rost. =head1 OPTIONS None. Answer questions printed on STDOUT by typing your choice into STDIN. =head1 AUTHORS =over =item Reinhard Schneider Mar, 1990 version 1.0 LION, Heidelberg =item Reinhard Schneider Mar, 1997 version 2.0 LION, Heidelberg =item Burkhard Rost May, 1998 version 2.1 EMBL/LION, Heidelberg =item Burkhard Rost Oct, 1998 version 2.2 EMBL/LION, Heidelberg =item Laszlo Kajan TU Muenchen, Germany =item Guy Yachdav CUBIC (Columbia University, NY, USA), Technical University Munich (Munich, DE), BioSof LLC (USA) =back =head1 SEE ALSO prof(1), convert_seq(1) =cut profphd-utils-1.0.10/lib-convert.f0000755015075101507510000214172712012400570016272 0ustar lkajanlkajan*----------------------------------------------------------------------* C---- ------------------------------------------------------------------ C---- contains now all previously needed libraries for Schneider stuff C---- ------------------------------------------------------------------ C...................................................................... C FUN EMPTYSTRING(STRING) FUNCTION EMPTYSTRING(STRING) LOGICAL EMPTYSTRING CHARACTER*(*) STRING EMPTYSTRING=.TRUE. DO I=1,LEN(STRING) IF (STRING(I:I).NE.' ') THEN EMPTYSTRING=.FALSE. GOTO 10 ENDIF ENDDO 10 RETURN END C END EMPTYSTRING C...................................................................... C...................................................................... C FUN LCHAINBREAK LOGICAL FUNCTION LCHAINBREAK (CS,IS) C CS March 1988 C Check for '!', which is DSSP chain break CHARACTER CS*1 LCHAINBREAK=CS .EQ. '!' IF (LCHAINBREAK) THEN WRITE (*,*)'INFO: chain break detected at residue',IS ENDIF RETURN END C END LCHAINBREAK C...................................................................... C...................................................................... C FUN LEGALRES LOGICAL FUNCTION LEGALRES(CS,IS,TRANS,NTRANS,PUNCTUATION) C Brigitte Altenberg Dec 1987, changes by CS March 1988 C Check for legal residues. Unknown residues are reported (warning), C except for declared punctation. CHARACTER*(*) PUNCTUATION,TRANS,CS C Punctations are not reported. They are format-specific. C PIR: PUNCTUATION=' ,.:;()+' LEGALRES=.TRUE. L=INDEX(TRANS(1:NTRANS),CS) IF (L.EQ.0 .AND. PUNCTUATION .NE.' ') THEN M=INDEX(PUNCTUATION,CS) IF (M.EQ.0) THEN WRITE (*,*)'LEGALRES: unknown RESIDUE:',CS, + ': with ASCIIcode: ',ICHAR(CS), + ' after sequence position', IS WRITE (*,*)'CAUTION: GETAASEQ will replace this by "-"' ENDIF LEGALRES=.FALSE. ENDIF RETURN END C END LEGALRES C...................................................................... C...................................................................... C SUB ACC_TO_INT SUBROUTINE ACC_TO_INT(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES,NIOSTATES,IORANGE,NRES,LSQ,LSTR,NACC,LACC) IMPLICIT NONE C import INTEGER NTRANS CHARACTER*(*) TRANS INTEGER MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES,NIOSTATES REAL IORANGE(MAXSTRSTATES,MAXIOSTATES) INTEGER NRES,NACC(*),LSQ(*),LSTR(*) C export INTEGER LACC(*) C internal INTEGER MAXAA PARAMETER (MAXAA= 26) INTEGER ACCMAX(MAXAA),I,IOSTATE,ISTR REAL PER C max. Acc. in order of TRANS (VLIMFWYGAPSTCHRKQENDBZX!-.) C V L I M F W Y G A P S T C 142,164,169,188,197,227,222,84,106,136,130,142 C C H R K Q E N D B Z X ! - . C 135,184,248,205,198,194,157,163,157,194,0,0,0 0 DATA ACCMAX /142,164,169,188,197,227,222,84,106,136,130,142, + 135,184,248,205,198,194,157,163,157,194,0,0,0,0/ IF (TRANS .NE. 'VLIMFWYGAPSTCHRKQENDBZX!-.' ) THEN WRITE(6,*)'*** ERROR: TRANS NOT IN RIGHT ORDER in ACC_TO_INT' STOP ENDIF IF (NTRANS .GT. MAXAA) THEN WRITE(6,*)'*** ERROR: NTRANS .GT. MAXAA IN ACC_TO_INT' STOP ENDIF IF (NIOSTATES .EQ. 1) THEN CALL INIT_INT_ARRAY(1,NRES,LACC,1) RETURN ENDIF DO I=1,NRES IF (LSQ(I) .EQ. 0) THEN LACC(I)=0 ELSE ISTR=LSTR(I) IF (NSTRSTATES .EQ. 1)ISTR=1 IF (ISTR .EQ. 0) THEN WRITE(6,*)'*** ERROR: LSTR .EQ. 0 IN ACC_TO_INT' STOP ENDIF IF (ACCMAX(LSQ(I)) .NE. 0) THEN PER=(NACC(I)*100.0) / ACCMAX(LSQ(I)) IF (PER .GE. 100.0)PER=100.0 IOSTATE=1 DO IOSTATE=1,NIOSTATES IF (PER .LE. IORANGE(ISTR,IOSTATE) ) THEN LACC(I)=IOSTATE GOTO 100 ENDIF ENDDO ELSE LACC(I)=1 ENDIF 100 CONTINUE c100 if (i .le. 10) then c WRITE(6,*)' acctoint I,LSTR,LACC : ',i,iSTR, c + lacc(i) c WRITE(6,*)accmax(lsq(i)),nacc(i),per c endif ENDIF ENDDO RETURN END C END ACC_TO_INT C...................................................................... C...................................................................... C SUB ALISEQENVIRONMENT SUBROUTINE ALISEQENVIRONMENT(MAXRES,MAXALIGNS, 1 NRES,NALIGN,IFIR,ILAS,INSNUMBER,INSALI,INSLEN, 2 INSAP,LINS,NINS,TOTALINSLEN,ERROR) C 21.6.93 IMPLICIT NONE C IMPORT INTEGER MAXRES,MAXALIGNS,NRES,NALIGN,INSNUMBER INTEGER IFIR(*),ILAS(*),INSALI(*),INSLEN(*),INSAP(*) C EXPORT INTEGER TOTALINSLEN(MAXRES) INTEGER*2 NINS(MAXRES,0:MAXALIGNS) LOGICAL ERROR,LINS(MAXRES) C INTERNAL INTEGER*2 INT2_TEMP INTEGER MAXALIGNS_LOC PARAMETER (MAXALIGNS_LOC= 8765) C PARAMETER (MAXALIGNS_LOC= 19999) INTEGER IALIGN,IPOS,IAP,IINS,TIL *----------------------------------------------------------------------* IF ( NALIGN .GT. MAXALIGNS .OR. 1 NALIGN .GT. MAXALIGNS_LOC ) THEN WRITE(6,'(1X,A)') 1 'MAXALIGNS overflow in AliseqEnvironment !' ERROR = .TRUE. RETURN ENDIF IF ( NRES .GT. MAXRES ) THEN WRITE(6,'(1X,A)') 'MAXRES overflow in AliseqEnvironment !' ERROR = .TRUE. RETURN ENDIF IPOS = 0 IINS = 1 DO IAP = 1,NRES TOTALINSLEN(IAP) = 0 NINS(IAP,0) = 0 ENDDO DO IALIGN = 1,NALIGN DO WHILE ( IINS .LT. INSNUMBER .AND. 1 INSALI(IINS) .LT. IALIGN ) IINS = IINS + 1 ENDDO DO IAP = IFIR(IALIGN),ILAS(IALIGN) IPOS = IPOS + 1 IF ( INSALI(IINS) .EQ. IALIGN .AND. 1 INSAP(IINS) .EQ. IAP ) THEN NINS(IAP,IALIGN) = IINS C CONVERSION INT*4 TO INT*2 INT2_TEMP = INSLEN(IINS) NINS(IAP,0) = MAX(NINS(IAP,0),INT2_TEMP) LINS(IAP) = .TRUE. IF ( INSALI(IINS+1) .EQ. IALIGN ) IINS = IINS + 1 ENDIF ENDDO ENDDO TIL = 0 DO IAP = 1,NRES IF ( LINS(IAP) ) TIL = TIL + NINS(IAP,0) TOTALINSLEN(IAP) = TIL ENDDO RETURN END C END ALISEQENVIRONMENT C...................................................................... C...................................................................... C SUB ALITOSTRUCRMS SUBROUTINE ALITOSTRUCRMS(MAXALSQ,MAXSQ,BRKFILE_1,BRKFILE_2, + KBRK,PDBNO_1,CHAINID_1,PDBNO_2,CHAINID_2, + ALI_1,ALI_2,LENALI,IFIR,ILAS,JFIR,JLAS,LCALPHA,RMS) C RS 89 C import an alignment, cut it in pieces (if necessary) and C calculate the RMS between pieces C use routines SETPIECES,GETCOOR,COMPALISTRUC C IMPORT : C BRKFILE_1,BRKFILE_2 : filename of coordinate files C KBRK : unit for coordinate files C ALI_1,ALI_2 : alignment string (see remark in SETPIECES) C LENALI : length of alignment including insertions C IFIR,ILAS : first and last position of seq 1 C JFIR,JLAS : first and lasr position of seq 2 C LCALPHA : compare only C-alpha atoms if true C OUTPUT: C RMS C IMPLICIT NONE C---- import INTEGER MAXALSQ,MAXSQ INTEGER KBRK,LENALI,IFIR,ILAS,JFIR,JLAS CHARACTER*(*) BRKFILE_1,BRKFILE_2 CHARACTER*1 ALI_1(MAXALSQ),ALI_2(MAXALSQ) CHARACTER*1 CHAINID_1(MAXSQ),CHAINID_2(MAXSQ) INTEGER PDBNO_1(MAXSQ),PDBNO_2(MAXSQ) REAL RMS C---- internal parameters INTEGER MXRES,MXATM PARAMETER (MXRES= 9999) C PARAMETER (MXRES= 10000) C PARAMETER (MXRES= 30011) PARAMETER (MXATM=10*MXRES) C---- internal variables C REAL RMS C INTEGER LENALI,IFIR,ILAS,JFIR,JLAS,KBRK C C if true compare only C-alpha LOGICAL LCALPHA CHARACTER*200 BRKBEFORE1,BRKBEFORE2 c alignment C CHARACTER*1 ALI_1(MAXALSQ),ALI_2(MAXALSQ) C CHARACTER*(*) CHAINID_1(*),CHAINID_2(*) C INTEGER PDBNO_1(*),PDBNO_2(*) C very long sequences are cut in pieces INTEGER NSHIFTED COMMON/CSHIFT1/NSHIFTED LOGICAL LSHIFTED COMMON/CSHIFT2/LSHIFTED c molecule attributes C CHARACTER*(*) BRKFILE_1,BRKFILE_2 CHARACTER NAMMOL1(5)*200,NAMMOL2(5)*200 INTEGER NRES_1,NRES_2,NATM1,NATM2 c residue attributes ; number and chain CHARACTER*6 CIDRES_1(MXRES),CIDRES_2(MXRES) C points to first, last and CEN atom. center residue coors INTEGER IPATM1RES(3,MXRES),IPATM2RES(3,MXRES) REAL RRES1(3,MXRES),RRES2(3,MXRES) C atom attributes C atom belongs to res number IPRESATM C atom coors C superposition weights. CHARACTER*4 NAMATM1(MXATM),NAMATM2(MXATM) INTEGER IPRES1ATM(MXATM),IPRES2ATM(MXATM) REAL RATM1(3,MXATM),RATM2(3,MXATM) REAL WSUP1(MXATM),WSUP2(MXATM) c piece attributes INTEGER MXPIECES PARAMETER (MXPIECES=50) INTEGER IRESPIE,NPIECES,NRESPIE,NATMPIE COMMON /CPIECE/IRESPIE(2,2,MXPIECES),NPIECES,NRESPIE(2), + NATMPIE(2) C compare only if sequences of BRK and DSSP are the same LOGICAL LCHECK *----------------------------------------------------------------------* C---- ------------------------------------------------------------------ C C---- ------------------------------------------------------------------ c get pieces from alignment IF (LSHIFTED) THEN RMS=-1.0 RETURN ENDIF CALL SETPIECES(MAXALSQ,ALI_1,ALI_2,LENALI,IFIR,ILAS,JFIR, + JLAS,IRESPIE,MXPIECES,NPIECES) c.get coordinates c if coordinates are still in memory dont read them again IF (BRKFILE_1 .NE. BRKBEFORE1) THEN CALL GETCOORFORHSSP(BRKFILE_1,KBRK,NAMMOL1,NRES_1,NATM1,MXRES, + MXATM,CIDRES_1,IPATM1RES,RRES1, + NAMATM1,IPRES1ATM,RATM1) ENDIF IF (BRKFILE_2.NE.BRKBEFORE2) THEN CALL GETCOORFORHSSP(BRKFILE_2,KBRK,NAMMOL2,NRES_2,NATM2,MXRES, + MXATM,CIDRES_2,IPATM2RES,RRES2, + NAMATM2,IPRES2ATM,RATM2) ENDIF IF (NRES_1.EQ.0 .OR. NRES_2.EQ.0) THEN WRITE(6,*)'**** IN ALITOSTRUCRMS *****' WRITE(6,*)' READ ERROR IN FILE: ',BRKFILE_1,' OR ',BRKFILE_2 WRITE(6,*)' STRUCTURE ALIGNMENT SKIPPED ' RMS=-1.0 RETURN ELSE BRKBEFORE1=BRKFILE_1 BRKBEFORE2=BRKFILE_2 ENDIF CALL CHECKPOSITION(PDBNO_1,CHAINID_1,PDBNO_2,CHAINID_2, + CIDRES_1,CIDRES_2,NRES_1,NRES_2,LCHECK) IF (LCHECK) THEN CALL COMPALISTRUC(BRKFILE_1,BRKFILE_2, + NRES_1,NRES_2,NATM1,NATM2, + IPATM1RES,IPATM2RES,RRES1,RRES2, + RATM1,RATM2,WSUP1,WSUP2,LCALPHA,RMS) ELSE RMS=-1.0 ENDIF RETURN END C END ALITOSTRUCRMS C...................................................................... C...................................................................... C SUB ASCIIFILTER SUBROUTINE ASCIIFILTER(LINE) C Chris Sander, May 1986 (changed by RS 92) C replaces non-printable characters by blanks. C specification in terms of ASCII-table integers C system and choice dependent PARAMETER (LOWLIMIT= 32) PARAMETER (HILIMIT= 126) c import CHARACTER*(*) LINE *----------------------------------------------------------------------* CALL STRPOS(LINE,IBEG,IEND) DO I=IBEG,IEND IASCII=ICHAR(LINE(I:I)) IF ( IASCII .LT. LOWLIMIT .OR. IASCII .GT. HILIMIT ) THEN LINE(I:I)=' ' WRITE(6,*)'* ASCIIFILTER: funny character replaced by blank' WRITE(6,*)' integer value is: ',IASCII ENDIF ENDDO RETURN END C END ASCII-FILTER C...................................................................... C...................................................................... C SUB CALC_PROF SUBROUTINE CALC_PROF(MAXRES,MAXAA,NRES,PDBSEQ,NALIGN, + EXCLUDEFLAG,IDE,IFIR,ILAS,ALISEQ,ALIPOINTER,TRANS, + SEQPROF,NOCC,NDEL,NINS,ENTROPY,RELENT) IMPLICIT NONE C import REAL IDE(*) INTEGER MAXRES,MAXAA,NRES,NALIGN, + IFIR(*),ILAS(*),ALIPOINTER(*) CHARACTER PDBSEQ(*),ALISEQ(*),EXCLUDEFLAG(*) CHARACTER*(*) TRANS C export INTEGER SEQPROF(MAXRES,MAXAA),RELENT(*), + NDEL(*),NINS(*),NOCC(*) REAL ENTROPY(*) C internal INTEGER NASCII,MAXALIGNS_LOC PARAMETER (NASCII= 256) PARAMETER (MAXALIGNS_LOC= 8765) C PARAMETER (MAXALIGNS_LOC= 19999) REAL SUMENTROPY,X,XENTROPY,XMAXENTROPY INTEGER IRES,IALIGN,IPOS,I,J, + LOWERPOS(NASCII),ITEST INTEGER*2 INS_START(MAXALIGNS_LOC) CHARACTER C1,LOWER*26 *----------------------------------------------------------------------* WRITE(6,*)' CALC_PROF' IF (NALIGN .GT. MAXALIGNS_LOC) THEN WRITE(6,*)' CALC_PROF: MAXALIGNS_LOC overflow' STOP ELSE IF (NALIGN .LE. 0) THEN RETURN ENDIF C used to convert lower case characters from the DSSP-seq to 'C' (Cys) LOWER='abcdefghijklmnopqrstuvwxyz' CALL GETPOS(LOWER,LOWERPOS,NASCII) C initialize DO I=1,MAXRES DO J=1,MAXAA SEQPROF(I,J)=0 ENDDO NOCC(I)=0 NDEL(I)=0 NINS(I)=0 ENTROPY(I)=0 RELENT(I)=0 ENDDO DO IALIGN=1,NALIGN INS_START(IALIGN)=0 ENDDO C CALCULATE SEQUENCE PROFILE AND ENTROPY SUMENTROPY=0.0 DO IRES=1,NRES C residue of DSSP-sequence (SEQ1) C1=PDBSEQ(IRES) C convert lower case character in DSSP to 'Cys' I=LOWERPOS(ICHAR(C1)) IF (I.NE.0) C1='C' CALL GETSEQPROF(C1,TRANS,IRES,NOCC,SEQPROF,MAXRES,MAXAA) C residues of aligned sequences DO IALIGN=1,NALIGN IF ( EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN IF (IRES.GE.IFIR(IALIGN).AND. + IRES.LE.ILAS(IALIGN)) THEN IPOS=ALIPOINTER(IALIGN)+IRES-IFIR(IALIGN) C1=ALISEQ(IPOS) ELSE C1=' ' ENDIF I=LOWERPOS(ICHAR(C1)) C if lower case character: insertions IF (I.NE.0 .AND. INS_START(IALIGN) .EQ. 0) THEN NINS(IRES)=NINS(IRES)+1 CALL LOWTOUP(C1,1) INS_START(IALIGN)=1 ELSE IF (INS_START(IALIGN) .EQ. 1) THEN INS_START(IALIGN)=0 ENDIF IF (C1 .NE.' ' ) THEN IF (C1.NE.'.') THEN CALL GETSEQPROF(C1,TRANS,IRES,NOCC, + SEQPROF,MAXRES,MAXAA) ELSE C if '.' : deletion NDEL(IRES)=NDEL(IRES)+1 ENDIF ENDIF ENDIF ENDDO ENDDO C calculate ENTROPY DO IRES=1,NRES SUMENTROPY=0.0 IF (NOCC(IRES).GT.1) THEN DO I=1,MAXAA IF (SEQPROF(IRES,I).NE. 0) THEN X=FLOAT (SEQPROF(IRES,I)) / FLOAT (NOCC(IRES)) XENTROPY=X * (-LOG(X)) SUMENTROPY=SUMENTROPY+XENTROPY ENDIF ENDDO ENTROPY(IRES)=SUMENTROPY IF (NOCC(IRES).LE.20) THEN XMAXENTROPY = -LOG (1 / FLOAT(NOCC(IRES))) ELSE C log(0.05) = ln (1/20) XMAXENTROPY = -LOG(0.05) ENDIF RELENT(IRES)=NINT(SUMENTROPY*100/ XMAXENTROPY) ENDIF ENDDO C normalize sequence profile DO IRES=1,NRES DO I=1,MAXAA IF (NOCC(IRES).GE.1) THEN X=FLOAT(SEQPROF(IRES,I)) *100.0 / FLOAT(NOCC(IRES)) SEQPROF(IRES,I)=NINT(X) ENDIF ENDDO C ITEST=0 C DO I=1,MAXAA C ITEST=ITEST+SEQPROF(IRES,I) C ENDDO C IF (ITEST .NE. 100) THEN C WRITE(6,*)'calc_prof: itest .ne. 100: ',itest C WRITE(6,*)ires,nocc(ires) C ENDIF ENDDO RETURN END C END CALCPROFILE C...................................................................... C...................................................................... C SUB CALC_VAR SUBROUTINE CALC_VAR(NALIGN,NRES,PDBSEQ,IDE,IFIR,ILAS, + ALIPOINTER,ALISEQ,EXCLUDEFLAG, + MAXSTRSTATES,MAXIOSTATES,NTRANS,MATSEQ, + MATRIX,VAR) C---- import IMPLICIT NONE INTEGER NALIGN,NRES,NTRANS,MAXSTRSTATES,MAXIOSTATES, + IFIR(*),ILAS(*),ALIPOINTER(*) CHARACTER PDBSEQ(*), ALISEQ(*),EXCLUDEFLAG(*) REAL IDE(*) C used for variability CHARACTER*(*) MATSEQ REAL MATRIX(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) C---- export INTEGER VAR(*) C---- internal INTEGER MAXRES,NASCII PARAMETER (NASCII= 256) PARAMETER (MAXRES= 9999) C PARAMETER (MAXRES= 10000) C PARAMETER (MAXRES= 30011) INTEGER I,J,IALIGN,JALIGN,ILEN,IRES, + IPOS,JPOS,IBEG,IEND,IAGR,ICYS,MALIGN,KALIGN, + IPDB_SEQ(MAXRES), + IALIGN_SEQ(MAXRES),JALIGN_SEQ(MAXRES) REAL SUMVAR(MAXRES),SUMDIST(MAXRES), + TMPVAL(MAXRES),SEQDIST LOGICAL LEGALRES(MAXRES) C value of best match REAL VALMAX C only used to get rid of INDEX command (CPU time) INTEGER MATPOS(NASCII),LOWERPOS(NASCII) CHARACTER LOWER*26 *----------------------------------------------------------------------* C---- ------------------------------------------------------------------ WRITE(6,*)' calc_var' C used to convert lower case characters from the DSSP-seq to 'C' (Cys) LOWER='abcdefghijklmnopqrstuvwxyz' CALL GETPOS(LOWER,LOWERPOS,NASCII) C calculate variability only for the 22 (BZ) amino acids DO I=1,NASCII MATPOS(I)=0 ENDDO CALL GETPOS(MATSEQ(1:22),MATPOS,NASCII) IF (NRES .GT. MAXRES) THEN WRITE(6,*)'ERROR: nres.gt.maxres in calc_var' WRITE(6,*)'**** increase maxres ****' STOP ENDIF C---- initialise VALMAX=0.0 DO I=1,NTRANS DO J=1,NTRANS IF (MATRIX(J,I,1,1,1,1) .GT. VALMAX) THEN VALMAX=MATRIX(J,I,1,1,1,1) ENDIF ENDDO ENDDO DO I=1,NRES VAR(I)=0 SUMVAR(I)=0.0 SUMDIST(I)=0.0 IALIGN_SEQ(I)=0 JALIGN_SEQ(I)=0 ENDDO IF (NALIGN .LE. 0) RETURN C..................................................... C CALCULATE VARIABILITY C variability= distance(k,l) * matrix(i,j,1,1,1,1) C k,l = sequence C i,j = residue C distance= 1-(matches/length) C length=length of alignment - gaps C C convert DSSP-seq and first 'good' alignment seq to integers ICYS=MATPOS(ICHAR('C')) DO I=1,NRES IPDB_SEQ(I)=MATPOS( ICHAR(PDBSEQ(I) ) ) IF ( IPDB_SEQ(I) .EQ. 0) THEN J=LOWERPOS( ICHAR(PDBSEQ(I)) ) IF (J .NE. 0) IPDB_SEQ(I)=ICYS ENDIF ENDDO C find last alignment to be considered and store sequence of last C alignment in ialign_seq for first iteration of next loop MALIGN=0 DO IALIGN=1,NALIGN IF ( EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN MALIGN=IALIGN ENDIF ENDDO IALIGN=MALIGN C---- BR 99.09: correct if none found IF (IALIGN .EQ. 0) RETURN IPOS=ALIPOINTER(IALIGN)-IFIR(IALIGN) DO IRES=IFIR(IALIGN),ILAS(IALIGN) IF (IRES .GT. 0) THEN IALIGN_SEQ(IRES)=MATPOS( ICHAR( ALISEQ(IPOS+IRES) ) ) IF ( IALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(IPOS+IRES) .GE. 'a' .AND. + ALISEQ(IPOS+IRES) .LE. 'z') THEN IALIGN_SEQ(IRES)=MATPOS(ICHAR(ALISEQ(IPOS+IRES))-32) ENDIF ENDIF END IF ENDDO C loop from last 'good' alignment till first DO IALIGN=MALIGN,1,-1 IF ( IALIGN .GT. 0 .AND. + EXCLUDEFLAG(IALIGN) .EQ. ' ' .AND. + IFIR(IALIGN) .GT. 0) THEN C distance between PDBseq and alignment SEQDIST=1.0-IDE(IALIGN) C accumulate distance etc. DO IRES=IFIR(IALIGN),ILAS(IALIGN) IF (IPDB_SEQ(IRES).NE.0.AND.IALIGN_SEQ(IRES).NE.0) THEN SUMVAR(IRES)=SUMVAR(IRES) + + (SEQDIST * + MATRIX(IPDB_SEQ(IRES),IALIGN_SEQ(IRES),1,1,1,1)) SUMDIST(IRES)=SUMDIST(IRES)+SEQDIST ENDIF ENDDO ENDIF C pairwise comparison of alignend sequences from first to "ialign" C store last 'good' alignment before "ialign" in "kalign" so we can C use the last "jalign"-seq for the next iteration of the "ialign"-seq KALIGN=0 DO JALIGN=1,IALIGN-1 IF ( EXCLUDEFLAG(JALIGN) .EQ. ' ') THEN KALIGN=JALIGN JPOS=ALIPOINTER(JALIGN)-IFIR(JALIGN) DO IRES=IFIR(JALIGN),ILAS(JALIGN) IF (IRES .GT. 0) THEN JALIGN_SEQ(IRES)= + MATPOS( ICHAR( ALISEQ(JPOS+IRES) ) ) IF ( JALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(JPOS+IRES) .GE. 'a' .AND. + ALISEQ(JPOS+IRES) .LE. 'z') THEN JALIGN_SEQ(IRES)= + MATPOS(ICHAR(ALISEQ(JPOS+IRES))-32) ENDIF ENDIF ENDIF ENDDO SEQDIST=0.0 IAGR=0 ILEN=0 C get distance between overlap of alignend seqs IBEG= MAX(IFIR(IALIGN),IFIR(JALIGN)) IEND= MIN(ILAS(IALIGN),ILAS(JALIGN)) DO IRES= IBEG,IEND IF (IRES .GT. 0) THEN LEGALRES(IRES)=.FALSE. IF ( IALIGN_SEQ(IRES) .NE. 0 .AND. + JALIGN_SEQ(IRES) .NE. 0) THEN LEGALRES(IRES)=.TRUE. IF (IALIGN_SEQ(IRES) .EQ. JALIGN_SEQ(IRES)) + IAGR=IAGR+1 TMPVAL(IRES)= + MATRIX(IALIGN_SEQ(IRES), + JALIGN_SEQ(IRES),1,1,1,1) ILEN=ILEN+1 ENDIF ENDIF ENDDO IF (ILEN .NE. 0) THEN SEQDIST=1.0-(FLOAT(IAGR)/ILEN) DO IRES=IBEG,IEND IF (LEGALRES(IRES)) THEN SUMDIST(IRES)=SUMDIST(IRES)+SEQDIST SUMVAR(IRES)=SUMVAR(IRES)+(SEQDIST*TMPVAL(IRES)) ENDIF ENDDO ENDIF ENDIF ENDDO IF (KALIGN .GT. 0) THEN DO I=IFIR(KALIGN),ILAS(KALIGN) IF (I .GT. 0) THEN IALIGN_SEQ(I)=JALIGN_SEQ(I) ENDIF ENDDO ENDIF ENDDO C calculate variability DO IRES=1,NRES IF (SUMDIST(IRES) .NE. 0.0) THEN VAR(IRES)=NINT((VALMAX- (SUMVAR(IRES)/SUMDIST(IRES)) )*100) ENDIF ENDDO RETURN END C END CALC_VAR C...................................................................... C...................................................................... C SUB CHARARRAYREPL c subroutine CHARARRAYREPL(string,length,c1,c2) c Implicit None C replaces all occurences of c1 by c2 C Import c integer length c character*1 c1, c2 C Import/Export c character string(*) C Internal c integer ipos c do ipos = 1,length c if ( string(ipos) .eq. c1 ) string(ipos) = c2 c enddo c return c end C END CHARARRAYREPL C...................................................................... C...................................................................... C SUB CHECKFORMAT SUBROUTINE CHECKFORMAT(IN,INNAME,FORMATNAME,ERRFLAG) C CHECK IF FORMAT ONE OF :DSSP,PIR,EMBL,GCG OR SOMETHING NOT SPECIFIED LOGICAL ERRFLAG CHARACTER*(*) FORMATNAME,INNAME CHARACTER*1000 FILENAME,LINE FORMATNAME='UNK' LINE=' ' FILENAME=' ' I=INDEX(INNAME,'_!_') J=0 K=0 C J=INDEX(INNAME,'hssp_') C K=INDEX(INNAME,'dssp_') L=INDEX(INNAME,'dssp_ca_') M=INDEX(INNAME,'dssp_mod') IF (I.NE.0) THEN FILENAME=INNAME(:I-1) ELSE IF (J .NE. 0) THEN FILENAME=INNAME(:J+3) ELSE IF ( (K .NE. 0) .AND. (L .LE. 0) .AND. (M .EQ. 0) ) THEN FILENAME=INNAME C FILENAME=INNAME(:K+3) ELSE FILENAME=INNAME ENDIF CALL OPEN_FILE(IN,FILENAME,'READONLY,OLD',ERRFLAG) IF (ERRFLAG) THEN WRITE(6,*)' open file error in CHECKFORMAT' GOTO 11 ENDIF I=0 LENGTH=LEN(LINE) DO WHILE(.TRUE.) I=I+1 READ(IN,'(A)',END=99) LINE IF (INDEX(LINE(:2),'ID').NE.0) THEN DO WHILE (.TRUE.) C LOOK FOR DIFF:EMBL,GCG READ (IN,'(A)',END=10)LINE IF (INDEX(LINE,'..').NE.0) THEN C there are still some swissprot files with '..' CALL LOWTOUP(LINE,LENGTH) IF (INDEX(LINE,'CHECK:').NE.0 .AND. + INDEX(LINE,'MSF:').NE.0) THEN FORMATNAME='MSF' GOTO 99 ELSE IF (INDEX(LINE,'CHECK:').NE.0) THEN FORMATNAME='GCG' GOTO 99 ENDIF ENDIF ENDDO 10 FORMATNAME='EMBL' ENDIF IF (INDEX(LINE,'PROGRAM DSSP,').NE.0) THEN FORMATNAME='DSSP' GOTO 99 ELSE IF (INDEX(LINE,'-PROFILE').NE.0) THEN FORMATNAME='PROFILE' IF (INDEX(LINE,'SECONDARY').NE.0) THEN FORMATNAME='PROFILE-DSSP' ENDIF GOTO 99 ELSE IF (INDEX(LINE(1:5),'HSSP ').NE.0) THEN FORMATNAME='HSSP' GOTO 99 ELSE IF ( (I.EQ.1) .AND. (LINE(1:6) .EQ. 'HEADER') ) THEN FORMATNAME='BRK' GOTO 99 ELSE IF (INDEX(LINE,'..').NE.0) THEN CALL LOWTOUP(LINE,LENGTH) IF (INDEX(LINE,'CHECK:').NE.0 .AND. + INDEX(LINE,'MSF:').NE.0) THEN FORMATNAME='MSF' GOTO 99 ELSE IF (INDEX(LINE,'CHECK:').NE.0) THEN FORMATNAME='GCG' GOTO 99 ENDIF ELSE IF ( LINE(1:1) .EQ. '>') THEN FORMATNAME='FASTA' READ(IN,'(A)',END=99)LINE IF (LINE .NE. ' ') THEN CALL LOWTOUP(LINE,LEN(LINE)) CALL STRPOS(LINE,ISTART,ISTOP) DO I=ISTART,ISTOP IASCII=ICHAR(LINE(I:I)) IF ( (IASCII .EQ. 85) .OR. + (IASCII .EQ. 79) .OR. + (IASCII .EQ. 74) .OR. + (IASCII .GE. 33 .AND. IASCII .LE. 64) .OR. + (IASCII .GE. 91 ) ) THEN FORMATNAME='PIR' GOTO 20 ENDIF ENDDO ELSE FORMATNAME='PIR' ENDIF 20 DO WHILE(.TRUE.) READ(IN,'(A)',END=99)LINE IF ( LINE(1:1) .EQ. '>') THEN FORMATNAME='FASTA-DB' GOTO 99 ENDIF ENDDO GOTO 99 ELSE IF (LINE(1:1) .EQ. '*') THEN FORMATNAME='STAR' GOTO 99 ENDIF ENDDO 99 CLOSE (IN) RETURN 11 RETURN END C END CHECKFORMAT C...................................................................... C...................................................................... C SUB CHECKHSSPCUT SUBROUTINE CHECKHSSPCUT(LEN,IDENTITY,ISOLEN,ISOIDE,NSTEP, + LFORMULA,LALL,ISAFE,LCONSIDER,DISTANCE) C RS 89 C check if sequence identity <==> length of alignment are in the 'good' C part of the HSSP-PLOT C if OK : LCONSIDER= TRUE IMPLICIT NONE INTEGER I,LEN,IRANGE,JRANGE REAL IDENTITY,DISTANCE,Y LOGICAL LCONSIDER,LFORMULA,LALL INTEGER ISOLEN(*),NSTEP,ISAFE REAL ISOIDE(*) LCONSIDER=.FALSE. DISTANCE=0.0 C equation from cutoffs in the HSSP-plot IF (LFORMULA .OR. LALL) THEN IF (LEN.LT.10) THEN IF (.NOT. LFORMULA)LCONSIDER=.TRUE. RETURN ENDIF IF (LEN.GT.200) THEN Y= 24.767 + ISAFE C DISTANCE IS ALWAYS DISTANCE FROM ORIGINAL CURVE DISTANCE=IDENTITY - (290.15* (200**(-0.56158)) ) ELSE Y=( 290.15* (LEN**(-0.56158)) ) + ISAFE C distance is always distance from original curve DISTANCE=IDENTITY - (290.15* (LEN**(-0.56158)) ) ENDIF IF (IDENTITY .GE. Y)LCONSIDER=.TRUE. IF (.NOT. LFORMULA)LCONSIDER=.TRUE. RETURN ELSE C dont consider alignments less than smallest length in datafile IF (LEN .GE. ISOLEN(1)) THEN DO I=1,NSTEP IRANGE=ISOLEN(I) C if length is longer than longest specified set upper range to LENGTH+1 IF (I.NE.NSTEP) THEN JRANGE=ISOLEN(I+1) ELSE JRANGE=LEN+1 ENDIF IF (LEN .GE. IRANGE .AND. LEN .LT. JRANGE) THEN C if identity .GE. than ISOSIG-data IF (IDENTITY.GE.ISOIDE(I)) THEN LCONSIDER=.TRUE. DISTANCE=IDENTITY-ISOIDE(I) CD WRITE(6,*)len,identity,isolen(i),isoide(i) GOTO 10 ENDIF ENDIF ENDDO ELSE LCONSIDER=.FALSE. ENDIF 10 RETURN ENDIF END C END CHECKHSSPCUT C...................................................................... C...................................................................... C SUB CHECKHSSPCUT99 SUBROUTINE CHECKHSSPCUT99(LEN,IDENTITY,ISOLEN,ISOIDE,NSTEP, + LFORMULA,LALL,ISAFE,LCONSIDER,DISTANCE) C RS 1989 C BR 2003 C check if sequence identity <==> length of alignment are in the 'good' C part of the HSSP-PLOT C now the new one taken: C C pide= 480 * L ^ { -0.32 (1 + e ^-(L/1000)) } C C if OK (ie the particular pair IS taken): LCONSIDER= TRUE C---- local variables IMPLICIT NONE INTEGER I,LEN,IRANGE,JRANGE REAL IDENTITY,DISTANCE,Y,X1,X2 LOGICAL LCONSIDER,LFORMULA,LALL INTEGER ISOLEN(*),NSTEP,ISAFE REAL ISOIDE(*) ******------------------------------*-----------------------------****** C---- defaults LCONSIDER= .FALSE. DISTANCE= 0.0 C equation from cutoffs in the HSSP-plot IF (LFORMULA .OR. LALL) THEN Cbr- <= 11 is too short! IF (LEN.LE.11) THEN IF (.NOT. LFORMULA) LCONSIDER=.TRUE. RETURN ENDIF Cbr-- > 450 saturation at 19.5 IF (LEN.GT.450) THEN Y= 19.5 + ISAFE Cbr-- distance is NOT ALWAYS distance from curve DISTANCE=IDENTITY - 19.5 ELSE Cbr-- exponential function X1=-1*0.32*( 1 + EXP(-1*(REAL(LEN))/1000) ) X2=480 * (LEN**(X1)) C Y=(480 * (LEN**(-0.32* (1+EXP(-1*(LEN/1000))) )) )+ISAFE Y= X2 + ISAFE C distance is always distance from original curve C DISTANCE=IDENTITY - C + (480 * (LEN**(-0.32* (1+EXP(-1*(LEN/1000))) )) ) DISTANCE=IDENTITY - X2 ENDIF IF (IDENTITY .GE. Y) LCONSIDER=.TRUE. IF (.NOT. LFORMULA) LCONSIDER=.TRUE. RETURN ELSE C dont consider alignments less than smallest length in datafile IF (LEN .GE. ISOLEN(1)) THEN DO I=1,NSTEP IRANGE=ISOLEN(I) C if length is longer than longest specified set upper range to LENGTH+1 IF (I.NE.NSTEP) THEN JRANGE=ISOLEN(I+1) ELSE JRANGE=LEN+1 ENDIF IF (LEN .GE. IRANGE .AND. LEN .LT. JRANGE) THEN C if identity .GE. than ISOSIG-data IF (IDENTITY.GE.ISOIDE(I)) THEN LCONSIDER=.TRUE. DISTANCE=IDENTITY-ISOIDE(I) CD WRITE(6,*)len,identity,isolen(i),isoide(i) GOTO 10 ENDIF ENDIF ENDDO ELSE LCONSIDER=.FALSE. ENDIF 10 RETURN ENDIF END C END CHECKHSSPCUT99 C...................................................................... C...................................................................... C SUB CHECKPOSITION SUBROUTINE CHECKPOSITION(PDBNO_1,CHAINID_1,PDBNO_2,CHAINID_2, + CBRKID_1,CBRKID_2,NRES_1,NRES_2,LMATCH) C RS 89 C check if pieces from DSSP-alignment match the position in the C Brookhaven coordinate file C if not this routine tries to find the right position C piece attributes INTEGER MXPIECES PARAMETER (MXPIECES= 50) COMMON /CPIECE/IRESPIE(2,2,MXPIECES),NPIECES,NRESPIE(2), + NATMPIE(2) C ALIGNMENT AND SEQUENCES C BRK-NUMBER FROM DSSP INTEGER PDBNO_1(*),PDBNO_2(*) CHARACTER*(*) CHAINID_1(*),CHAINID_2(*) C BRK-NUMBER FROM BRK CHARACTER*(*) CBRKID_1(*),CBRKID_2(*) C INTERNAL C TRUE IF PIECES ARE THE SAME LOGICAL LMATCH CHARACTER*6 CTEST *----------------------------------------------------------------------* LMATCH=.FALSE. C CHECK PIECES DO IPIECE=1,NPIECES C CHECK PIECE FROM TEST SEQUENCE IB=IRESPIE(1,1,IPIECE) IE=IRESPIE(2,1,IPIECE) C put chain identifier of BRK at first position; in DSSP last position WRITE(CTEST,'(A,I4,A)')CHAINID_1(IB),PDBNO_1(IB),' ' IF (CTEST .NE. CBRKID_1(IB)) THEN WRITE(6,*)' CHECKPOSITION: DSSP/BRK pieces are '// + 'different try to find right positions in piece 1' DO IPOS=-NRES_1,NRES_1 IF (IB+IPOS .GT. 0 .AND. IB+IPOS .LT. NRES_1) THEN IF (CTEST .EQ. CBRKID_1(IB+IPOS)) THEN IRESPIE(1,1,IPIECE)=IB+IPOS IRESPIE(2,1,IPIECE)=IE+IPOS LMATCH=.TRUE. WRITE(6,*)' CHECKPOSITION: right position found ' WRITE(6,*)' IPIECE : ',ipiece WRITE(6,*)' DSSP-piece is: ',ib,ie WRITE(6,*)' BRK-piece is: ',ib+ipos,ie+ipos GOTO 100 ENDIF ENDIF ENDDO ELSE LMATCH=.TRUE. ENDIF 100 CONTINUE IF (.NOT. LMATCH) THEN WRITE(6,*)'CHECKPOSITION : NO MATCH, 3D COMPARISON SKIPPED' RETURN ENDIF c check piece of comparison sequence LMATCH=.FALSE. IB=IRESPIE(1,2,IPIECE) IE=IRESPIE(2,2,IPIECE) WRITE(CTEST,'(A,I4,A)')CHAINID_2(IB),PDBNO_2(IB),' ' IF (CTEST .NE. CBRKID_2(IB)) THEN WRITE(6,*)' CHECKPOSITION: DSSP/BRK pieces are different'// + ' try to find right positions in piece 2' DO IPOS=-NRES_2,NRES_2 IF (IB+IPOS .GT. 0 .AND. IB+IPOS .LT. NRES_2) THEN WRITE(6,*)':',CTEST,':',CBRKID_2(IB+IPOS),':' IF (CTEST .EQ. CBRKID_2(IB+IPOS) ) THEN IRESPIE(1,2,IPIECE)=IB+IPOS IRESPIE(2,2,IPIECE)=IE+IPOS LMATCH=.TRUE. WRITE(6,*)' CHECKPOSITION: right position found ' WRITE(6,*)' IPIECE : ',ipiece WRITE(6,*)' DSSP-piece is: ',ib,ie WRITE(6,*)' BRK-piece is: ',ib+ipos,ie+ipos GOTO 200 ENDIF ENDIF ENDDO ELSE LMATCH=.TRUE. ENDIF IF (.NOT. LMATCH) THEN WRITE(6,*)'CHECKPOSITION : NO MATCH, 3D COMPARISON SKIPPED' RETURN ENDIF 200 CONTINUE ENDDO RETURN END C END CHECKPOSITION C...................................................................... C...................................................................... C SUB CHECKRANGE SUBROUTINE CHECKRANGE(N,NLOWER,NUPPER,VARIABLE,ROUTINE) CHARACTER*(*) ROUTINE, VARIABLE IF (N .LT. NLOWER .OR. N .GT. NUPPER ) THEN WRITE(6,*)'*** fatal error in ',routine WRITE(6,*) ' integer ',variable,' out of range ' WRITE(6,*) ' legal limits are: ',nlower, nupper WRITE(6,*) ' current value is: ',n STOP 'IN CHECKRANGE' ENDIF RETURN END C END CHECKRANGE C...................................................................... C...................................................................... C SUB CHECKINEQUALITY SUBROUTINE CHECKINEQUALITY(N,M,VARIABLE,ROUTINE) CHARACTER*(*) ROUTINE, VARIABLE INTEGER N,M IF (N .EQ. M) THEN WRITE(6,*)'*** fatal error in ',routine WRITE(6,*)variable,' are equal but should be uneq' WRITE(6,*) ' current value is: ',n,m STOP 'IN CHECKINEQUALITY' ENDIF RETURN END C END CHECKINEQUALITY C...................................................................... C...................................................................... C SUB CHECKREALEQUALITY SUBROUTINE CHECKREALEQUALITY(X1,X2,EPSILON,VARIABLE,ROUTINE) CHARACTER*(*) ROUTINE, VARIABLE REAL X1,X2,EPSILON IF (EPSILON .LT. 0.0) THEN WRITE(6,*)' *** negative epsilon in checkrealequality' ENDIF IF (ABS(X1-X2) .GT. EPSILON) THEN WRITE(6,*)'*** fatal error in ',routine WRITE(6,*)' real nums ',variable,' are not eq within',epsilon WRITE(6,*)' values are: ',x1,x2 STOP 'IN CHECKREALEQUALITY' ENDIF RETURN END C END CHECKREALEQUALITY C...................................................................... C...................................................................... C SUB CHECKSEQ SUBROUTINE CHECKSEQ(STRAND,BEGIN,END,CHECK) IMPLICIT NONE C sub version of gcg function CheckSeq 18 C Changes: C - return value now additional parameter "check" C - additional parameters "begin","end" : Strand is now read C from begin to end, no longer from 1 to first occurence of char(0) C IMPORT CHARACTER*(*) STRAND C UG INTEGER BEGIN, END C INTERNAL INTEGER CHECKTMP, COUNT, I INTEGER TABLE(0:255) CHARACTER C C EXPORT INTEGER CHECK DO I = 0, 255 C = CHAR(I) CALL LOWTOUP(C,1) TABLE(I) = ICHAR(C) END DO CHECKTMP = 0 COUNT = 0 DO I = BEGIN, END COUNT = COUNT + 1 CHECKTMP = CHECKTMP + COUNT * TABLE(ICHAR(STRAND(I:I))) IF ( COUNT.EQ.57 ) COUNT = 0 END DO CHECK = MOD(CHECKTMP, 10000) RETURN END C END CHECKSEQ C...................................................................... C...................................................................... C SUB COMPALISTRUC C COMPARE-PROTEIN-STRUCTURES. C C.SANDER MAY 1983, as CELLO subroutine July 1985. C calcs best overlap of two protein pieces CP pass storage for spliced molecule as argument RRES1SPL RATM1SPL etc CP then remove parameter here - should only exist in GRAFIX-MOLEC:COMM c subroutine compalistruc() SUBROUTINE COMPALISTRUC(FILCOO1,FILCOO2,NRES_1,NRES_2,NATM1, + NATM2,IPATM1RES,IPATM2RES,RRES1, + RRES2,RATM1,RATM2,WSUP1,WSUP2,LCALPHA, + RMS) IMPLICIT NONE INTEGER MXRESMOL,MXATMMOL PARAMETER (MXRESMOL= 600) PARAMETER (MXATMMOL=10*MXRESMOL) c molecule attributes CHARACTER*(*) FILCOO1, FILCOO2 INTEGER NRES_1,NRES_2,NATM1,NATM2 C+++++variables shared with GETCOOR/S3TOS1 - from GET-PROTEIN-LIB C points to first, last and CEN atom INTEGER IPATM1RES(3,*), IPATM2RES(3,*) C center residue coors REAL RRES1(3,*),RRES2(3,*) C atom coors REAL RATM1(3,*), RATM2(3,*) C superposition weights. REAL WSUP1(*), WSUP2(*) LOGICAL LCALPHA C compare 3-d structure piece by piece LOGICAL LPIEBYPIE C result variables C BEST TRANSROT FROM SUPERPOSE REAL TRANS(3), ROT(3,3), RMS C piece attributes INTEGER MXPIECES PARAMETER (MXPIECES= 50) INTEGER IPRESPIE,NPIECES,NRESPIE,NATMPIE COMMON /CPIECE/IPRESPIE(2,2,MXPIECES),NPIECES,NRESPIE(2), + NATMPIE(2) C local atom storage for spliced coordinates REAL RRES1SPL(3,MXRESMOL), RRES2SPL(3,MXRESMOL) REAL RATM1SPL(3,MXATMMOL), RATM2SPL(3,MXATMMOL) C internal INTEGER I,K,IATM,IPIECE,LMOL,NRES,IRESPIE,IATMPIE, + IRES,IRES1,IRES2,IPIE1,IER REAL TOTALLEN,XRMSTOTAL,XRMS C [mol1 mol1] C [mol2 mol2] C C pointers: relative to beginning of each molecule C C molecule 1,NRESMOL residues C 1,NATMMOL atoms C C piece IPRESPIE(2,2,MXPIECES) C (2,2,MXPIECES)=(beg-end,mol1-mol2,IPIECE) C NATMPIE(2) C NRESPIE(2) (2)=(mol1-mol2) C C----------------------------------------------------------------------- WRITE(6,*)' enter COMPARE-STRUCS for molecules: ' WRITE(6,'(a,a,i6,a,i6)')FILCOO1(1:40), + ' NRES=',NRES_1,' NATM= ',NATM1 WRITE(6,'(a,a,i6,a,i6)')FILCOO2(1:40), + ' NRES=',NRES_2,' NATM= ',NATM2 C Set defaults LPIEBYPIE=.FALSE. DO I=1,NATM1 WSUP1(I)=1.0 ENDDO DO I=1,NATM2 WSUP2(I)=1.0 ENDDO GOTO 200 C COMPARE STRUCS 200 CONTINUE C get compare limits WRITE(6,*) WRITE(6,*)' ---------------------------------' WRITE(6,*)' mol A is: ',FILCOO1(1:50) WRITE(6,*)' mol B is: ',FILCOO2(1:50) C reset upper limit if needed DO IPIECE=1,NPIECES DO LMOL=1,2 IF (LMOL.EQ.1) THEN NRES=NRES_1 ENDIF IF (LMOL.EQ.2) THEN NRES=NRES_2 ENDIF IF (IPRESPIE(1,LMOL,IPIECE) .LT. 1) THEN IPRESPIE(1,LMOL,IPIECE)=1 ENDIF IF (IPRESPIE(2,LMOL,IPIECE) .GT. NRES) THEN IPRESPIE(2,LMOL,IPIECE)=NRES ENDIF ENDDO ENDDO C=============================================================== C GET RMS FOR EACH PIECE AND ADD RMSS IF (LPIEBYPIE) THEN WRITE(6,*)' compare structure piece by piece ' RMS=0.0 TOTALLEN=0.0 XRMSTOTAL=0.0 DO IPIECE=1,NPIECES XRMS=0.0 DO LMOL=1,2 IRESPIE=0 IATMPIE=0 IRES1=IPRESPIE(1,LMOL,IPIECE) IRES2=IPRESPIE(2,LMOL,IPIECE) DO IRES=IRES1,IRES2 IRESPIE=IRESPIE+1 IF (LMOL.EQ.1) THEN DO K=1,3 RRES1SPL(K,IRESPIE)=RRES1(K,IRES) ENDDO C first atom of residue to last atom of residue IRES DO IATM=IPATM1RES(1,IRES),IPATM1RES(2,IRES) IATMPIE=IATMPIE+1 IF (IATMPIE .GT. MXATMMOL) THEN WRITE(6,*)' MXATMMOL overflow ' STOP ENDIF DO K=1,3 RATM1SPL(K,IATMPIE)=RATM1(K,IATM) ENDDO ENDDO ENDIF IF (LMOL.EQ.2) THEN DO K=1,3 RRES2SPL(K,IRESPIE)=RRES2(K,IRES) ENDDO C first atom of residue to last atom of residue IRES DO IATM=IPATM2RES(1,IRES),IPATM2RES(2,IRES) IATMPIE=IATMPIE+1 IF (IATMPIE .GT. MXATMMOL) THEN WRITE(6,*)' MXATMMOL overflow ' STOP ENDIF DO K=1,3 RATM2SPL(K,IATMPIE)=RATM2(K,IATM) ENDDO ENDDO ENDIF C FOR IRES=IRES1,IRES2 ENDDO NRESPIE(LMOL)=IRESPIE NATMPIE(LMOL)=IATMPIE C FOR LMOL=1,2 ENDDO WRITE(6,*) ' IPIECE : ',IPIECE WRITE(6,*)' MOL1: from ',IPRESPIE(1,1,IPIECE),' to ', + IPRESPIE(2,1,IPIECE) WRITE(6,*)' MOL2: from ',IPRESPIE(1,2,IPIECE),' to ', + IPRESPIE(2,2,IPIECE) C superpose using U3B of Wolfgang Kabsch IPIE1=1 C first atom and number of residues of piece 1 and 2 WRITE(6,*)' # of residues ' WRITE(6,'(2I10)') ( NRESPIE(K),K=1,2 ) WRITE(6,*)'----------------------------' WRITE(6,*)' CALL U3B' CALL U3B(WSUP2,RRES1SPL(1,1),RRES2SPL(1,1),NRESPIE(IPIE1), + 0,XRMS,ROT,TRANS,IER) cx XN=FLOAT(NRESPIE(IPIE1)) CX XRMS=SQRT(XRMS/XN) IS NOW IN U3B WRITE(6,'('' RMS '',F18.7)') XRMS TOTALLEN=TOTALLEN+NRESPIE(IPIE1) XRMSTOTAL=XRMSTOTAL+NRESPIE(IPIE1)*XRMS C FOR IPIECE=1,NPIECES ENDDO RMS=XRMSTOTAL/TOTALLEN WRITE(6,*)' TOTAL RMS ',RMS C C end block: splice-coors (piece by piece) C================================================================== ELSE WRITE(6,*)' compare structures: splice-coors' C...block: splice-coors DO LMOL=1,2 IRESPIE=0 IATMPIE=0 DO IPIECE=1,NPIECES IRES1=IPRESPIE(1,LMOL,IPIECE) IRES2=IPRESPIE(2,LMOL,IPIECE) DO IRES=IRES1,IRES2 IRESPIE=IRESPIE+1 IF (LMOL.EQ.1) THEN DO K=1,3 RRES1SPL(K,IRESPIE)=RRES1(K,IRES) ENDDO C....first atom of residue to last atom of residue IRES DO IATM=IPATM1RES(1,IRES),IPATM1RES(2,IRES) IATMPIE=IATMPIE+1 IF (IATMPIE .GT. MXATMMOL) THEN WRITE(6,*)' MXATMMOL overflow ' STOP ENDIF DO K=1,3 RATM1SPL(K,IATMPIE)=RATM1(K,IATM) ENDDO ENDDO ENDIF IF (LMOL.EQ.2) THEN DO K=1,3 RRES2SPL(K,IRESPIE)=RRES2(K,IRES) ENDDO C.... first atom of residue to last atom of residue IRES DO IATM=IPATM2RES(1,IRES),IPATM2RES(2,IRES) IATMPIE=IATMPIE+1 IF (IATMPIE .GT. MXATMMOL) THEN WRITE(6,*)' MXATMMOL overflow ' STOP ENDIF DO K=1,3 RATM2SPL(K,IATMPIE)=RATM2(K,IATM) ENDDO ENDDO ENDIF C FOR IRES=IRES1,IRES2 ENDDO C FOR IPIECE=1,NPIECES ENDDO NRESPIE(LMOL)=IRESPIE NATMPIE(LMOL)=IATMPIE C FOR LMOL=1,2 ENDDO C C end block: splice-coors C CALL REPORTPIECES RMS=0.0 C superpose using U3B of Wolfgang Kabsch IPIE1=1 C first atom and number of residues of piece 1 and 2 IF (LCALPHA) THEN WRITE(6,*)' # of residues ' WRITE(6,'(2I10)') ( NRESPIE(K),K=1,2 ) WRITE(6,*)'----------------------------' WRITE(6,*)' CALL U3B' CALL U3B(WSUP2,RRES1SPL(1,1),RRES2SPL(1,1),NRESPIE(IPIE1), + 0,RMS,ROT,TRANS,IER) ELSE WRITE(6,*)' # of atoms ' WRITE(6,'(2I10)') ( NATMPIE(K),K=1,2 ) WRITE(6,*)'----------------------------' CALL U3B(WSUP2,RATM1SPL(1,1),RATM2SPL(1,1),NATMPIE(IPIE1), + 0,RMS,ROT,TRANS,IER) ENDIF WRITE(6,'('' RMS '',F18.7)') RMS C LPIEBYPIE ENDIF WRITE(6,*) RETURN END C END COMPALISTRUC C...................................................................... C...................................................................... C SUB CONCAT_STRINGS SUBROUTINE CONCAT_STRINGS(STRING1,STRING2,RESULT) C concatenate "string1" and "string2" into "result" CHARACTER*(*) STRING1,STRING2,RESULT INTEGER IBEG,IEND,JBEG,JEND,ILEN RESULT=' ' CALL STRPOS(STRING1,IBEG,IEND) CALL STRPOS(STRING2,JBEG,JEND) ILEN= (IEND-IBEG+1) + (JEND-JBEG+1) IF (ILEN .GT. LEN(RESULT) ) THEN ILEN=LEN(RESULT) WRITE(6,*)' WARNING: in concat_strings: length overflow' WRITE(6,*)' cut string at: ',ilen ENDIF RESULT(1:ILEN)=STRING1(IBEG:IEND)//STRING2(JBEG:JEND) RETURN END C END CONCAT_STRINGS C...................................................................... C...................................................................... C SUB CONCAT_3STRINGS SUBROUTINE CONCAT_3STRINGS(STRING1,STRING2,STRING3,RESULT) C concatenate "string1" and "string2" and "string3" into "result" CHARACTER*(*) STRING1,STRING2,STRING3,RESULT INTEGER IBEG,IEND,JBEG,JEND,KBEG,KEND,ILEN RESULT=' ' CALL STRPOS(STRING1,IBEG,IEND) CALL STRPOS(STRING2,JBEG,JEND) CALL STRPOS(STRING3,KBEG,KEND) ILEN= (IEND-IBEG+1) + (JEND-JBEG+1) + (KEND-KBEG+1) IF (ILEN .GT. LEN(RESULT) ) THEN ILEN=LEN(RESULT) WRITE(6,*)' WARNING: IN CONCAT_STRINGS: LENGTH OVERFLOW' WRITE(6,*)' cut string at: ',ilen ENDIF RESULT(1:ILEN)=STRING1(IBEG:IEND)//STRING2(JBEG:JEND)// + STRING3(KBEG:KEND) RETURN END C END CONCAT_3STRINGS C...................................................................... C...................................................................... C SUB CONCAT_INT_STRING SUBROUTINE CONCAT_INT_STRING(INUMBER,STRING,RESULT) C concatenate "inumber" and "string2" into "result" C import/export CHARACTER*(*) STRING,RESULT INTEGER INUMBER C internal CHARACTER TEMP*64,CFORMAT*100 INTEGER IBEG,IEND,JBEG,JEND,ILEN,ILOG C init TEMP=' ' RESULT=' ' ILOG=1 C get size of number C CAUTION can produce wrong results with very high opt-levels c xnumber=float( inumber ) c if (xnumber .gt. 0.0) then c ilog = nint( log10(xnumber) + 0.5 ) c else if (xnumber .lt. 0.0) then c ilog = nint( log10( abs(xnumber) ) + 1.5 ) c endif IF (INUMBER .GT. 0) THEN IF (INUMBER .LT. 10) THEN ILOG=1 ELSE IF (INUMBER .LT. 100) THEN ILOG=2 ELSE IF (INUMBER .LT. 1000) THEN ILOG=3 ELSE IF (INUMBER .LT. 10000) THEN ILOG=4 ELSE IF (INUMBER .LT. 100000) THEN ILOG=5 ELSE IF (INUMBER .LT. 1000000) THEN ILOG=6 ELSE IF (INUMBER .LT. 10000000) THEN ILOG=7 C too big for INT4 ? c else if (inumber .lt. 100000000) then c ilog=8 ELSE WRITE(6,*)' ERROR in CONCAT_INT_STRING: update plus' CALL FLUSH_UNIT(6) ENDIF ELSE IF (INUMBER .LT. 0) THEN IF (INUMBER .GT. -10) THEN ILOG=2 ELSE IF (INUMBER .GT. -100) THEN ILOG=3 ELSE IF (INUMBER .GT. -1000) THEN ILOG=4 ELSE IF (INUMBER .GT. -10000) THEN ILOG=5 ELSE IF (INUMBER .GT. -100000) THEN ILOG=6 ELSE IF (INUMBER .GT. -1000000) THEN ILOG=7 c else if (inumber .gt. -10000000) then c ilog=8 c else if (inumber .gt. -100000000) then c ilog=9 ELSE WRITE(6,*)' ERROR in CONCAT_INT_STRING: update minus' CALL FLUSH_UNIT(6) ENDIF ENDIF CALL CONCAT_STRING_INT('(I',ILOG,TEMP) CALL CONCAT_STRINGS(TEMP,')',CFORMAT) TEMP=' ' WRITE(TEMP(1:),CFORMAT)INUMBER CALL STRPOS(TEMP,IBEG,IEND) CALL STRPOS(STRING,JBEG,JEND) IEND=IBEG+ILOG-1 ILEN= (IEND-IBEG+1) + (JEND-JBEG+1) IF (ILEN .GT. LEN(RESULT) ) THEN ILEN=LEN(RESULT) WRITE(6,*)' WARNING: in concat_int_string: length overflow' WRITE(6,*)' cut string at: ',ilen ENDIF RESULT(1:ILEN)=TEMP(IBEG:IEND)//STRING(JBEG:JEND) RETURN END C END CONCAT_INT_STRING C...................................................................... C...................................................................... C SUB CONCAT_STRING_INT SUBROUTINE CONCAT_STRING_INT(STRING,INUMBER,RESULT) C concatenate "inumber" and "string2" into "result" C import/export CHARACTER*(*) STRING,RESULT INTEGER INUMBER C internal CHARACTER TEMP*64,CFORMAT*100 INTEGER IBEG,IEND,JBEG,JEND,ILEN,ILOG C init TEMP=' ' RESULT=' ' ILOG=1 C get size of number c with some agressive optimizations, this can go wrong c xnumber=float( inumber ) c if (xnumber .gt. 0.0) then c ilog = nint( log10(xnumber) ) + 1 c else if (xnumber .lt. 0.0) then c ilog = nint( log10( abs(xnumber) ) ) + 2 c endif IF (INUMBER .GT. 0) THEN IF (INUMBER .LT. 10) THEN ILOG=1 ELSE IF (INUMBER .LT. 100) THEN ILOG=2 ELSE IF (INUMBER .LT. 1000) THEN ILOG=3 ELSE IF (INUMBER .LT. 10000) THEN ILOG=4 ELSE IF (INUMBER .LT. 100000) THEN ILOG=5 ELSE IF (INUMBER .LT. 1000000) THEN ILOG=6 ELSE IF (INUMBER .LT. 10000000) THEN ILOG=7 C too big for INT4 ? c else if (inumber .lt. 100000000) then c ilog=8 c else if (inumber .lt. 1000000000) then c ilog=9 c else if (inumber .lt. 10000000000) then c ilog=10 ELSE WRITE(6,*)' ERROR in CONCAT_STRING_INT: update plus' CALL FLUSH_UNIT(6) ENDIF ELSE IF (INUMBER .LT. 0) THEN IF (INUMBER .GT. -10) THEN ILOG=2 ELSE IF (INUMBER .GT. -100) THEN ILOG=3 ELSE IF (INUMBER .GT. -1000) THEN ILOG=4 ELSE IF (INUMBER .GT. -10000) THEN ILOG=5 ELSE IF (INUMBER .GT. -100000) THEN ILOG=6 ELSE IF (INUMBER .GT. -1000000) THEN ILOG=7 c else if (inumber .gt. -10000000) then c ilog=8 c else if (inumber .gt. -100000000) then c ilog=9 c else if (inumber .gt. -1000000000) then c ilog=10 ELSE WRITE(6,*)' ERROR in CONCAT_STRING_INT: update minus' CALL FLUSH_UNIT(6) ENDIF ENDIF CALL MAKE_FORMAT_INT(ILOG,CFORMAT) WRITE(TEMP(1:),CFORMAT)INUMBER CALL STRPOS(TEMP,IBEG,IEND) CALL STRPOS(STRING,JBEG,JEND) IEND=IBEG+ILOG-1 ILEN= (IEND-IBEG+1) + (JEND-JBEG+1) IF (ILEN .GT. LEN(RESULT) ) THEN ILEN=LEN(RESULT) WRITE(6,*)' WARNING: in concat_int_string: length overflow' WRITE(6,*)' cut string at: ',ilen ENDIF RESULT(1:ILEN)=STRING(JBEG:JEND)//TEMP(IBEG:IEND) RETURN END C END CONCAT_STRING_INT C...................................................................... C...................................................................... C SUB DAMP_GAPWEIGHT SUBROUTINE DAMP_GAPWEIGHT(IBEG,IEND,VALUE,NDAMP,PUNISH) C damp the gap-open weights by taking the mean of the range +- ndamp C CAUTION set "punish" high enough C NOT true anymore: if indels in sec-struc are not allowed these C positions are not taken into account (punish) IMPLICIT NONE INCLUDE 'maxhom.param' REAL PUNISH C INPUT REAL VALUE(*) INTEGER IBEG,IEND,NDAMP,NPOS C INTERNAL INTEGER I,J REAL SUM DO I=IBEG,IEND SUM=0.0 NPOS=0 DO J=MAX(I-NDAMP,IBEG),MIN(I+NDAMP,IEND) SUM=SUM + VALUE(J) NPOS=NPOS+1 ENDDO VALUE(I)= SUM / FLOAT(NPOS) ENDDO RETURN END C END DAMP_GAPWEIGHT C...................................................................... C...................................................................... C SUB DO_ALIGN SUBROUTINE DO_ALIGN(LH1,LH2,ISET,IALIGN,NRECORD,SDEV) IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C C import implicit C LPASS2=(from maxhom) true if protein IALIGN to take for 2nd pass C C import C ISET= (from maxhom) number of processor (=0 if not parallel) C IALIGN=(from maxhom) number of proteins aligned before, i.e. C current protein is (IALIGN+1)! C INTEGER ISET,IALIGN,NRECORD REAL SDEV C internal REAL LH1(0:MAXMAT) INTEGER*2 LH2(0:MAXTRACE) C REAL LH(0:MAXMAT*2) LOGICAL LERROR INTEGER I,IBEG,IEND,ND1,ND2,NDMAT,N2,N2NEW,N2REST INTEGER NTEST,BESTIIPOS,BESTJJPOS,NREGION,IBREAK,JBREAK INTEGER IPOSBEG,IPOSEND,JPOSBEG,JPOSEND REAL BESTVAL CHARACTER CSYMBOL LOGICAL LDBG_LOCAL C---- ------------------------------------------------------------------ C INIT C---- ------------------------------------------------------------------ LTRACEOUT= .FALSE. C BR 99.09: just to write out dbg LDBG_LOCAL=.FALSE. C LDBG_LOCAL=.TRUE. IF (LDSSP_2) THEN CALL LOWER_TO_CYS(CSQ_2,N2IN) ENDIF CALL SEQ_TO_INTEGER(CSQ_2,LSQ_2,N2IN,TRANSPOS) C get position of chain breaks CALL GETCHAINBREAKS(N2IN,LSQ_2,STRUC_2,TRANS,NBREAK_2,IBREAKPOS_2) IF (LDSSP_2) THEN CALL STR_TO_INT(N2IN,STRUC_2,LSTRUC_2,STRTRANS ) CALL STR_TO_CLASS(MAXSTRSTATES,STR_CLASSES,N2IN,STRUC_2, + STRCLASS_2,LSTRCLASS_2) CALL ACC_TO_INT(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_2,NIOSTATES_2,IORANGE,N2IN, + LSQ_2,LSTRCLASS_2,NSURF_2,LACC_2) C not DSSP ELSE I=INDEX(STRTRANS,'U') CALL INIT_INT_ARRAY(1,N2IN,LSTRUC_2,I) DO I=1,MAXSTRSTATES IF ( INDEX(STR_CLASSES(I),STRUC_2(1)) .NE. 0) THEN CALL INIT_INT_ARRAY(1,N2IN,LSTRCLASS_2,I) CSYMBOL=STR_CLASSES(I)(1:1) ENDIF ENDDO DO I=1,N2IN STRCLASS_2(I:I)=CSYMBOL ENDDO CALL INIT_INT_ARRAY(1,N2IN,LACC_2,1) ENDIF C set gap-open to a high value in SECONDARY STRUCTURE SEGMENTS IF (.NOT. LINSERT_2 .AND. LDSSP_2) THEN CALL PUNISH_GAP(N2IN,STRUC_2,'HE',PUNISH,GAPOPEN_2 ) ENDIF IEND= 0 LSHIFTED=.FALSE. N2= N2IN N2REST= N2IN NSHIFTED=0 C ATTEMPT TO USE N2 FOR ALIGNMENT C RESET N2 TO A VALUE SMALLER THAN N2REST IF NEEDED C SET ND1 AND ND2, THE MATRIX DIMENSION TO BE USED CAUTION LH(O:ND1,0:ND2) 350 ND1= N1+1 ND2= N2+1 NDMAT= (1+ND1)*(1+ND2) LSHIFTED=(NDMAT.GT.MAXTRACE) IF (LSHIFTED) THEN ND2= (INT(MAXTRACE/(ND1+1)) )-1 N2=ND2-1 CALL OPEN_FILE(KWARN,WARNFILE,'UNKNOWN,APPEND',LERROR) CALL STRPOS(NAME_2,IBEG,IEND) WRITE(LOGSTRING(1:),'(A,I10,I10,I10,A,I8,A,A)') + ' *** WARN: MAXTRACE or MAXMAT OVERFLOW: ', + MAXTRACE,MAXMAT,NDMAT, + ' TRUNCATED TO:',N2,' FOR: ',name_2(ibeg:iend) CALL LOG_FILE(KLOG,LOGSTRING,1) CALL LOG_FILE(KWARN,LOGSTRING,0) NDMAT=(1+ND1)*(1+ND2) CALL CLOSE_FILE(KWARN,WARNFILE) ENDIF C======================================================================= C TRACE-FILE C HEADERS TO PLOT FILE..(after the second run ) C======================================================================= LTRACEOUT=.FALSE. IF (LTRACE .AND. .NOT. LPASS2) THEN LTRACEOUT=.TRUE. ENDIF C======================================================================= C THE MEAT C======================================================================= NTEST=0 BESTVAL=1000000.0 C======================================================================= C the NBEST alignments are selected via TRACE C======================================================================= NREGION=(NBREAK_1+1) * (NBREAK_2+1) DO WHILE (NTEST .LT. NREGION*NBEST .AND. BESTVAL.GT.0.0) DO IBREAK=1,NBREAK_1+1 IF (IBREAK .GT. NBREAK_1) THEN IPOSEND=N1 ELSE IPOSEND=IBREAKPOS_1(IBREAK)-1 ENDIF IF (IBREAK .EQ. 1) THEN IPOSBEG=1 ELSE IPOSBEG=IBREAKPOS_1(IBREAK-1)+1 ENDIF DO JBREAK=1,NBREAK_2+1 IF (JBREAK .GT. NBREAK_2) THEN JPOSEND=N2 ELSE JPOSEND=IBREAKPOS_2(JBREAK)-1 ENDIF IF (JBREAK .EQ. 1) THEN JPOSBEG=1 ELSE JPOSBEG=IBREAKPOS_2(JBREAK-1)+1 ENDIF C check if the 2 sequences are identical LSAMESEQ=.FALSE. IF (.NOT. LSHOW_SAMESEQ) THEN IF (IPOSEND-IPOSBEG .EQ. JPOSEND-JPOSBEG) THEN LSAMESEQ=.TRUE. I=1 DO WHILE (I .LT. (IPOSEND-IPOSBEG+1) + .AND. LSAMESEQ) IF (CSQ_1(I:I) .NE. CSQ_2(I:I) ) THEN LSAMESEQ=.FALSE. ENDIF I=I+1 ENDDO IF (LSAMESEQ) WRITE(6,*)' identical sequences ' ENDIF else ENDIF c default trace is diagonal IF (LBACKWARD) THEN DO I=0,NDMAT LH2(I)=1 ENDDO c do i=ndmat,ndmat*2 ; lh(i)=20000.0 ; enddo cwrong call init_real_array(ndmat,ndmat*2,lh,20000.0) CALL SETMATRIX(IPOSBEG,IPOSEND,JPOSBEG, + JPOSEND,N2,LH1,LH2) CALL GETBEST(IPOSBEG+1,IPOSEND+1,JPOSBEG+1, + JPOSEND+1,1,NTEST,LH1,LH2,ND1,ND2, + BESTVAL,BESTIIPOS,BESTJJPOS) WRITE(6,*)BESTVAL,BESTIIPOS,BESTJJPOS SUBOPT_VAL=BESTVAL-((FILTER_VAL*BESTVAL)/100.0) CALL SETBACK(IPOSBEG,IPOSEND,JPOSBEG, + JPOSEND,N2,LH1,LH2,BESTVAL) ELSE CALL INIT_INT2_ARRAY(0,NDMAT,LH2,1) CALL SETMATRIX_FAST(IPOSBEG,IPOSEND,JPOSBEG, + JPOSEND,N2,LH2,BESTVAL,BESTIIPOS, + BESTJJPOS) ENDIF C NOTE: TRACE will aplpy threshold, and return LCONSIDER=.FALSE. C if below threshold! IF (BESTVAL.GT.0.0) THEN CALL TRACE(ISET,ND1,ND2,LH2,IPOSBEG,JPOSBEG, + BESTVAL,BESTIIPOS,BESTJJPOS,NTEST,SDEV, + IALIGN,NRECORD) ENDIF ENDDO ENDDO ENDDO C======================================================================= IF (.NOT. LPASS2 .AND. LTRACE) THEN LTRACE=.FALSE. LTRACEOUT=.FALSE. CLOSE(KPLOT) ENDIF C======================================================================= C ENTRY FOR SHIFTED REPEAT OF TOO LONG SEQUENCE C N2 was used in previous alignment IF (LSHIFTED) THEN IEND=N2-1 IF (IEND.EQ.0) THEN STOP' MAXMAT, MAXTRACE OR MAXSQ TOO SMALL, IEND=0' ENDIF DO I=1,N2REST-IEND CSQ_2(I:I)=CSQ_2(I+IEND:I+IEND) STRUC_2(I)=STRUC_2(I+IEND) LSQ_2(I)=LSQ_2(I+IEND) NSURF_2(I)=NSURF_2(I+IEND) ENDDO DO I=N2REST-IEND+1,N2REST CSQ_2(I:I)=' ' STRUC_2(I)=' ' LSQ_2(I)=0 NSURF_2(I)=0 ENDDO N2NEW=N2REST-IEND C NEW LENGTH TO USE IS N2NEW N2REST=N2NEW NSHIFTED=NSHIFTED+IEND c WRITE(6,'(a,i6)')'>>REPEAT PASS, TOTAL SHIFT:',nshifted N2=N2REST GOTO 350 ENDIF C======================================================================= C calculate conservation weights C then next sequence in file list or global sort C======================================================================= IF ( LALIOVERFLOW .EQV. .FALSE.) THEN IF (LPASS2 .EQV. .TRUE. .AND. + LCONSERV_1 .EQV. .TRUE. .AND. + LCONSIMPORT .EQV. .FALSE. .AND. + IALIGN .GT. 0) THEN C WRITE(6,*)' CALL GETCONSWEIGHT i=',IALIGN CALL GETCONSWEIGHT(N1,IALIGN,LSQ_1) ENDIF IALIGNOLD=IALIGN ENDIF C======================================================================= C debug C======================================================================= C IF (LDBG_LOCAL) THEN C DO I=1,N1 C WRITE(6,'(I,F7.2)')I,SMIN C ENDDO C DO I=1,N1 C WRITE(6,'(I,F7.2)')I,SMAX C ENDDO C DO I=1,N1 C WRITE(6,'(I,F7.2)')I,OPEN_1 C ENDDO C DO I=1,N1 C WRITE(6,'(I,F7.2)')I,SMIN*CONSWEIGHT_1(I) C ENDDO C DO I=1,N1 C WRITE(6,'(I,F7.2)')I,SMAX*CONSWEIGHT_1(I) C ENDDO C DO I=1,N1 C WRITE(6,'(I,F7.2)')I,OPEN_GAP_1(I) C ENDDO C END IF C end dbg C======================================================================= RETURN END C END DO_ALIGN C...................................................................... C...................................................................... C SUB EXTRACT_INTEGER SUBROUTINE EXTRACT_INTEGER(LINE,CDIVIDE,KEYWORD,INTVAL) C extract an integer from a line beginning with a keyword ; cdivide C indicates the border between keyword and value for keyword C like: THIS_IS_A_KEYWORD : this_is_the_value_for_keyword IMPLICIT NONE C import CHARACTER*(*) LINE,KEYWORD,CDIVIDE c export INTEGER INTVAL c internal INTEGER LENKEY,I,J,IBEG c====================================================================== CALL STRPOS(KEYWORD,I,J) LENKEY=J-I+1 IF ( LINE(1:LENKEY) .EQ. KEYWORD(I:J) ) THEN CALL STRPOS(LINE,I,J) IBEG=INDEX(LINE,CDIVIDE) IF (IBEG .EQ. 0) THEN WRITE(6,'(A,A,A)') + 'ERROR IN EXTRACT_INTEGER: no ',cdivide,'in line' STOP ENDIF CALL STRPOS(LINE(IBEG+1:J),I,J) CALL READ_INT_FROM_STRING(LINE(IBEG+I:IBEG+J),INTVAL) c WRITE(6,'(A,A,I6)')line(1:lenkey),' is: ',intval ENDIF RETURN END C END EXTRACT_INTEGER C...................................................................... C...................................................................... C SUB EXTRACT_REAL SUBROUTINE EXTRACT_REAL(LINE,CDIVIDE,KEYWORD,REALVAL) C extract an integer from a line beginning with a keyword ; cdivide C indicates the border between keyword and value for keyword C like: THIS_IS_A_KEYWORD : this_is_the_value_for_keyword IMPLICIT NONE C import CHARACTER*(*) LINE,KEYWORD,CDIVIDE c export REAL REALVAL c internal INTEGER LENKEY,I,J,IBEG c====================================================================== CALL STRPOS(KEYWORD,I,J) LENKEY=J-I+1 IF ( LINE(1:LENKEY) .EQ. KEYWORD(I:J) ) THEN CALL STRPOS(LINE,I,J) IBEG=INDEX(LINE,CDIVIDE) IF (IBEG .EQ. 0) THEN WRITE(6,'(A,A,A)') + 'ERROR IN EXTRACT_REAL: no ',cdivide,'in line' STOP ENDIF CALL STRPOS(LINE(IBEG+1:J),I,J) CALL READ_REAL_FROM_STRING(LINE(IBEG+I:IBEG+J),REALVAL) c WRITE(6,'(A,A,F7.2)')line(1:lenkey),' is: ',realval ENDIF RETURN END C END EXTRACT_REAL C...................................................................... C...................................................................... C SUB EXTRACT_INTEGER_RANGE SUBROUTINE EXTRACT_INTEGER_RANGE(LINE,CDIVIDE1,CDIVIDE2,INTVAL) C extract two integers from a line ; C cdivide1 indicates the border between keyword and values for keyword C cdivide2 seperetes the two values C like: THIS_IS_A_KEYWORD : first_value_for_keyword - second_value IMPLICIT NONE C import CHARACTER*(*) LINE,CDIVIDE1,CDIVIDE2 c export INTEGER INTVAL(1,2) c internal INTEGER I,J,IBEG1,IBEG2 c====================================================================== IBEG1=INDEX(LINE,CDIVIDE1) IBEG2=INDEX(LINE,CDIVIDE2) IF (IBEG1.EQ.0 .OR. IBEG2 .EQ. 0) THEN WRITE(6,'(A,A,A,A)') + 'ERROR IN EXTRACT_INTEGER_RANGE: no ',cdivide1,' or ', + cdivide2 STOP ENDIF CALL STRPOS(LINE(IBEG1+1:IBEG2-1),I,J) CALL READ_INT_FROM_STRING(LINE(IBEG1+I:IBEG1+J),INTVAL(1,1) ) CALL STRPOS(LINE(IBEG2+1:),I,J) CALL READ_INT_FROM_STRING(LINE(IBEG2+I:IBEG2+J),INTVAL(1,2) ) RETURN END C END EXTRACT_INTEGER_RANGE C...................................................................... C...................................................................... C SUB EXTRACT_STRING SUBROUTINE EXTRACT_STRING(LINE,CDIVIDE,KEYWORD,STRING) C extract a string from a line beginning with a keyword ; cdivide C indicates the border between keyword and value for keyword C like: THIS_IS_A_KEYWORD : this_is_the_string_for_keyword IMPLICIT NONE C import CHARACTER*(*) LINE,KEYWORD,CDIVIDE C export CHARACTER*(*) STRING C internal INTEGER LENKEY,I,J,IBEG C====================================================================== CALL STRPOS(KEYWORD,I,J) LENKEY=J-I+1 IF ( LINE(1:LENKEY) .EQ. KEYWORD(I:J) ) THEN CALL STRPOS(LINE,I,J) IBEG=INDEX(LINE,CDIVIDE) IF (IBEG.EQ.0) THEN WRITE(6,'(A,A,A)') + 'ERROR IN EXTRACT_STRING: no ',CDIVIDE,'in line' STOP ENDIF IF (J .GT. IBEG+1) THEN CALL STRPOS(LINE(IBEG+1:J),I,J) STRING=LINE(IBEG+I:IBEG+J) ELSE STRING=' ' ENDIF c WRITE(6,*)LINE(1:LENKEY)//' is: '//LINE(IBEG+I:IBEG+J) ENDIF RETURN END C END EXTRACT_STRING C...................................................................... C...................................................................... C SUB EVALPRED SUBROUTINE EVALPRED(PROTEIN,METHOD,PRED,STRUC,NRES,LDSSP, + KOUT,KSTA) C EXTERNAL LOGICAL LDSSP CHARACTER*1 STRUC(*),PRED(*) CHARACTER*(*) METHOD, PROTEIN INTEGER NRES, KOUT, KSTA C files KOUT and KSTA must be open for write C INTERNAL PARAMETER (MSTATES= 3) C *10 ALIASES CHARACTER*10 STATES(MSTATES) C (PREDICTED,OBSERVED) sub=0 means undefined symbol. DIMENSION NC(0:MSTATES,0:MSTATES),NCOBS(0:MSTATES) DIMENSION NCPRE(0:MSTATES),MPERPRE(MSTATES),MPEROBS(MSTATES) CAUTION - ANY CHANGE IN THE ORDER OF C STATES MUST BE MADE IN PRED-STAT AS WELL C SHEET LOOP HELIX DATA STATES/'EBAPMebapm','TCLS tcls ','HGI..hgi..'/ *----------------------------------------------------------------------* C PROCEDURE DO NP=0,MSTATES DO NS=0,MSTATES NC(NP,NS)=0 ENDDO ENDDO NUNPRED=0 NPRED=0 DO I=1,NRES C FIND STRUCTURE INDEX NP=0 NS=0 DO LS=1,MSTATES IF (INDEX(STATES(LS), PRED(I)) .NE. 0) NP=LS IF (INDEX(STATES(LS), STRUC(I)) .NE. 0) NS=LS ENDDO C OBS only via DSSP IF (LDSSP) THEN IF (NS .EQ. 0) THEN WRITE(6,*)'UNKNOWN DSSP STATE AT RES',I, struc(i) c STOP'*** error in EVALPRED ' ENDIF ELSE NS=0 ENDIF C INCREMENT COUNTER NC(NP,NS)=NC(NP,NS)+1 IF (NP .NE. 0) THEN NPRED=NPRED+1 ELSE NUNPRED=NUNPRED+1 ENDIF ENDDO C (I,J) = (PREDICTED,OBSERVED) C SUCCESS RATES: NCII=SUM(OVER I.NE.0) NC(I,I) C NCOBS(J)=SUM(OVER I=1..3) NC(I,J) C of those predicted C NCPRE(I)=SUM(OVER J=0..3) NC(I,J) of all C PREDICTED RES : NPRED=SUM(OVER I=1..3) NCPRE(I) C UNPREDICTED NUNPRED=NCPRE(0) NCII=0 DO I=0,MSTATES NCOBS(I)=0 NCPRE(I)=0 DO J=0,MSTATES IF (I .EQ. J .AND. I .NE. 0) NCII=NCII+NC(I,J) C not the unpredicted IF (J .NE. 0) NCOBS(I)=NCOBS(I)+NC(J,I) C all (not) observed NCPRE(I)=NCPRE(I)+NC(I,J) ENDDO ENDDO IF (NRES.NE.0) THEN PERPRED=NINT(100.*NPRED/FLOAT(NRES)) ELSE PERPRED=0.0 WRITE(6,*)'***EVALPRED: NRES=0' ENDIF C check for consistency IF (NUNPRED .NE. NCPRE(0)) THEN WRITE(6,*) NUNPRED,NCPRE(0) STOP '*** EVALPRED: NUNPRED.NE.NCPRE(0), you idiot ' ENDIF IF (NPRED.NE.NRES-NUNPRED) THEN WRITE(6,*) NPRED, NUNPRED, NRES WRITE(6,*) + '*** EVALPRED ERROR: NPRED,NUNPRED,NRES do not add up' ENDIF C print IF (LDSSP) THEN IF (NPRED.NE.0) THEN CORRECT=NCII/FLOAT(NPRED)*100 ELSE CORRECT=0.0 WRITE(6,*)'***EVALPRED: NPRED=0' ENDIF IF (KOUT.NE.0) THEN WRITE(KOUT,110) PROTEIN,METHOD,NRES,NPRED,PERPRED,CORRECT ENDIF WRITE( *,110) PROTEIN,METHOD,NRES,NPRED,PERPRED,CORRECT ELSE CORRECT=0.0 IF (KOUT.NE.0) THEN WRITE(KOUT,110) PROTEIN,METHOD,NRES,NPRED,PERPRED ENDIF WRITE( *,110) PROTEIN,METHOD,NRES,NPRED,PERPRED 110 FORMAT(1X,A4,1X,A10,I5,' residues',I5,' predicted.',/, + ' Result: ',F5.1,'% predicted',F7.1,'% correct') C LDSSP ENDIF C percentage in the universe of predicted (NPRED.LE.NRES) DO I=1,MSTATES IF (NPRED.NE.0) THEN MPERPRE(I)=NINT(NCPRE(I)/FLOAT(NPRED)*100.0) ELSE MPERPRE(I)=0 ENDIF IF (NPRED.NE.0) THEN MPEROBS(I)=NINT(NCOBS(I)/FLOAT(NPRED)*100.0) ELSE MPEROBS(I)=0 ENDIF ENDDO C IF (KOUT.NE.0) THEN WRITE(KOUT,113) 113 FORMAT(40X,'P R E D I C T E D ') WRITE(KOUT,114) (STATES(J),J=1,MSTATES),'total',' %' 114 FORMAT(40X,10(1X,A5)) IF (LDSSP) THEN DO J=1,MSTATES WRITE(KOUT,112)'OBS',STATES(J),(NC(I,J),I=1,MSTATES), + NCOBS(J),MPEROBS(J) ENDDO ENDIF C DSSP or no DSSP: WRITE(KOUT,112)' ',' ',(NCPRE(I), I=1,MSTATES) 112 FORMAT(1X,30X,A3,1X,A5,10I6) WRITE(KOUT,112)' ','!',(MPERPRE(I), I=1,MSTATES) ENDIF C output for prediction statistics WRITE(KSTA,111) PROTEIN,METHOD,NRES,NPRED,CORRECT, + ((NC(I,J),I=1,MSTATES),J=1,MSTATES) 111 FORMAT(A4,1X,A10,2I5,F5.1,'%',20I5) RETURN END C END EVALPRED C...................................................................... C================================================================ c$$$ subroutine fetch_sw_seq(path,indexfile,datafile,kindex,kdat, c$$$ + MAXSQ,nres,name,compnd,ACCESSION,pdbref, c$$$ + seq,lend) c$$$ c$$$ implicit none c$$$C import c$$$ integer MAXSQ,kindex,kdat c$$$ character*(*) path,indexfile,datafile c$$$C export c$$$ integer nres c$$$ character*(*) name,compnd,ACCESSION,pdbref,seq c$$$ logical lend,lbinary c$$$C internal c$$$ integer maxchar,indexreclen,nsize c$$$ parameter (maxchar=38,indexreclen=40,nsize=12) c$$$ c$$$ integer i,j,ipos,jpos,irec,idatindex,ifile c$$$ logical lfound,lerror c$$$ character*132 templine,filename c$$$ character alphabet*(maxchar) c$$$ character testline*(indexreclen) c$$$ c$$$ alphabet='0199996789ABCDEFGHIJKLMNOPQRSTUVWXYZ_.' c$$$ idatindex=0 c$$$ lbinary=.true. c$$$ c$$$ call concat_strings(path,indexfile,filename) c$$$ call open_file(kindex,filename, c$$$ + 'OLD,DIRECT,FORMATTED,READONLY,RECL=40',lerror) c$$$ c$$$ call strpos(name,i,j) c$$$ c$$$ lfound=.false. c$$$ ipos=index(alphabet,name(i:i)) c$$$ jpos=index(alphabet,name(i+1:i+1)) c$$$ irec= ( ( (ipos-1) * maxchar) + jpos) + 1 c$$$ read(kindex,'(2x,i8)',rec=irec)irec c$$$ if (irec .eq. 0)goto 900 c$$$ c$$$ do while(.not. lfound) c$$$ read(kindex,'(a)',rec=irec)testline c$$$ if (index (testline,name(i:j)) .ne. 0) then c$$$ read(testline,'(12x,a,i8,i8)') c$$$ + ACCESSION(1:nsize),idatindex,ifile c$$$ lfound=.true. c$$$ endif c$$$ irec=irec+1 c$$$ enddo c$$$ close(kindex) c$$$ if (idatindex .ge. 1) then c$$$ call concat_int_string(ifile,datafile,filename) c$$$ call concat_strings(path,filename,templine) c$$$ call open_file(kdat,templine,'OLD',lerror) c$$$ do i=1,idatindex-1 c$$$ read(kdat,'(a)')testline(1:1) c$$$ enddo c$$$ call get_swiss_entry(MAXSQ,kdat,lbinary,nres,name,compnd, c$$$ + ACCESSION,pdbref,seq,lend) c$$$ close(kdat) c$$$ return c$$$ endif c$$$900 WRITE(6,*)'*** ERROR: index in fetch_sw_seq is 0 ;' c$$$ WRITE(6,*)' or nothing found' c$$$ nres=0 c$$$ name=' ' c$$$ compnd=' ' c$$$ ACCESSION=' ' c$$$ pdbref=' ' c$$$ seq=' ' c$$$ return c$$$ end C====================================================================== C...................................................................... C...................................................................... C SUB FILLSIMMETRIC SUBROUTINE FILLSIMMETRIC(MAXRES,NTRANS,MAXSTRSTATES,maxiostates, + NSTRSTATES_1,NSTRSTATES_2,CSTRSTATES,SIMMETRIC,NRES, + LSEQ,LSTR,LACC,POSSIMMETRIC) IMPLICIT NONE INTEGER NTRANS,MAXRES,NRES INTEGER MAXSTRSTATES,maxiostates INTEGER NSTRSTATES_1,NSTRSTATES_2 CHARACTER*(*) CSTRSTATES REAL SIMMETRIC(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) INTEGER LSEQ(*),LACC(*),LSTR(*) REAL POSSIMMETRIC(MAXRES,*) C internal INTEGER I,J,ISTR C IF (NSTRSTATES_2 .GT. 1) THEN WRITE(6,*)' **** ERROR: nstrstates_2 .gt. 1' WRITE(6,*)' not possible to fill position dependend metric' STOP ENDIF DO I=1,NRES IF (NSTRSTATES_1 .GT.1) THEN ISTR=LSTR(I) IF (ISTR .EQ. 0)ISTR=1 ELSE ISTR=1 ENDIF IF (LSEQ(I) .EQ. 0) THEN DO J=1,NTRANS WRITE(6,*)'fillsimmetric: lseq unknown: ',lseq(i) POSSIMMETRIC(I,J)=0.0 ENDDO ELSE DO J=1,NTRANS c WRITE(6,'(a)')'fill i,j,lseq,lstr,lacc: ' c WRITE(6,'(5(i4))')i,j,lseq(i),istr,lacc(i) POSSIMMETRIC(I,J)=SIMMETRIC(LSEQ(I),J,ISTR,LACC(I),1,1) ENDDO ENDIF ENDDO RETURN END C END FILLSIMMETRIC C...................................................................... C...................................................................... C SUB FINDBRKFILE SUBROUTINE FINDBRKFILE(PDBFILE,PDBPATH,PID,KPDB,KLOG,LERROR) IMPLICIT NONE CHARACTER*(*) PDBFILE,PDBPATH,PID CHARACTER CEXT*30 LOGICAL LERROR INTEGER KPDB,KLOG C internal CHARACTER*200 LOGSTRING LERROR=.FALSE. cext='.brk' c cext='.pdb' IF (PDBPATH.EQ.' ') THEN CALL CONCAT_STRINGS(PID,CEXT,PDBFILE) ELSE CALL CONCAT_STRINGS(PID,CEXT,LOGSTRING) CALL CONCAT_STRINGS(PDBPATH,LOGSTRING,PDBFILE) ENDIF CALL OPEN_FILE(KPDB,PDBFILE,'OLD,READONLY',LERROR) IF (LERROR) THEN CALL CONCAT_STRINGS('PDB-FILE NOT FOUND: ',PDBFILE,LOGSTRING) CALL LOG_FILE(KLOG,LOGSTRING,1) ENDIF CLOSE(KPDB) RETURN END C END FINDBRKFILE C...................................................................... C...................................................................... C SUB GET_DEFAULT SUBROUTINE GET_DEFAULT() C get the system specific location of files C MAXHOM_DEFAULT is a logical name pointing to the maxhom.default file C VMS : assign $1:[schneider.public]maxhom.default C UNIX: setenv maxhom_default /home/schneider/public/maxhom.default C a file "maxhom.default" in the current directory has higher priority C METRIC_PATH : location of exchange metrices C SWISSPROT_SEQ : location of swissprot files C RELEASE_NOTES : release notes of EMBL/SWISSPROT C PDB_PATH : location of Brookhaven files C DSSP_PATH : location of DSSP files C COREPATH : directory path for corefile C COREFILE : where to put the temporary binary file to store the C alignments IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' c internal INTEGER IBEG,IEND LOGICAL LEXIST,LERROR CHARACTER*200 LINE CHARACTER*1 CDIVIDE C INIT LEXIST=.FALSE. METRICPATH=' ' SWISSPROT_SEQ=' ' SW_CURRENT=' ' SPLIT_DB_NAMES=' ' C SW_DATA=' ' ; SW_INDEX=' ' ; SW_PATH=' ' RELNOTES=' ' PDBPATH=' ' DSSP_PATH=' ' COREPATH=' ' COREFILE=' ' FILTER_FASTA_EXE=' ' FASTA_EXE=' ' FILTER_BLASTP_EXE=' ' BLASTP_EXE=' ' CONVERTSEQ_EXE=' ' CDIVIDE=':' C check existence of default file and open IF (MAXHOM_DEFAULT .EQ. ' ') THEN MAXHOM_DEFAULT= 'maxhom.default' ENDIF IF (MAXHOM_DEFAULT .NE. ' ') THEN INQUIRE(FILE=MAXHOM_DEFAULT,EXIST=LEXIST) ENDIF IF (LEXIST) THEN CALL STRPOS(MAXHOM_DEFAULT,IBEG,IEND) WRITE(6,*)' default file is: ',maxhom_default(ibeg:iend) CALL FLUSH_UNIT(6) LINE='OLD,READONLY' CALL OPEN_FILE(KDEF,MAXHOM_DEFAULT,LINE,LERROR) ELSE WRITE(6,*)' ERROR: can not find default file ' WRITE(6,*)' Check environment variable MAXHOM_DEFAULT or ' WRITE(6,*)' specify default file with option -d=filename ' call flush_unit(6) STOP ENDIF C read defaults DO WHILE(.TRUE.) c read(kdef,'(a)',end=999)line READ(KDEF,'(A)',END=999,ERR=999)LINE c WRITE(6,*)line(1:40) IF (LINE(1:2) .EQ. '##') THEN GOTO 999 ENDIF IF (LINE(1:1) .NE. '#' .AND. LINE .NE.' ') THEN CALL EXTRACT_STRING(LINE,CDIVIDE,'MACHINE',CMACHINE) CALL EXTRACT_STRING(LINE,CDIVIDE,'COREPATH',COREPATH) CALL EXTRACT_STRING(LINE,CDIVIDE,'COREFILE',COREFILE) CALL EXTRACT_STRING(LINE,CDIVIDE,'METRIC_PATH',METRICPATH) CALL EXTRACT_STRING(LINE,CDIVIDE,'SWISSPROT_SEQ', + SWISSPROT_SEQ) CALL EXTRACT_STRING(LINE,CDIVIDE,'SWISSPROT_CURRENT', + SW_CURRENT) CALL EXTRACT_STRING(LINE,CDIVIDE,'SPLIT_DB',SPLIT_DB_NAMES) c call extract_string(line,cdivide,'SWISSPROT_INDEX',sw_index) c call extract_string(line,cdivide,'SWISSPROT_PATH',sw_path) c call extract_string(line,cdivide,'SWISSPROT_DATA',sw_data) CALL EXTRACT_STRING(LINE,CDIVIDE,'RELEASE_NOTES', + RELNOTES) CALL EXTRACT_STRING(LINE,CDIVIDE,'PDB_PATH',PDBPATH) CALL EXTRACT_STRING(LINE,CDIVIDE,'DSSP_PATH',DSSP_PATH) CALL EXTRACT_STRING(LINE,CDIVIDE,'FILTER_FASTA_EXE', + FILTER_FASTA_EXE) CALL EXTRACT_STRING(LINE,CDIVIDE,'FASTA_EXE',FASTA_EXE) CALL EXTRACT_STRING(LINE,CDIVIDE,'FILTER_BLASTP_EXE', + FILTER_BLASTP_EXE) CALL EXTRACT_STRING(LINE,CDIVIDE,'BLASTP_EXE',BLASTP_EXE) CALL EXTRACT_STRING(LINE,CDIVIDE,'CONVERTSEQ_EXE', + CONVERTSEQ_EXE) ENDIF ENDDO 999 CLOSE(KDEF) IF (INDEX(CMACHINE,'UNIX').NE.0) THEN CMACHINE='UNIX' ELSE IF (INDEX(CMACHINE,'VMS').NE.0) THEN CMACHINE='VMS' ELSE WRITE(6,*)' *** MACHINE type UNKNOWN (assume UNIX) ***' CMACHINE='UNIX' ENDIF IF (COREFILE .EQ. ' ' ) THEN WRITE(6,*)' ERROR: COREFILE UNDEFINED' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' STOP ELSE IF (COREPATH .EQ. ' ' ) THEN WRITE(6,*)' WARNING: COREPATH UNDEFINED' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (METRICPATH .EQ. ' ') THEN WRITE(6,*)' ERROR: METRIC_PATH undefined' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' STOP ELSE IF (SWISSPROT_SEQ .EQ. ' ') THEN WRITE(6,*)' WARNING: SWISSPROT_SEQ undefined ' WRITE(6,*)' no search against database possible ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (SPLIT_DB_NAMES .EQ. ' ') THEN WRITE(6,*)' WARNING: SPLIT_DB undefined ' WRITE(6,*)' no parallel search against database possible ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' c else if (sw_data .eq. ' ') then c WRITE(6,*)' WARNING: SW_DATA undefined ' c WRITE(6,*)' no search against database possible ' c WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' c else if (sw_index .eq. ' ') then c WRITE(6,*)' WARNING: SW_INDEX undefined ' c WRITE(6,*)' no search against database possible ' c WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' c else if (sw_path .eq. ' ') then c WRITE(6,*)' WARNING: SW_PATH undefined ' c WRITE(6,*)' no search against database possible ' c WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (SW_CURRENT .EQ. ' ') THEN WRITE(6,*)' WARNING: SWISSPROT_CURRENT undefined ' WRITE(6,*)' no search with blastp possible ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (RELNOTES .EQ. ' ') THEN WRITE(6,*)' WARNING: RELEASE_NOTES undefined ' WRITE(6,*)' no information about database ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (PDBPATH .EQ. ' ') THEN WRITE(6,*)' WARNING: PDB_PATH undefined ' WRITE(6,*)' no superposition in 3-D possible ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (DSSP_PATH .EQ. ' ') THEN WRITE(6,*)' WARNING: DSSP_PATH undefined ' WRITE(6,*)' no check of pdb-pointers from SwissProt ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (FILTER_FASTA_EXE .EQ. ' ') THEN WRITE(6,*)' WARNING: FILTER_FASTA_EXE undefined ' WRITE(6,*)' no pre-filtered run against database ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (FASTA_EXE .EQ. ' ') THEN WRITE(6,*)' WARNING: FASTA_EXE undefined ' WRITE(6,*)' no FASTA-pre-filtered run against database ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (FILTER_BLASTP_EXE .EQ. ' ') THEN WRITE(6,*)' WARNING: FILTER_BLASTP_EXE undefined ' WRITE(6,*)' no pre-filtered run against database ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (BLASTP_EXE .EQ. ' ') THEN WRITE(6,*)' WARNING: BLASTP_EXE undefined ' WRITE(6,*)' no BLASTP-pre-filtered run against database ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (CONVERTSEQ_EXE .EQ. ' ') THEN WRITE(6,*)' WARNING: CONVERTSEQ_EXE undefined' WRITE(6,*)' at least no FASTA pre-filter possible ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ENDIF c WRITE(6,*)' get_default end' RETURN END C END GET_DEFAULT C...................................................................... C...................................................................... C SUB GET_FASTA_DB_ENTRY SUBROUTINE GET_FASTA_DB_ENTRY(MAXSQ,KUNIT,NRES,NAME,COMPOUND, + ACCESSION,PDBREF,SEQ,LEND) IMPLICIT NONE C import INTEGER MAXSQ,KUNIT C export CHARACTER*(*) SEQ,NAME,COMPOUND,ACCESSION,PDBREF INTEGER NRES LOGICAL LEND C internal INTEGER I,J,K,LINELEN PARAMETER (LINELEN= 500) CHARACTER LINE*(LINELEN) C====================================================================== LEND=.FALSE. NRES=0 C===================================================================== READ(KUNIT,'(A)',END=900,ERR=999)LINE J=INDEX(LINE,'|') IF (J .GT. 0) THEN K=INDEX(LINE(J+1:),'|') IF (K .GT. 0) THEN I=INDEX(LINE,' ') NAME(1:)=LINE(2:I-1) COMPOUND(1:)=LINE(I+1:) LINE(J:J)=' ' K=INDEX(LINE,'|') ACCESSION(1:)=LINE(J+1:K-1) ELSE J=INDEX(LINE,' ') NAME(1:)=LINE(2:J) COMPOUND(1:)=LINE(2:) ACCESSION(1:)=' ' WRITE(6,*)'WARNING from get_fasta_db: '// + 'entry line looks strange (no |)' WRITE(6,*)LINE(1:60) ENDIF ELSE J=INDEX(LINE,' ') NAME(1:)=LINE(2:J) COMPOUND(1:)=LINE(2:) ACCESSION(1:)=' ' WRITE(6,*)'WARNING from get_fasta_db: '// + 'entry line looks strange (no |)' WRITE(6,*)LINE(1:60) ENDIF SEQ=' ' c sequences starts in next line 100 READ(KUNIT,'(A)',ERR=999,END=900) LINE IF (LINE(1:1) .EQ. '>' .AND. NRES .NE. 0) THEN BACKSPACE(KUNIT) ELSE DO I=1,LINELEN IF ( LINE(I:I) .NE. ' ' ) THEN NRES=NRES+1 IF (NRES .LE. MAXSQ ) THEN SEQ(NRES:NRES)=LINE(I:I) ELSE c truncate if needed WRITE(6,*)' SEQ CUT TO MAXSQ: ',MAXSQ CALL FLUSH_UNIT(6) NRES=MAXSQ 200 READ(KUNIT,'(A)',ERR=999,END=900) LINE IF (LINE(1:1) .EQ. '>' ) THEN BACKSPACE(KUNIT) RETURN ENDIF GOTO 200 ENDIF ENDIF ENDDO GOTO 100 ENDIF C====================================================================== 900 IF (NRES .EQ. 0)LEND=.TRUE. RETURN 999 WRITE(6,*)' ERROR in get_fasta_db_entry ',name,nres c call flush_unit(6) STOP END C END GET_FASTA_DB_ENTRY C...................................................................... C...................................................................... C SUB GET_LDIREC SUBROUTINE GET_LDIREC(ND1,ND2,LH2,II,JJ,LDEL_DIREC) IMPLICIT NONE INTEGER ND1,ND2,II,JJ,LDEL_DIREC INTEGER*2 LH2(0:ND1,0:ND2) c real lh(0:nd1,0:nd2,2) LDEL_DIREC =ABS( LH2(II,JJ) ) c scratch once used trace LH2(II,JJ)=-1 RETURN END C END GET_LDIREC C...................................................................... C...................................................................... C SUB GET_LDIREC_FAST SUBROUTINE GET_LDIREC_FAST(ND1,ND2,LH2,II,JJ,LDEL_DIREC) IMPLICIT NONE INTEGER ND1,ND2,II,JJ,LDEL_DIREC INTEGER*2 LH2(0:ND1,0:ND2) c real lh(0:nd1,0:nd2) LDEL_DIREC =ABS( LH2(II,JJ) ) c scratch once used trace LH2(II,JJ)=-1 RETURN END C END GET_LDIREC_FAST C====================================================================== C NOTE: ONLY TEMPRARY TO REDUCE MEMORY REQUIREMENTS FOR MAXHOM C MIXED ROUTINES FROM: C SYSTEM-LIB C UTILITY-LIB C PROTEIN-LIB C HSSP-LIB C====================================================================== C...................................................................... C SUB GET_SEQ SUBROUTINE GET_SEQ(KIN,FILENAME,TRANS,CHAINS,COMPND,ACCESSION, + PDBREF,PDBNO,NRES,SEQ,STRUC,ACC,TRUNCATED,ERROR) C 13.5.93 IMPLICIT NONE C Import INTEGER KIN CHARACTER*(*) CHAINS CHARACTER*(*) TRANS, FILENAME, COMPND, ACCESSION, PDBREF C Export INTEGER NRES INTEGER PDBNO(*), ACC(*) CHARACTER*(*) SEQ, STRUC LOGICAL TRUNCATED, ERROR C Internal INTEGER I,J, RLEN CHARACTER*20 FORMATNAME LOGICAL LACCZERO C====================================================================== ACCESSION=' ' PDBREF=' ' COMPND=' ' TRUNCATED=.FALSE. CALL CHECKFORMAT(KIN,FILENAME,FORMATNAME,ERROR) IF ( ERROR ) THEN WRITE(6,*)'GET_SEQ: FILE OPEN ERROR, SET NRES=0 AND RETURN' WRITE(6,*)'FILENAME: ', FILENAME RETURN ENDIF CALL STRPOS(FILENAME,I,J) C..initialize NRES = 0 PDBREF = ' ' DO I = 1,LEN(SEQ) SEQ(I:I) = '-' STRUC(I:I) = 'U' ACC(I) = 0 ENDDO INQUIRE(KIN,RECL=RLEN) IF (FORMATNAME .EQ. 'BRK') THEN CALL READ_BRK(KIN,FILENAME,CHAINS,TRANS,RLEN,NRES, 1 COMPND,SEQ,PDBNO,TRUNCATED,ERROR) ELSE IF (FORMATNAME .EQ. 'FASTA') THEN CALL READ_FASTA(KIN,FILENAME,TRANS,RLEN,NRES,ACCESSION, 1 COMPND,SEQ,TRUNCATED,ERROR) ELSE IF (FORMATNAME .EQ. 'PIR') THEN CALL READ_PIR(KIN,FILENAME,TRANS,RLEN,NRES,ACCESSION, 1 COMPND,SEQ,TRUNCATED,ERROR) ELSE IF (FORMATNAME .EQ. 'EMBL') THEN CALL READ_EMBL(KIN,FILENAME,TRANS,RLEN,NRES, 1 COMPND,ACCESSION,PDBREF,SEQ,TRUNCATED,ERROR) ELSE IF (FORMATNAME .EQ. 'GCG') THEN CALL READ_GCG(KIN,FILENAME,TRANS,RLEN,NRES, 1 COMPND,SEQ,TRUNCATED,ERROR) ELSE IF (FORMATNAME .EQ. 'STAR') THEN COMPND = ' ' CALL READ_STAR(KIN,FILENAME,TRANS,RLEN,NRES, 1 SEQ,TRUNCATED,ERROR) ELSE IF (FORMATNAME .EQ. 'DSSP') THEN CALL READ_SEQ_FROM_DSSP(KIN,FILENAME,CHAINS,TRANS,RLEN, 1 SEQ,STRUC,ACC,PDBNO,COMPND,NRES,LACCZERO,TRUNCATED,ERROR) IF (LACCZERO) THEN WRITE(6,*)'***************************************' WRITE(6,*)'* WARNING: accessibility values are 0 *' WRITE(6,*)'***************************************' ENDIF ELSE IF (FORMATNAME .EQ. 'HSSP') THEN CALL READ_SEQ_FROM_HSSP(KIN,FILENAME,CHAINS,TRANS,RLEN, 1 SEQ,STRUC,ACC,PDBNO,COMPND,NRES,LACCZERO,TRUNCATED,ERROR ) ENDIF IF ( ERROR ) RETURN CALL STRPOS(FILENAME,I,J) WRITE(6,'(A,A10,A,A,A,I5)')'GET_SEQ: ',FORMATNAME,':', + FILENAME(1:J),' ',NRES IF ( TRUNCATED ) THEN WRITE(6,*)'TRUNCATED TO ',len(seq),nres,' RESIDUES' WRITE(6,*)'!!! INCREASE DIMENSION !!!' NRES=LEN(SEQ) ENDIF RETURN END C END GET_SEQ C...................................................................... C...................................................................... C SUB GET_SEQ_FROM_ALISEQ SUBROUTINE GET_SEQ_FROM_ALISEQ(ALISEQ,IFIR,ILAS,ALIPOINTER, 1 ALILEN,ALINO,SEQUENCE,NRES,ERROR) C 8.7.93 IMPLICIT NONE C Import INTEGER ALILEN, ALINO INTEGER IFIR(*),ILAS(*),ALIPOINTER(*) CHARACTER ALISEQ(*) C EXPORT INTEGER NRES CHARACTER*(*) SEQUENCE LOGICAL ERROR C INTERNAL INTEGER IPOS CHARACTER CGAPCHAR CGAPCHAR = '.' IF ( ALILEN .GT. LEN(SEQUENCE) ) THEN ERROR = .TRUE. WRITE(6,'(A)') 1 ' MAXRES over in GET_SEQ_FROM_ALISEQ(lib-convert)!' WRITE(6,'(A,I8,A,I8)') 1 ' len=',LEN(SEQUENCE),' should be=',ALILEN RETURN ENDIF NRES = 0 DO WHILE ( NRES .LT. IFIR(ALINO)-1 ) NRES = NRES + 1 SEQUENCE(NRES:NRES) = CGAPCHAR ENDDO DO IPOS = ALIPOINTER(ALINO), 1 ALIPOINTER(ALINO)+ILAS(ALINO)-IFIR(ALINO) NRES = NRES + 1 SEQUENCE(NRES:NRES) = ALISEQ(IPOS) ENDDO DO WHILE ( NRES .LT. ALILEN ) NRES = NRES + 1 SEQUENCE(NRES:NRES) = CGAPCHAR ENDDO RETURN END C END GET_SEQ_FROM_ALISEQ C...................................................................... C...................................................................... C SUB GETALIGN SUBROUTINE GETALIGN(KFILE,IRECORD,IFIR,LEN1,LENOCC,JFIR,JLAS, + IDEL,NDEL,VALUE,RMS,HOM,SIM,SDEV,DISTANCE,CHECKVAL) C GET ONE ALIGNMENT AS WRITTEN BY TRACE C an alignment is: C * LDSSP_2 NAME_2 COMPOUND ACCESSION PDBREF VALUE IFIR LEN1 LENOCC C C JFIR JLAS N2IN IDEL NDEL NSHIFTED RMS HOM SIM DISTANCE C C AL_2 [ SAL_2 (if ldssp_2 ] C C====================================================================== IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' c input INTEGER KFILE,IRECORD REAL CHECKVAL c output C CHARACTER AL_1*(*) C CHARACTER AL_2*(*),SAL_2*(*) INTEGER IFIR,JFIR,JLAS,IDEL,NDEL,LEN1,LENOCC REAL VALUE,SIM,SDEV,HOM,RMS,DISTANCE C INSERTIONS IN SEQ 2 C INTEGER IINS,INSLEN_LOCAL(*),INSBEG_1_LOCAL(*),INSBEG_2_LOCAL(*) C CHARACTER INSSEQ*(*) C INTERNAL INTEGER INSPOINTER_LOCAL CHARACTER LINE(4)*(MAXRECORDLEN) CHARACTER C*1 INTEGER K,IALIPOS,JALIPOS,IPOS,I,NLINE,IBEG,IEND REAL XCHECK C INIT C AL_1= ' ' C= ' ' LDSSP_2= .FALSE. NAME_2= ' ' C OMPND_2=' ' ACCESSION_2=' ' PDBREF_2= ' ' AL_2= ' ' SAL_2= ' ' LINE(1)= ' ' LINE(2)= ' ' LINE(3)= ' ' INSSEQ= ' ' IFIR= 0 LEN1= 0 LENOCC= 0 JFIR= 0 JLAS= 0 N2IN= 0 IDEL= 0 NDEL= 0 NSHIFTED= 0 VALUE= 0.0 RMS= 0.0 HOM= 0.0 SIM= 0.0 DISTANCE= 0.0 IINS= 0 INSLEN_LOCAL(1)= 0 INSBEG_1_LOCAL(1)=0 INSBEG_2_LOCAL(1)=0 SDEV= 0.0 LCONSIDER= .TRUE. READ(KFILE,REC=IRECORD)C,LCONSIDER,VALUE IF (C .NE. '*') THEN WRITE(6,*)C,IRECORD WRITE(LOGSTRING,'(A)') + '*** ERROR: INCORRECT RECORD BOUNDARY IN GETALIGN' CALL LOG_FILE(KLOG,LOGSTRING,1) STOP ENDIF C WRITE(6,*)LCONSIDER,VALUE ; CALL FLUSH_UNIT(6) C---- -------------------------------------------------- C---- only for alignments to take! C---- -------------------------------------------------- IF (LCONSIDER) THEN IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)NAME_2 IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)COMPND_2 IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)ACCESSION_2,PDBREF_2,LDSSP_2 IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)IFIR,LEN1,LENOCC,JFIR,JLAS,N2IN, + IDEL,NDEL,NSHIFTED,RMS,HOM,SIM,SDEV, + DISTANCE,IINS XCHECK=0.0 IF (CSORTMODE .EQ. 'DISTANCE' ) THEN XCHECK = DISTANCE ELSE IF (CSORTMODE .EQ.'VALUE' .OR. CSORTMODE.EQ.'ZSCORE') THEN XCHECK = VALUE ELSE IF (CSORTMODE .EQ. 'WSIM' ) THEN XCHECK = SIM ELSE IF (CSORTMODE .EQ. 'SIM' ) THEN XCHECK = SIM ELSE IF (CSORTMODE .EQ. 'SIGMA' ) THEN XCHECK = VALUE / SDEV ELSE IF (CSORTMODE .EQ. 'IDENTITY' ) THEN XCHECK = HOM ELSE IF (CSORTMODE .EQ. 'VALPER' ) THEN XCHECK = VALUE/FLOAT(LENOCC) ELSE IF (CSORTMODE .EQ. 'VALFORM' ) THEN XCHECK=VALUE*(LENOCC**(-0.56158)) ENDIF IF (CSORTMODE .NE. 'ZSCORE' .AND. CSORTMODE .NE. 'NO' ) THEN IF ( ABS (XCHECK-CHECKVAL) .GT. 0.01 ) THEN LOGSTRING=' ' WRITE(LOGSTRING,'(A,F7.2,A,F7.2,A,A)') + '** ERROR: XCHECK.NE.CHECKVAL ',XCHECK,' ', + CHECKVAL,' ',CSORTMODE CALL LOG_FILE(KLOG,LOGSTRING,1) STOP ENDIF ENDIF C IF (.NOT. LDSSP_2) THEN DO K=1,LEN1 SAL_2(K:K)='U' ENDDO ENDIF IALIPOS=1 JALIPOS=MIN(LEN1,MAXRECORDLEN) DO WHILE(IALIPOS .LE. LEN1) IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)LINE(2) IF (LDSSP_2) THEN IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)LINE(3) IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)LINE(4) ENDIF IPOS=1 DO I=IALIPOS,JALIPOS AL_2(I:I)=LINE(2)(IPOS:IPOS) IF (LDSSP_2) THEN SAL_2(I:I)=LINE(3)(IPOS:IPOS) READ(LINE(4)(IPOS:IPOS),'(I1)')LACC_2(I) ENDIF IPOS=IPOS+1 ENDDO IALIPOS=JALIPOS+1 JALIPOS=MIN(LEN1,JALIPOS+MAXRECORDLEN) ENDDO C READ INSERTIONS IF (IINS .GT. 0) THEN INSPOINTER_LOCAL=1 DO I=1,IINS IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)INSLEN_LOCAL(I),INSBEG_1_LOCAL(I), + INSBEG_2_LOCAL(I) INSPOINTER_LOCAL=INSPOINTER_LOCAL+INSLEN_LOCAL(I)+3 ENDDO IF ( MOD(FLOAT(INSPOINTER_LOCAL),FLOAT(MAXRECORDLEN)) .EQ. + 0.0) THEN NLINE= INSPOINTER_LOCAL/MAXRECORDLEN ELSE NLINE=(INSPOINTER_LOCAL/MAXRECORDLEN ) +1 ENDIF IBEG=1 IEND=MAXRECORDLEN DO I=1,NLINE IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)INSSEQ(IBEG:IEND) IBEG=IEND+1 IEND=IEND+MAXRECORDLEN ENDDO ENDIF ENDIF C end of LCONSIDER RETURN END C END GETALIGN C...................................................................... C...................................................................... ***** ------------------------------------------------------------------ ***** SUB GETARRAYINDEX ***** ------------------------------------------------------------------ C---- C---- NAME : GETARRAYINDEX C---- ARG : 1 CARRAY(1:NMAX) = array with strings C---- ARG : 2 CSTRING = string to find in array C---- ARG : 3 NMAX = maximal number of elements of carray C---- ARG : 4 INDEX = index of element matching C---- DES : Checks whether or not the string CSTRING equals C---- DES : any of the strings in CARRAY. C---- DES : if yes: returns the number of the array element matching C---- DES : if not: returns 0 C---- *----------------------------------------------------------------------* SUBROUTINE GETARRAYINDEX(CARRAY,CSTRING,NMAX,IINDEX) IMPLICIT NONE C does not contain CSTRING C Import INTEGER NMAX C---- br 99.03: watch hard_coded here, see maxhom.param CHARACTER*200 CARRAY(NMAX) C---- --> REASON: the following produces warnings on SGI C CHARACTER*(*) CARRAY(*) C CHARACTER*(*) CARRAY(NMAX) CHARACTER*(*) CSTRING C internal INTEGER IINDEX,i LOGICAL LNOT ******------------------------------*-----------------------------****** C---- ini IINDEX= 1 LNOT= .TRUE. C---- count up until ctest matches DO WHILE (LNOT) C---- leave when at end of string IF (LNOT .AND. IINDEX.GT.NMAX) LNOT=.FALSE. C---- leave when match C---- hack br 99.03: SGI compiler crashes if IINDEX too high C---- (if in one if (not and carry=string)!) IF (LNOT) THEN IF (CARRAY(IINDEX).EQ.CSTRING) LNOT=.FALSE. ENDIF C------- count up IF (LNOT) IINDEX=IINDEX+1 ENDDO C---- none found -> return 0 IF (IINDEX .GT. NMAX ) IINDEX = 0 RETURN END C END GETARRAYINDEX C...................................................................... C...................................................................... C SUB GETBEST SUBROUTINE GETBEST(IPOSBEG,IPOSEND,JPOSBEG,JPOSEND,NREGION, + NTEST,LH1,LH2,ND1,ND2,BESTVAL,BESTIIPOS,BESTJJPOS) C search the LH matrix for the best value, where the trace was not C used in a previous alignment IMPLICIT NONE INCLUDE 'maxhom.param' c import INTEGER IPOSBEG,IPOSEND,JPOSBEG,JPOSEND,NREGION,NTEST INTEGER ND1,ND2 REAL LH1(0:ND1,0:ND2) INTEGER*2 LH2(0:ND1,0:ND2) C REAL LH(0:ND1,0:ND2,*) C EXPORT INTEGER BESTIIPOS,BESTJJPOS REAL BESTVAL C INTERNAL INTEGER I,J,II,JJ,LDIREC LOGICAL LDONE_BEFORE REAL BEST,BEST_II(0:MAXSQ+1) INTEGER ITEMP,JTEMP,TEMP_II(0:MAXSQ+1), + TEMP_JJ(0:MAXSQ+1) *----------------------------------------------------------------------* C INIT BESTVAL=0.00000000 BESTIIPOS=0 BESTJJPOS=0 C horizontal path : ldirec=40000 ; ldel<=MAXSQ C vertical path : ldirec=30000 ; ldel<=MAXSQ C diagonal match : ldirec=20000 ; ldel=0 C unmatched terminal sequence : ldirec=10000 ; ldel=0 IF (NTEST .LT. NREGION) THEN C GET BEST VALUE DO I=IPOSBEG,IPOSEND BEST_II(I)=0.0 TEMP_II(I)=0 TEMP_JJ(I)=0 ENDDO DO J=JPOSEND,JPOSBEG,-1 DO I=IPOSBEG,IPOSEND IF (LH1(I,J) .GT. BEST_II(I)+0.0001 ) THEN BEST_II(I)= LH1(I,J) TEMP_II(I) = I TEMP_JJ(I) = J ENDIF ENDDO ENDDO DO I=IPOSEND,IPOSBEG,-1 IF (BEST_II(I) .GT. BESTVAL+0.0001) THEN BESTVAL=BEST_II(I) BESTIIPOS=TEMP_II(I) BESTJJPOS=TEMP_JJ(I) C WRITE(6,*)BESTVAL,BEST_II(I),BESTIIPOS,BESTJJPOS ENDIF ENDDO ELSE C TRACE BACK TILL END FOR EACH NEW BEST VALUE DO J=JPOSEND,JPOSBEG,-1 DO I=IPOSEND,IPOSBEG,-1 IF ( LH1(I,J) .GT. BESTVAL+0.0001 ) THEN LDONE_BEFORE=.FALSE. BEST=LH1(I,J) ITEMP=I JTEMP=J II=I JJ=J DO WHILE ( .NOT. LDONE_BEFORE .AND. + LH2(II,JJ) .NE. 0 .AND. + II .GT. IPOSBEG .AND. JJ .GT. JPOSBEG) LDIREC= ABS( LH2(II,JJ) ) IF (LDIREC .GT. 20000 ) THEN II=II - ( LDIREC - 20000 ) ELSE IF (LDIREC .GT. 10000 ) THEN JJ=JJ - ( LDIREC - 10000 ) ELSE IF (LH2(II,JJ) .EQ. -1) THEN LDONE_BEFORE=.TRUE. ELSE IF (LDIREC .EQ. 1) THEN II=II-1 JJ=JJ-1 ELSE WRITE(6,*)'GETBEST: LDIREC UNKNOWN: ',LDIREC ENDIF ENDDO IF (.NOT. LDONE_BEFORE) THEN BESTVAL=BEST BESTIIPOS=ITEMP BESTJJPOS=JTEMP ENDIF ENDIF ENDDO ENDDO ENDIF RETURN END C END GETBEST C...................................................................... C...................................................................... C SUB GETCHAINBREAKS SUBROUTINE GETCHAINBREAKS(NRES,LSQ,STRUC,TRANS,NBREAK,IBREAKPOS) C RS 89 C search for chain break(s) and store position(s) in array IBREAKPOS C total number of breaks in protein are in NBREAK C used to disallow alignments over chain breaks C and to check pieces from DSSP and BRK if superpositon in 3-D wanted C import INTEGER LSQ(*) CHARACTER TRANS*(*) C EXPORT INTEGER IBREAKPOS(*),NBREAK CHARACTER*(*) STRUC(*) C INTERNAL INTEGER ILEN ILEN=LEN(TRANS) NBREAK=0 IBREAKPOS(1)=0 IND=INDEX(TRANS(1:ILEN),'!') DO IRES=1,NRES IF (LSQ(IRES) .EQ. IND) THEN NBREAK=NBREAK+1 IBREAKPOS(NBREAK)=IRES STRUC(IRES)='!' C WRITE(6,*)' CHAINBREAK : ',IRES ENDIF ENDDO RETURN END C END GETCHAINBREAKS C...................................................................... C...................................................................... C SUB GETCHAR SUBROUTINE GETCHAR(KCHAR,CHARARR,CTEXT) C prompts for characters CHARACTER*(*) CTEXT,CHARARR CHARACTER*(KCHAR) LINE INTEGER IMAX IMAX=LEN(CHARARR) WRITE(6,*)'================================================='// + '==============================' CALL WRITELINES(CTEXT) 10 CONTINUE WRITE(6,*) WRITE(6,'(a,i3,a)')' Enter string of length < ',imax, + ' [CR=default]' WRITE(6,*)' ' CALL STRPOS(CHARARR,IBEG,IEND) IF (IBEG .GT. 0 .AND. IEND .GT. 0) THEN WRITE(6,'(a,a)')' Default: ',chararr(ibeg:iend) ELSE WRITE(6,'(a,a)')' Default: ',chararr ENDIF WRITE(6,*)' ' LINE=' ' READ(*,'(A)',ERR=10,END=11) LINE IF ( LINE .NE. ' ' ) THEN C assuming default values were set outside .... CALL STRPOS(LINE,IBEG,IEND) c do i=1,iend c iascii=ichar(line(i:i)) c if (iascii .lt. 32 .or. iascii .gt. 126) then c WRITE(6,*)'*** Characters only, NOT: ',line(1:iend) c GOTO 10 c endif c enddo c iend=min(iend,imax) CHARARR(1:)=LINE(1:IEND) ENDIF 11 WRITE(6,'(a,a)')' echo: ',chararr(1:iend) RETURN END C END GETCHAR C...................................................................... C...................................................................... C SUB GETCONSWEIGHT SUBROUTINE GETCONSWEIGHT(NRES,IALIGN,LSEQ_1) C conservation weights: C fix weights between 1.0 and 0.1 C where 0.0 means random distribution, because of moise its possible C that cons-weights have small negative values C so cons-weights <0.1 are set to 0.1 C ISAFE is here +5 C======================================================================= IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C import INTEGER NRES,IALIGN,LSEQ_1(*) C internal INTEGER ISMALL,KFILE,ISAFERANGE,I,IRES,IALNEW,IDEL,NDEL,IAL, + IFIR,JFIR,JLAS,LEN1,LENOCC,IBEG,IEND,IAGR,ILEN,JPOS, + NPOS,IPOS,IRECORD,IND REAL SEQDIST,CHECKVAL,RMS,VALUE,HOM,SIM,DISTANCE,SUM, + MEAN,XVAL,SDEV CHARACTER*500 CONSEVOLUTION CHARACTER*200 CTMP LOGICAL LEVOLUTION,LDUMMY,LERROR C---- ------------------------------------------------------------------ C---- C---- defaults, ini C---- C---- FORMULA+ISAFERANGE -> include into averaging ISAFERANGE= 5 C---- BR 99.0x: make 'safer' for weights ISAFERANGE= 5 LDUMMY= .TRUE. DO IRES=1,NRES NOCC(IRES)=0 ENDDO C---- C---- write cons-weights after each alignment (if lconsider=.true.) for C---- inspection of the conservation-weights evolution LEVOLUTION=.FALSE. C LEVOLUTION=.TRUE. IF (LEVOLUTION .AND. LFIRSTWEIGHT) THEN DO I=1,MAXSQ SUMDISTANCE(I)= 0.0 SUMVARIABILITY(I)=0.0 ENDDO CALL CONCAT_STRINGS(HSSPID_1,'_EVOLUTION.DAT',CONSEVOLUTION) CALL OPEN_FILE(KCONS,CONSEVOLUTION,'NEW',LERROR) WRITE(6,*)' BACK OPEN' WRITE(KCONS,'(A,A)')'## ',NAME_1 WRITE(KCONS,'(A)')'## EVOLUTION OF CONSERVATION WEIGHTS' WRITE(KCONS,'(A)')'## IALIGN: Number of alignment above '// + 'threshold (list position )' WRITE(KCONS,'(A)')'## IRES: residue number (test-seq)' WRITE(KCONS,'(A)')'## WEIGHT: conservation weight' WRITE(KCONS,'(A)')'## I4,2X,I4,F7.2' WRITE(kcons,'(a)')'## IALIGN IRES WEIGHT ' DO IRES=1,NRES WRITE(KCONS,'(I4,2X,I4,F7.2)')0,IRES,CONSWEIGHT_1(IRES) ENDDO LFIRSTWEIGHT=.FALSE. ENDIF C---- C---- loop over new alis (depends on NBEST and/or no of chain breaks) CAUTION kfile has to be open CHANGE in future DO IALNEW=IALIGNOLD+1,IALIGN IRECORD=IRECPOI(IALNEW) IF (IRECORD .GT. 0 ) THEN C KFILE=KCORE -ISMALL + IFILEPOI(IALNEW) KFILE=KCORE CHECKVAL=ALISORTKEY(IALNEW) CALL GETALIGN(KFILE,IRECORD,IFIR,LEN1,LENOCC,JFIR,JLAS, + IDEL,NDEL,VALUE,RMS,HOM, + SIM,SDEV,DISTANCE,CHECKVAL) C LDUMMY (=LFORMULA) is true C use formula+ISAFERANGE percent for the calculation of cons-weight C---- new switch: parameter set in maxhom.param IF (LNEWCURVE) THEN CALL CHECKHSSPCUT99(LENOCC,HOM*100.0,ISOLEN,ISOIDE,NSTEP, + LDUMMY,LALL,ISAFERANGE,LCONSIDER,DISTANCE) ELSE CALL CHECKHSSPCUT(LENOCC,HOM*100.0,ISOLEN,ISOIDE,NSTEP, + LDUMMY,LALL,ISAFERANGE,LCONSIDER,DISTANCE) ENDIF IABOVE(IALNEW)=0 C BR 99.09: found a bug (this was missing) IF (.NOT. LCONSIDER) THEN AL_EXCLUDEFLAG(IALNEW)='*' ELSE IABOVE(IALNEW)=1 IFIRST(IALNEW)=IFIR ILAST(IALNEW)= IFIR+LEN1-1 C C FIRST: convert lower case characters in HSSP-alignment C to UPPER CASE C AND convert to INTEGER C IPOS=IFIR IF (ISEQPOS+LEN1+1 .LE. MAXSEQBUFFER) THEN DO IRES=1,LEN1 C write(6,*)'ipos=',ipos,' ires=',ires, C + ' al2=',al_2(ires:ires) IF (AL_2(IRES:IRES) .GE. 'a' .AND. + AL_2(IRES:IRES) .LE. 'z') THEN C hack br 2003-10-13 CTMP=AL_2(IRES:IRES) CALL LOWTOUP(CTMP,1) Cold SEQBUFFER(ISEQPOS+IRES-1)= Cold + CHAR(ICHAR(AL_2(IRES:IRES))-32 ) SEQBUFFER(ISEQPOS+IRES-1)=CTMP(1:1) C hack br 2003-10-13 IND=TRANSPOS(ICHAR(CTMP(1:1))) ELSE SEQBUFFER(ISEQPOS+IRES-1)=AL_2(IRES:IRES) IND=TRANSPOS(ICHAR(AL_2(IRES:IRES))) ENDIF IF (IND .NE. 0) THEN LSEQ_2(IPOS)=IND ELSE LSEQ_2(IPOS)=0 WRITE(6,'(A)')'** UNKNOWN RESIDUE 1: ' + //AL_2(IRES:IRES) ENDIF IPOS=IPOS+1 ENDDO ISEQPOINTER(IALNEW)=ISEQPOS ISEQPOS=ISEQPOS+LEN1+1 SEQBUFFER(ISEQPOS)='/' ELSE WRITE(6,*)' ERROR: MAXSEQBUFFER OVERFLOW' STOP ENDIF C--- accumulate SUMVARIABILITY/SUMDISTANCE C--- for the pair test-seq - new ali SEQDIST=1.0-HOM DO IRES=IFIRST(IALNEW),ILAST(IALNEW) IF (LSEQ_1(IRES).NE.0 .AND. LSEQ_2(IRES).NE.0) THEN SUMVARIABILITY(IRES)= + SUMVARIABILITY(IRES) + + (SEQDIST * SIMCONSERV(LSEQ_1(IRES),LSEQ_2(IRES)) ) ENDIF SUMDISTANCE(IRES)=SUMDISTANCE(IRES) + SEQDIST NOCC(IRES)=NOCC(IRES)+1 ENDDO C If profiles are used, conservation weights are calculated from C the comparison between test sequence and aligned sequences (not C between aligned seqs) IF (.NOT. LPROFILE_1 .AND. .NOT. LPROFILE_2) THEN DO IAL=1,IALIGNOLD IF (IABOVE(IAL) .EQ. 1) THEN C DO the 2 alignments overlap ? C---- C---- 98-10: br C---- correct bug C---- Cbug IF (IFIRST(IAL) .LT. IFIRST(IALNEW) .OR. Cbug + ILAST(IAL) .LT. ILAST(IALNEW)) THEN Cbug SEQDIST=0.0 IF (IFIRST(IAL) .LT. ILAST(IALNEW) .OR. + ILAST(IAL) .LT. IFIRST(IALNEW)) THEN SEQDIST=0.0 ELSE C GET OVERLAP RANGE IBEG=MAX(IFIRST(IAL),IFIRST(IALNEW)) IEND=MIN(ILAST(IAL),ILAST(IALNEW)) IRES=IBEG DO JPOS=IBEG-IFIRST(IAL)+1,IEND-IFIRST(IAL)+1 IND= + TRANSPOS(ICHAR(SEQBUFFER(ISEQPOINTER(IAL)+JPOS-1))) IF (IND .NE. 0) THEN LSEQTEMP(IRES)=IND ELSE LSEQTEMP(IRES)=0 WRITE(6,'(A)')'* UNKNOWN RESIDUE 2:'// + SEQBUFFER(ISEQPOINTER(IAL)+JPOS-1) ENDIF IRES=IRES+1 ENDDO C GET THE IDENTITIES AND LENGTH OF THE OVERLAPPING PART IAGR=0 ILEN=0 DO IRES=IBEG,IEND IF (LSEQ_2(IRES).NE.0 .AND. + LSEQTEMP(IRES).NE.0) THEN ILEN=ILEN+1 IF (LSEQ_2(IRES) .EQ. + LSEQTEMP(IRES))IAGR=IAGR+1 ENDIF IBOTH_LEGAL(IRES)=0 IF (LSEQ_2(IRES).NE.0 .AND. + LSEQTEMP(IRES).NE.0) THEN IBOTH_LEGAL(IRES)=1 SIMVAL(IRES)= + SIMCONSERV(LSEQ_2(IRES),LSEQTEMP(IRES)) ENDIF ENDDO C--- accumulate SUMVARIABILITY/SUMDISTANCE C--- for the pair NEW_ALI OLD_ALI IF (ILEN.NE.0) THEN SEQDIST=1-(FLOAT(IAGR)/FLOAT(ILEN)) DO IRES=IBEG,IEND IF (IBOTH_LEGAL(IRES) .EQ. 1) THEN SUMVARIABILITY(IRES)= + SUMVARIABILITY(IRES)+(SEQDIST*SIMVAL(IRES)) SUMDISTANCE(IRES)= + SUMDISTANCE(IRES)+SEQDIST ENDIF ENDDO ENDIF ENDIF ENDIF C LOOP OVER OLD ALIS ENDDO C .NOT. LPROFILE ENDIF C UPDATE WEIGHTS FOR OVERLAPPING RANGE BETWEEN TEST-SEQ AND NEW ALI DO IRES=IFIRST(IALNEW),ILAST(IALNEW) IF (SUMDISTANCE(IRES).NE.0.0) THEN CONSWEIGHT_1(IRES)= + (SUMVARIABILITY(IRES)/SUMDISTANCE(IRES)) C NO NEGATIVE VALUES FOR CONS-WEIGHT IF (CONSWEIGHT_1(IRES).LT.CONSMIN) THEN CONSWEIGHT_1(IRES)=CONSMIN ENDIF ENDIF ENDDO C WRITE CONSERVATION WEIGHTS TO FILE IF (LEVOLUTION) THEN C CALL CONCAT_STRINGS(HSSPID_1,'_EVOLUTION.DAT', C + CONSEVOLUTION) C CALL OPEN_FILE(KCONS,CONSEVOLUTION,'OLD,APPEND',LERROR) WRITE(KCONS,'(A,A)')'## ',NAME_2(1:50) DO IRES=1,NRES WRITE(KCONS,'(I4,2X,I4,F7.2)')IALIGN,IRES, + CONSWEIGHT_1(IRES) ENDDO CLOSE(KCONS) ENDIF C C else: do NOT take (said CHECKHSSPCUT) -> updata flags! C ENDIF C LCONSIDER ENDIF C LOOP OVER NEW ALIS ENDDO 99 SUM=0.0 NPOS=0 MEAN=1.0 DO I=1,NRES IF (NOCC(I).NE.0) THEN SUM=SUM+CONSWEIGHT_1(I) NPOS=NPOS+1 ENDIF ENDDO IF (NPOS .NE. 0) THEN MEAN=SUM/NPOS ENDIF C WRITE(6,*)'GETCONSWEIGHT: SUM,MEAN ',SUM,MEAN IF (MEAN.GT. 0.99 .AND. MEAN .LT. 1.01) RETURN XVAL=1.0-MEAN DO I=1,NRES IF (NOCC(I).NE.0) CONSWEIGHT_1(I)=CONSWEIGHT_1(I)+XVAL ENDDO GOTO 99 END C END GETCONSWEIGHT C...................................................................... C...................................................................... C SUB GETCOORFORHSSP SUBROUTINE GETCOORFORHSSP(INFILE,INUNIT,CIDPROT,NRES,NATM, + MXRES,MXATM,CIDRES,IPATMRES,RCA,CIDATM,IPRESATM,R) C AUTION HERE 'TER' LINES (CHAIN TERMINATORS) ARE COUNT AS RESIDUES C BECAUSE PIECES COME FROM DSSP-SEQUENCE (CHAIN BREAKS INCREMENT C RESIDUE COUNTER) *RS 89 C C GET-COOR-BROOK:SYMB.....CHRIS SANDER....MAY 1983... C FINAL DEFINITIVE PROTEIN DATA BANK COORDINATE INPUT C ADAPTED FROM GCOOR OF SEGSEG, BUT WIHTOUT ADDED HYDROGENS AND C WITH ALTERED DATA STRUCTURE C FILE ATTRIBUTES CHARACTER*(*) INFILE INTEGER INUNIT C PROTEIN ATTRIBUTES C HEADER,COMPOUND,SOURCE,AUTHOR,RESOLUTION CHARACTER*(*) CIDPROT(*) C NUMBER OF RESIDUES, ATOMS INTEGER NRES,NATM C RESIDUE ATTRIBUTES CHARACTER*(*) CIDRES(*) C POINTS TO FIRST, LAST AND CA ATOM. INTEGER IPATMRES(3,*) C C(ALPHA) COORDINATES REAL RCA(3,*) C ATOM ATTRIBUTES CHARACTER*(*) CIDATM(MXATM) C ATOM BELONGS TO RESIDUE NUMBER IPRESATM INTEGER IPRESATM(*) REAL R(3,*) C LOCAL STORAGE CHARACTER SEQ*3,LINE*200,ALT*1 INTEGER NLIN LOGICAL OVERFLOW,LERROR C EXECUTE NRES=0 NATM=0 DO KI=1,5 CIDPROT(KI)=' ' ENDDO OVERFLOW=.FALSE. WRITE(6,*)'GETCOOR: OPEN ',infile(1:40) CALL OPEN_FILE(INUNIT,INFILE,'OLD,READONLY',LERROR) IF (LERROR) THEN WRITE(6,*)' OPEN FILE ERROR IN GETCOOR: ',infile(1:40) WRITE(6,*)' ....return with NRES=NATM=0 ' RETURN ENDIF C LOOP OVER LINES IA=0 IR=0 C ATOM, RESIDUE AND LINE COUNTERS NLIN=0 10 READ(INUNIT,'(A)',END=999) LINE NLIN=NLIN+1 C ATOMS IF (LINE(1:4) .EQ. 'ATOM') THEN IA=IA+1 IR=IR+1 IF (IA .GT. MXATM) OVERFLOW=.TRUE. IF (IR .GT. MXRES) OVERFLOW=.TRUE. IF (OVERFLOW) THEN IA=IA-1 IR=IR-1 WRITE(6,*)'***GETCOOR: CORE OVERFLOW FOR MXATM OR MXRES' WRITE(6,*)' MXATM,IA, MXRES,IR',MXATM,IA,MXRES,IR WRITE(6,*)' MOLECULE TRUNCATED' GOTO 999 ENDIF C MAIN INPUT C EXAMPLE FROM 3PTI: C REAL FIELDS: 111111112222222233333333 C TOM 101 N PRO 13 12.250 12.909 15.223 1.00 0.00 3PTI 160 C TOM 102 CA PRO 13 11.486 11.965 16.047 1.00 0.00 3PTI 161 C... :....1....:....2....:....3....:....4....:....5....:....6....:....7....:....8 CIDATM(IA)=LINE(13:16) ALT=LINE(17:17) SEQ=LINE(18:20) CIDRES(IR)=LINE(22:27) READ(LINE,'(30X,3F8.3)')(R(K,IA),K=1,3) C SKIP ALTERNATE ATOM POSITIONS IF ( ALT .NE. ' ' .AND. IA .NE. 1 .AND. + CIDATM(IA) .EQ. CIDATM(IA-1) ) THEN WRITE(6,'(A,I5,1X,A4,A1,A3,1X,A6,3X,3F8.3)') + 'GETCOOR ALTERNATE ATOM IGNORED: ', + IA,CIDATM(IA),ALT,SEQ,CIDRES(IR),(R(K,IA),K=1,3) IA=IA-1 IR=IR-1 GOTO 10 ENDIF calt ignore ace residue IF (SEQ .EQ. 'ACE' ) THEN IA=IA-1 IR=IR-1 WRITE(6,*)'GETCOOR: ACE ignored at res ',ir GOTO 10 ENDIF c set atom pointer IPATMRES(1,IR)=IA IF (IR .NE. 1) IPATMRES(2,IR-1)=IA-1 c is it a new residue ? IF (IR .NE. 1) THEN IF ( CIDRES(IR-1) .EQ. CIDRES(IR) ) IR=IR-1 ENDIF c now valid ir and ia - stash away IPRESATM(IA)=IR IF (CIDATM(IA) .EQ. ' CA ') THEN IPATMRES(3,IR)=IA DO K=1,3 RCA(K,IR)=R(K,IA) ENDDO ENDIF ELSE IF (LINE(1:4) .NE. 'ATOM' ) THEN IF (LINE(1:4) .EQ. 'HEAD'.AND.CIDPROT(1).EQ.' ')CIDPROT(1)=LINE IF (LINE(1:4) .EQ. 'COMP'.AND.CIDPROT(2).EQ.' ')CIDPROT(2)=LINE IF (LINE(1:4) .EQ. 'SOUR'.AND.CIDPROT(3).EQ.' ')CIDPROT(3)=LINE IF (LINE(1:4) .EQ. 'AUTH'.AND.CIDPROT(4).EQ.' ')CIDPROT(4)=LINE IF ( INDEX(LINE,'RESOLUTION') .NE. 0 .AND. + CIDPROT(5).EQ.' ') THEN CIDPROT(5)=LINE ENDIF IF (LINE(1:3) .EQ. 'TER') THEN IR=IR+1 SEQ='---' ENDIF ENDIF c next line GOTO 10 c end of file 999 IR=IR-1 NATM=IA NRES=IR IPATMRES(2,NRES)=NATM CLOSE(INUNIT) WRITE(6,*)'CLOSED: ',INFILE(1:40) WRITE(6,'(a,3(i5,a))')' exit getcoor:',nres,' residues', + natm,' atoms',nlin,' lines' RETURN END C END GETCOORFORHSSP C...................................................................... C...................................................................... C SUB GETDSSPFORHSSP SUBROUTINE GETDSSPFORHSSP(IN,FILE,MAXSQ,CHAINREMARK,PROT, + HEAD,COMP,SOURCE,AUTHOR,NRES,LRES,NCHAIN,KCHAIN,PDBNO, + PDBCHAINID,PDBSEQ,SECSTR,COLS,BP1,BP2,SHEETLABEL,ACC) c reads header etc from files of type dssp. modified getdssp rs dez 88. c reads dssp-data as line of length 38 (no h-bond-data) INTEGER IN,MAXSQ CHARACTER*(*) FILE,PROT,COMP,HEAD,SOURCE,AUTHOR,CHAINREMARK CHARACTER PDBSEQ(*) CHARACTER*(*) PDBCHAINID(*),SECSTR(*) CHARACTER*1 SHEETLABEL(*) C LENGHT*7 CHARACTER*7 COLS(*) INTEGER PDBNO(*),BP1(*),BP2(*),ACC(*) C INTERNAL PARAMETER (MAXCHAIN= 100) CHARACTER CHAINMODE*20,CHAINID(MAXCHAIN) CHARACTER LINE*200,TEMPNAME*124 LOGICAL ERRFLAG,LKEEP,LCHAIN(MAXCHAIN) *----------------------------------------------------------------------* C INIT NSELECT=1 TEMPNAME=' ' I=INDEX(CHAINREMARK,'_!_') IF (I.NE.0) THEN TEMPNAME(1:)=FILE(1:I-1) ELSE TEMPNAME(1:)=FILE(1:) ENDIF CALL OPEN_FILE(IN,TEMPNAME,'READONLY,OLD',ERRFLAG) IF (ERRFLAG)GOTO 999 C GET PROTEIN IDENTIFIER, HEADER AND COMPOUND etc DO LL=1,3 READ(IN,'(A200)',END=777,ERR=999) LINE ENDDO PROT=LINE(63:66) PROT=LINE(63:66) HEAD=LINE(11:50) READ(IN,'(A200)',END=777,ERR=999)LINE COMP=LINE(11:) READ(IN,'(A200)',END=777,ERR=999)LINE SOURCE=LINE(11:) READ(IN,'(A200)',END=777,ERR=999)LINE AUTHOR=LINE(11:) C...........FIND SEQUENCE......... 70 READ(IN,'(A200)',END=777,ERR=999)LINE IF (INDEX(LINE(1:5),'#').EQ.0) GOTO 70 CD WRITE(6,*)' # found sequence ' C............READ STRUCTURE......... C...:....1....:....2....:....3....:... C # RESIDUE AA STRUCTURE BP1 BP2 ACC C 22 36 A S E > -I 24 0C 60 C DO I=1,MAXCHAIN LCHAIN(I)=.TRUE. ENDDO I=INDEX(CHAINREMARK,'!') C RS 90 C extract selected chains C fx: $pdb:4hhb.dssp_!_1,2 C or: $pdb:4hhb.dssp_!_A IF (I.NE.0) THEN DO J=1,MAXCHAIN LCHAIN(J)=.FALSE. ENDDO NSELECT=1 CALL STRPOS(CHAINREMARK,ISTART,ISTOP) DO J=ISTOP,I+1,-1 IF (CHAINREMARK(J:J).EQ.',')NSELECT=NSELECT+1 ENDDO CHAINMODE='CHARACTER' c WRITE(6,*)' WILL READ CHAINS ACCORDING TO CHARACTER' ISTART=INDEX(CHAINREMARK,'!')+2 DO J=1,NSELECT READ(CHAINREMARK(ISTART:),'(A1)')CHAINID(J) CALL LOWTOUP(CHAINID(J),1) ISTART=ISTART+2 ENDDO c WRITE(6,*)' GETDSSPFORHSSP: extract the chain(s)' c DO J=1,NSELECT c WRITE(6,*)' CHAIN: ',CHAINID(J) c ENDDO ELSE CHAINMODE='NONE' IF (KCHAIN.NE.0) THEN WRITE(6,*)' will extract chain number: ',KCHAIN ENDIF DO J=1,MAXCHAIN LCHAIN(J)=.TRUE. ENDDO ENDIF I=1 NCHAIN=1 NPICK=0 80 READ(IN,'(A38)',END=777,ERR=999)LINE LKEEP=.FALSE. IF (LINE(14:14).EQ.'!') THEN NCHAIN=NCHAIN+1 ELSE IF (KCHAIN.EQ.NCHAIN)LKEEP=.TRUE. ENDIF C KCHAIN=0 => all chains IF (KCHAIN.EQ.0)LKEEP=.TRUE. C if chains are identified by filename IF (CHAINMODE.EQ.'NUMBER') THEN IF (LCHAIN(NCHAIN)) THEN C if the first chain wanted is not the first chain in DSSP-file, skip C the first position ('!') IF (NPICK.EQ.0) THEN IF (LINE(14:14).EQ.'!') THEN LKEEP=.FALSE. ENDIF ELSE LKEEP=.TRUE. ENDIF NPICK=1 ELSE LKEEP=.FALSE. ENDIF ELSE IF (CHAINMODE.EQ.'CHARACTER') THEN LKEEP=.FALSE. IF (LINE(14:14).EQ.'!') THEN IF (NPICK.EQ.0) THEN LKEEP=.FALSE. ELSE LKEEP=.TRUE. ENDIF ELSE CALL LOWTOUP(LINE(12:12),1) DO JCHAIN=1,NSELECT IF (CHAINID(JCHAIN).EQ. LINE(12:12)) THEN LKEEP=.TRUE. NPICK=1 ENDIF ENDDO IF (.NOT. LKEEP .AND. I.GT.1) THEN IF (pdbseq(i-1).EQ.'!')I=I-1 ENDIF ENDIF ENDIF c pdbno,chainid,dsspseq,secstr,cols,bp1,bp2,sheetlabel,acc IF (LKEEP) THEN READ(LINE,'(6x,I4,1X,A1,1X,A1,2X,A1,1X,A7,I4,I4,A1,I4)', + END=777,ERR=999)pdbno(i),pdbchainid(i),pdbseq(i), + secstr(i),cols(i)(1:7),bp1(i),bp2(i),sheetlabel(i), + acc(i) I=I+1 CALL CHECKRANGE (I,1,MAXSQ,'MAXSQ','GETDSSP ') ENDIF GOTO 80 C...............done.................. 777 NRES=I-1 c WRITE(6,*) NRES,' RESIDUES READ IN GETDSSPFORHSSP ' IF (NRES.LE.0) THEN PROT=' ' HEAD=' ' COMP=' ' SOURCE=' ' AUTHOR=' ' ENDIF C.......DO NOT COUNT CHAIN BREAKS... LRES=NRES KCHAIN=1 DO I=1,NRES IF (pdbseq(i).EQ.'!') THEN LRES=LRES-1 KCHAIN=KCHAIN+1 ENDIF ENDDO c WRITE(6,*) LRES,' RESIDUES ',NRES,' POSITIONS ' CLOSE(IN) RETURN 999 WRITE(6,*)' *** READ ERROR ***' NRES=0 PROT=' ' HEAD=' ' COMP=' ' SOURCE=' ' AUTHOR=' ' RETURN END C END GETDSSPFORHSSP C...................................................................... C...................................................................... C SUB GETHSSPCUT SUBROUTINE GETHSSPCUT(KIN,MAXSTEP,INFILE,ISOLEN,ISOIDE,NSTEP) C RS 89 C read in isosignificance data from file C C............................................................. C* isosignificance data / 70% secondary structure identity C* a "*" indicates a comment line C* alignments longer than the length specified in the last line C* have the same cutoff C* format=(2X,I4,7X,F7.2) C*.1234..... 1234567 C* length minimum % sequence identity <===== start-line C 10 67.41 C 20 50.22 C .. .. C> 200 24.53 C............................................................. IMPLICIT NONE INTEGER MAXSTEP,KIN,I CHARACTER*(*) INFILE INTEGER ISOLEN(MAXSTEP),NSTEP REAL ISOIDE(MAXSTEP) LOGICAL LERROR CHARACTER LINE*200 CALL OPEN_FILE(KIN,INFILE,'READONLY,OLD',LERROR) 10 READ(KIN,'(A)',ERR=999)LINE WRITE(6,*)LINE CALL LOWTOUP(LINE,200) IF (INDEX(LINE,'* LENGTH') .EQ. 0) GOTO 10 I=1 20 READ(KIN,'(2X,I4,7X,F7.2)',END=888)ISOLEN(I),ISOIDE(I) I=I+1 IF (I .GT. MAXSTEP) THEN WRITE(6,*)' GETHSSPCUT: maxstep overflow: ',maxstep ENDIF GOTO 20 888 NSTEP=I-1 WRITE(6,*)' GETHSSPCUT: ',nstep,' steps ' cd do i=1,nstep cd WRITE(6,*)isolen(i),isoide(i) cd enddo CLOSE(KIN) RETURN 999 WRITE(6,*)' GETHSSPCUT: ERROR READING ',INFILE CLOSE(KIN) STOP END C END GETHSSPCUT C...................................................................... C...................................................................... C SUB GETINT SUBROUTINE GETINT(KINT,INTARR,CTEXT) C by Chris Sander, June 1985, Feb 1986, June 1987, RS89 C For interactive use via terminal. C Prompts for KINT integers from input unit *. C Returns new values in INTARR(1..KINT) C Offers previous values as default. CUG INTEGER LINELEN PARAMETER (LINELEN= 200) CHARACTER*(LINELEN) LINE CHARACTER*(*) CTEXT INTEGER INTARR(*) LOGICAL EMPTYSTRING CUG INTEGER NUMSTART CHARACTER*20 CTEMP *----------------------------------------------------------------------* WRITE(6,*) WRITE(6,*)'===================================================='// + '===========================' CALL WRITELINES(CTEXT) IF (KINT.LT.1.OR.KINT.GT.100) THEN WRITE(6,*)'*** INTPROMPT: KINT no good',KINT RETURN ENDIF 10 WRITE(6,*) WRITE(6,'(2X,''Default: '',10I4)') (INTARR(K),K=1,KINT) IF (KINT.GT.1) THEN WRITE(6,'(2X,''Enter'',I3,'' integers [CR=default]: '')')KINT ELSE WRITE(6,'(2X,''Enter one integer [CR=default]: '')') ENDIF LINE=' ' READ(*,'(A200)',ERR=10,END=11) LINE IF (.NOT.EMPTYSTRING(LINE)) THEN C remove comments ( 34535345 !$ comment ) KCOMMENT=INDEX(LINE,'!$') IF (KCOMMENT.NE.0) LINE(KCOMMENT:linelen)=' ' C check for legal string DO I=1,linelen IF (INDEX(' ,+-0123456789',LINE(I:I)).EQ.0) THEN WRITE(6,'(2X,''*** not an integer: '',A40)') LINE(1:40) GOTO 10 ENDIF ENDDO CALL STRPOS(LINE,ISTART,ISTOP) DO INUM = 1,KINT CALL GETTOKEN(LINE,LINELEN,INUM,NUMSTART,CTEMP) CALL RIGHTADJUST(CTEMP,1,20) READ(CTEMP,'(I20)') INTARR(INUM) ENDDO ENDIF 11 WRITE(6,'(2X,'' echo:'',10I4)') (INTARR(K),K=1,KINT) RETURN END C END GETINT C...................................................................... C...................................................................... C SUB GETINDEX SUBROUTINE GETINDEX(CTEST,STRINGPOS,IPOS) C get index of ctest in cstring INTEGER STRINGPOS(*),IPOS CHARACTER CTEST I=ICHAR(CTEST) IPOS=STRINGPOS(I) c if (ipos .eq. 0) then c WRITE(6,*)' WARNING: UNKNOWN character: ',ctest c endif RETURN END C END GETINDEX C...................................................................... C...................................................................... C SUB GETPIDCODE SUBROUTINE GETPIDCODE(FILENAME,PID) C extract protein ID from file name CHARACTER*(*) FILENAME, PID CHARACTER NAME*500,TEMPNAME*500 C PID=' ' TEMPNAME=' ' CALL STRPOS(FILENAME,ISTART,IEND) IF (IEND .GT. LEN(TEMPNAME)) THEN WRITE(6,*)' ERROR in GETPIDCODE' WRITE(6,*)' tempname variable too short' STOP ENDIF TEMPNAME(1:IEND)=FILENAME(1:IEND) CALL LOWTOUP(TEMPNAME,IEND) NAME=FILENAME(ISTART:IEND) C DO IR=IEND,1,-1 IF (TEMPNAME(IR:IR) .EQ. '.') then IEND=IR-1 GOTO 111 ENDIF ENDDO 111 TEMPNAME=' ' DO IL=IEND,ISTART,-1 IF ((NAME(IL:IL) .EQ. '/') .OR. (NAME(IL:IL) .EQ. ':') + .OR. (NAME(IL:IL) .EQ. ']') ) THEN ISTART=IL+1 GOTO 222 ENDIF ENDDO 222 PID(1:)=FILENAME(ISTART:IEND) c444 il=index(name(:ir),'.') c if (il .gt. 0) then c name(il:il)='|' c goto 444 c else c goto 555 c endif c 555 if (iend .gt. len(pid)) then c WRITE(6,*)' ERROR in GETPIDCODE' c WRITE(6,*)' pid variable too short' c STOP c endif c PID=NAME(:IR) RETURN END C END GETPIDCODE C...................................................................... C...................................................................... C SUB GETPOS SUBROUTINE GETPOS(CSTRING,STRINGPOS,N) C RS JAN 90 C store ASCII code of cstring in array stringpos INTEGER STRINGPOS(*),N CHARACTER*(*) CSTRING DO I=1,N STRINGPOS(I)=0 ENDDO ILEN=LEN(CSTRING) DO I=1,ILEN J=ICHAR(CSTRING(I:I)) STRINGPOS(J)=I ENDDO RETURN END C END GETPOS C...................................................................... C...................................................................... C SUB GETSEQ SUBROUTINE GETSEQ(IN,NDIM,NRES,CRESID,CSQ,STRUC,KACC, + LDSSP,FILENAME,COMPND,ACCESSION,CDUMMY,IOP,TRANS,NTRANS, + KCHAIN,NCHAIN,CCHAIN) C RS 89 changed to read from PDB-file (used in MAXHOM) C by Chris Sander, 1982 and later C and Brigitte Altenberg, 1987 and later C GET SEQUENCE FROM DSSP-FILE, HSSP SWISSPROT....OR FREE FORMAT FILE. CAUTION: used by MAXHOM, PUZZLE, WINDOW-DNA (?), SEG-PRED (?) etc. C C NDIM - MAX SPACE IN SEQUENCE ARRAY C NREAD - NUMBER OF RESIDUES READ C NRES - NUMBER OF RESIDUES PASSED ON C IN - LOGICAL UNIT NUMBER OF SEQ FILE C IOP - LOGICAL UNIT NUMBER OF OUTPUT FILE C KCHAIN - KCHAINTH CHAIN WANTED (K=0 => ALL CHAINS,K<>0 => KTH C CHAIN) BUT ONLY IF "_!_A,B" IS NOT SPECIFIED !! C NCHAIN - NUMBER OF CHAINS IN *.DSSP DATA-SET C CCHAIN - NAME OF CHAIN C LCHAIN() - true if 'x' chain wanted PARAMETER (MAXCHAIN= 40) PARAMETER (MAXRECLEN= 9999) C PARAMETER (MAXRECLEN= 10000) C PARAMETER (MAXRECLEN= 30011) CHARACTER LOWER*26,PUNCTUATION*10,FORMATNAME*4 CHARACTER TRANS*26,CS*1,CC*1 CHARACTER LINE*(MAXRECLEN) cx character*200 FILENAME CHARACTER*(*) FILENAME C compound for DSSP CHARACTER*(*) COMPND C accession number and dummy string (fx. pdb-pointer from swissprot) CHARACTER*(*) ACCESSION,CDUMMY CHARACTER*1 CSQ(*),STRUC(*),CH,CCHAIN CHARACTER*6 CRESID(*),CR LOGICAL TRUNCATED,ERRFLAG,LKEEP,LCHAIN(MAXCHAIN) LOGICAL LDSSP,LACCZERO,LHSSP INTEGER KACC(*),KCHAIN INTEGER IOP C INTERNAL CHARACTER CTEST*1,CHAINMODE*20,CHAINID(MAXCHAIN)*1 LOGICAL LCHAINBREAK,LEGALRES CHARACTER*100 CTEMP C dont use INDEX command (CPU time) INTEGER NASCII PARAMETER (NASCII= 256) INTEGER TRANSPOS(NASCII) C read from BRK CHARACTER SEQ(9999)*3,CIDRES(9999)*6 C CHARACTER SEQ(10000)*3,CIDRES(10000)*6 C CHARACTER SEQ(30011)*3,CIDRES(30011)*6 C====================================================================== IEND=0 ISEQLEN=0 ISTART=0 ISTOP=0 LOWER='abcdefghijklmnopqrstuvwxyz' LDSSP=.FALSE. LHSSP=.FALSE. IF (IOP.NE.0)WRITE(IOP,*)FILENAME CDUMMY=' ' ACCESSION=' ' LINE=' ' CAUTION.. recommendation: C calling program should allow "!" as legal residue for DSSP format C *BA* IF (NTRANS.EQ.0) THEN WRITE(6,*)'GETSEQ: NTRANS was 0 !!!!' NTRANS=26 TRANS='GAVLISTDENQKHRFYWCMPBZX!-.' WRITE(6,*)'GETSEQ: TRANS set to:', TRANS ENDIF IF (NTRANS.GT.26) THEN WRITE(6,*)'trans:#',TRANS,'# ntrans:',NTRANS STOP'GETSEQ ERROR *** NTRANS.GT.26' ENDIF L=INDEX(TRANS(1:NTRANS),'-') IF (L.EQ.0) THEN WRITE (*,*)'GETSEQ: WARNING: Trans must include"-" ' ENDIF CALL GETPOS(TRANS,TRANSPOS,NASCII) C *BA*BEGIN NRES=0 C......................defaults........ C in general, only blanks are allowed PUNCTUATION=' ' DO I=1,NDIM KACC(I)=0 C implies that unknown residues are named - CSQ(I)='-' C undefined STRUC(I)='U' C *BA*END ENDDO COMPND=FILENAME C read only the kth chain *BA* C NAME OF CHAIN CCHAIN=' ' C CHAIN COUNTER NCHAIN=1 C RES LINE COUNTER NRESLINE=0 NSELECT=0 CALL strpos(FILENAME,i,LENFILNAM) WRITE(6,*) 'GETSEQ: ', FILENAME(1:LENFILNAM) IF (LENFILNAM .LE. 1) THEN WRITE(6,*)'GETSEQ: *** empty file name, return with NRES=0' RETURN ENDIF I=INDEX(FILENAME,'_!_') C RS 90 C extract selected chains C fx: $pdb:4hhb.dssp_!_1,2 IF (I.NE.0) THEN DO J=1,MAXCHAIN LCHAIN(J)=.FALSE. ENDDO NSELECT=1 IEND=LEN(FILENAME) DO J=IEND,I+1,-1 IF (FILENAME(J:J).EQ.',')NSELECT=NSELECT+1 ENDDO ISTART=INDEX(FILENAME,'!_')+2 READ(FILENAME(ISTART:ISTART),'(A1)')CTEST IF (INDEX('1234567890',CTEST).NE.0) THEN CHAINMODE='NUMBER' WRITE(6,*)' WILL READ CHAINS ACCORDING TO NUMBER' ELSE CHAINMODE='CHARACTER' WRITE(6,*)' WILL READ CHAINS ACCORDING TO CHARACTER' ENDIF DO J=1,NSELECT IF (CHAINMODE.EQ.'NUMBER') THEN CALL READ_INT_FROM_STRING(FILENAME(ISTART:),K) IF (K.GT.0 .AND. K.LE.MAXCHAIN) THEN LCHAIN(K)=.TRUE. ELSE WRITE(6,*)'*** ERROR: K<1 OR K>MAXCHAIN IN GETSEQ' STOP ENDIF ELSE READ(FILENAME(ISTART:ISTART),'(A1)')CHAINID(J) CALL LOWTOUP(CHAINID(J),1) ENDIF ISTART=ISTART+2 ENDDO WRITE(6,*)' **** GETSEQ: extract the chain(s)' IF (CHAINMODE.EQ.'NUMBER') THEN DO J=1,MAXCHAIN IF (LCHAIN(J))WRITE(6,*)' CHAIN: ',J ENDDO ELSE DO J=1,NSELECT WRITE(6,*)' CHAIN: ',CHAINID(J) ENDDO ENDIF ISTOP=INDEX(FILENAME,'_!')-1 FILENAME=FILENAME(1:ISTOP) ELSE CHAINMODE='NONE' IF (KCHAIN.NE.0) THEN WRITE(6,*)' will extract chain number: ',KCHAIN ENDIF DO J=1,MAXCHAIN LCHAIN(J)=.TRUE. ENDDO ENDIF C *BA*BEGIN CALL CHECKFORMAT(IN,FILENAME,FORMATNAME,ERRFLAG) c WRITE(6,*) ' GETSEQ: format is ',FORMATNAME IF (INDEX(FORMATNAME,'DSSP').NE.0) THEN LDSSP=.TRUE. ENDIF IF (INDEX(FORMATNAME,'HSSP').NE.0) THEN LHSSP=.TRUE. ENDIF IF (ERRFLAG) THEN WRITE(6,*)'GETSEQ: file open error, set NRES=0 and return' WRITE(6,*)'filename: ', FILENAME RETURN ENDIF CTEMP=' ' write(ctemp,'(a,i5)')'READONLY,OLD,RECL=',maxreclen CALL OPEN_FILE(IN,FILENAME,ctemp,ERRFLAG) C *BA*END IF (FORMATNAME.EQ.'DSSP') GOTO 100 IF (FORMATNAME.EQ.'BRK ') GOTO 200 IF (FORMATNAME.EQ.'PIR ') GOTO 300 IF (FORMATNAME.EQ.'EMBL') GOTO 400 IF (FORMATNAME.EQ.'GCG ') GOTO 500 IF (FORMATNAME.EQ.'UWGC') GOTO 600 IF (FORMATNAME.EQ.'HSSP') GOTO 700 C--------------NOT DSSP----NOT PIR----NOT EMBL--NOT GCG---------------- C--------------simple STAR FORMAT, probably DO WHILE(.TRUE.) READ(IN,'(A)',END=900) LINE IF (LINE(1:1).EQ.'*') THEN IF (IOP.NE.0)WRITE(IOP,*) LINE C NOT A COMMENT LINE ELSE CALL STRPOS(LINE,IBEG,IEND) DO J=1,IEND CS=LINE(J:J) CALL GETINDEX(CS,TRANSPOS,I) C star format allows chainbreak IF ( .NOT. LCHAINBREAK(CS,NRES+1) .AND. I.NE.0) THEN NRES=NRES+1 IF (NRES.LE.NDIM) THEN CSQ(NRES)=CS ELSE WRITE(IOP,'(A)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***' WRITE(6,'(A)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***' ENDIF ENDIF C J, CHARACTERS IN LINE ENDDO C COMMENT OR SEQUENCE LINE ENDIF C NEXT LINE ENDDO C-------------------------------READ FROM :DSSP----------------- C ** SECONDARY STRUCTURE DEFINITION BY THE PROGRAM DSSP, \\ C VERSION OCT. 1985 C FERENCE W. KABSCH AND C.SANDER, BIOPOLYMERS 22 (1983) 2577-2637 C ADER PANCREATIC HORMONE 16-JAN-81 1PPT C MPND AVIAN PANCREATIC POLYPEPTIDE 100 READ(IN,'(A124)',END=199)LINE IF (INDEX(LINE,'SECONDARY').EQ.0) THEN WRITE(6,*)'***GETAASEQ ERROR: DSSP file assumed, but...' WRITE(6,*)' the word /SECONDARY/ is missing in first line' RETURN ENDIF C reference - ignore READ(IN,'(A)',END=199)LINE C header READ(IN,'(A)',END=199)LINE C* LINE='*'//LINE IF (IOP.NE.0)WRITE(IOP,*)LINE C compnd READ(IN,'(A)',END=199)LINE C* LINE='*'//LINE IF (IOP.NE.0)WRITE(IOP,*)LINE COMPND=LINE(11:200) C C C repeat until # 105 READ(IN,'(A)',END=199)LINE IF (INDEX(LINE(1:5),'#').EQ.0) GOTO 105 C C23456123451c1cc1 Ccccccaaaaaacaccacccccccccccccccccciii C 9 9 A S E -aB 35 15A 0 24,-2.3 27,-2.9 -2,-0.4 28,-0.5 -0.939 14.7-175.8-120.8 141.0 -5.5 9.8 13.0 C 21 21 Y E -AB 32 45A 68 24,-3.1 24,-2.9 -2,-0.3 C DSSP: seqstr acc hbonds C NPICK=0 DO WHILE (.TRUE.) READ(IN,'(6X,A5,A1,1X,A1,2X,A1,18X,I3)',END=900) + CR(1:5),CH,CS,CC,IACC C Res line counter. Note: NRES = # of res passed NRESLINE=NRESLINE+1 LKEEP=.FALSE. C ......CONVERT SS-BRIDGES TO 'C'.... IF (INDEX(LOWER,CS).NE.0) CS='C' IF (NRES.LT.NDIM) THEN C incr.chains *BA* IF (LCHAINBREAK(CS,NRESLINE)) THEN NCHAIN=NCHAIN+1 ELSE IF (KCHAIN.EQ.NCHAIN)LKEEP=.TRUE. ENDIF C KCHAIN=0 => all chains IF (KCHAIN.EQ.0)LKEEP=.TRUE. C if chains are identified by filename IF (CHAINMODE.EQ.'NUMBER') THEN IF (LCHAIN(NCHAIN)) THEN C if the first chain wanted is not the first chain in DSSP-file, skip C the first position ('!') IF (NPICK.EQ.0) THEN IF (LCHAINBREAK(CS,NRESLINE)) THEN LKEEP=.FALSE. ENDIF ELSE LKEEP=.TRUE. ENDIF NPICK=1 ELSE LKEEP=.FALSE. ENDIF ELSE IF (CHAINMODE.EQ.'CHARACTER') THEN LKEEP=.FALSE. IF (LCHAINBREAK(CS,NRESLINE)) THEN IF (NPICK.EQ.0) THEN LKEEP=.FALSE. ELSE LKEEP=.TRUE. ENDIF ELSE CALL LOWTOUP(CH,1) DO JCHAIN=1,NSELECT IF (CHAINID(JCHAIN).EQ.CH) THEN LKEEP=.TRUE. NPICK=1 ENDIF ENDDO IF (.NOT.LKEEP) THEN IF (CSQ(NRES).EQ.'!')NRES=NRES-1 ENDIF ENDIF ENDIF C keep only the kth chain IF (LKEEP) THEN CALL GETINDEX(CS,TRANSPOS,I) IF (I .NE. 0) THEN CAUTION: change here (or in SEQTOINT) to L=0 implying chain break C CAUTION: INCREMENT ONLY OF LEGAL AA OR - OR ! NRES=NRES+1 CRESID(NRES)=CR(1:5)//CH CSQ(NRES)=CS KACC(NRES)=IACC CCHAIN=CH STRUC(NRES)=CC c WRITE(6,*)cchain C ### ILLegal RESIDUES ENDIF C CHAINS WANTED ENDIF C DIMENSION OVERFLOW ELSE WRITE(IOP,'(A)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***' WRITE(6,'(A,I9)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***',MAXSQ GOTO 900 ENDIF C NEXT LINE ENDDO C--------------DSSP read error ----------------------------------- 199 WRITE(6,*)'***GETAASEQ: incomplete DSSP file (EOF) ' NRES=0 NCHAIN=0 CALL STRPOS(FILENAME,I,LENFILNAM) WRITE(6,*) 'FILE: ',FILENAME(1:LENFILNAM) CLOSE(IN) RETURN C----------------READ FROM BROOKHAVEN-------------------------------- 200 READ(IN,'(A)',END=900,ERR=999)LINE IF (INDEX(LINE,'HEADER').EQ.0) THEN WRITE(6,*)'***GETAASEQ ERROR: BRK file assumed, but...' WRITE(6,*)' the word /HEADER/ is missing in first line' RETURN ENDIF IF (IOP.NE.0)WRITE(IOP,*)LINE(1:200) C compnd READ(IN,'(A)',END=900,ERR=999)LINE IF (IOP.NE.0)WRITE(IOP,*)LINE(1:200) COMPND=LINE(1:200) C read only the kth chain C NAME OF CHAIN CCHAIN=' ' C CHAIN COUNTER NCHAIN=1 C RES LINE COUNTER NRESLINE=0 NRES=0 210 READ(IN,'(A)',END=280,ERR=999)LINE NRESLINE=NRESLINE+1 IF (LINE(1:4).EQ.'ATOM') THEN C if chains are identified by filename IF (CHAINMODE.EQ.'CHARACTER') THEN LKEEP=.FALSE. DO J=1,NSELECT IF (CHAINID(J).EQ.LINE(22:22))LKEEP=.TRUE. ENDDO ELSE LKEEP=.TRUE. ENDIF IF (LKEEP) THEN IF (NRES.LE.NDIM) THEN NRES=NRES+1 SEQ(NRES)=LINE(18:20) CIDRES(NRES)=LINE(22:27) IF (SEQ(NRES).EQ.'ACE') THEN NRES=NRES-1 WRITE(6,*)' GETAASEQ: ACE ignored at res ',NRES GOTO 210 ENDIF IF (NRES.NE.1) THEN IF (CIDRES(NRES-1).EQ.CIDRES(NRES))NRES=NRES-1 ENDIF ELSE WRITE(IOP,'(A,I9)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***',MAXSQ WRITE(6,'(A,I9)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***',MAXSQ GOTO 900 ENDIF ENDIF ELSE IF (LINE(1:3).EQ.'TER') THEN IF (NRES.NE.0) THEN NRES=NRES+1 SEQ(NRES)='!!!' ENDIF ENDIF GOTO 210 280 CALL S3TOS1(SEQ,CSQ,NRES) 290 IF (SEQ(NRES).NE.'!!!') THEN GOTO 900 ELSE SEQ(NRES)=' ' CIDRES(NRES)=' ' NRES=NRES-1 GOTO 290 ENDIF C====== C---------------------------READ FROM :PIR--------------------*BA*BEGIN C C 300 CONTINUE PUNCTUATION=',.:;()+ ' C Header line 1, ignore READ (IN,'(A)',END=999)LINE C Header line 2, ignore READ (IN,'(A)',END=999)LINE c IF (INDEX(LINE,'>').NE.0) THEN C THERE are TWO SPECIAL LINES c READ (IN,'(A)',END=999)LINE c ENDIF CALL STRPOS(LINE,IBEG,IEND) LINE(1:)='*'//LINE(IBEG:IEND) C WRITE HEADER IF (IOP.NE.0) then WRITE(IOP,*)LINE(ibeg:iend) ENDIF DO WHILE(.TRUE.) C IN THE NEXT LINES ARE RESIDUES READ (IN,'(A)',END=900)LINE CALL STRPOS(LINE,IBEG,IEND) DO J=1,IEND CS=LINE(J:J) IF (CS.EQ.'*') GOTO 900 IF (LEGALRES(CS,NRES,TRANS,NTRANS,PUNCTUATION)) THEN NRES=NRES+1 C CHECK FOR OVERFLOW IF (NRES.LE.NDIM) THEN CSQ(NRES)=CS C OVERFLOW ELSE WRITE(IOP,'(A)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***' WRITE(6,'(A)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***' GOTO 900 ENDIF C LEGAL RESIDUES ENDIF C NEXT RESIDUE ENDDO C NEXT LINE ENDDO C---------------------------READ FROM :EMBL ---------------------------- 400 CONTINUE ID=0 DO WHILE (.TRUE.) READ (IN,'(A)',END=999)LINE C LOOK FOR ACCESSION NUMBER AND TAKE THE FIRST ONE IF (INDEX(LINE(:2),'AC').NE.0) THEN I=INDEX(LINE,';')-1 ACCESSION(1:)=LINE(6:I) ENDIF C LOOK FOR DEFINITION IF (INDEX(LINE(:2),'DE').NE.0) THEN COMPND=' ' COMPND(1:74)=LINE(6:79) C WRITE ONLY DEFINITION IF (IOP.NE.0)WRITE (IOP,*)LINE GOTO 410 ENDIF ENDDO 410 DO WHILE (.TRUE.) C LOOK FOR LINE BEGINNING READ (IN,'(A)',END=999)LINE C WITH "SQ" IF (INDEX(LINE(:2),'SQ').NE.0) THEN GOTO 420 C*RS 89 C look for PDB-database pointer and store them in CDUMMY ELSE IF (INDEX(LINE(:2),'DR').NE.0 .AND. + INDEX(LINE,'PDB;').NE.0) THEN CALL STRPOS(CDUMMY,ISTART,ISTOP) CALL STRPOS(LINE,JSTART,JSTOP) IF (ISTOP+JSTOP+10 .LE. LEN(CDUMMY) ) THEN IF (ID .LE. 0) THEN CDUMMY(ISTOP+1:)=LINE(10:JSTOP) ELSE CDUMMY(ISTOP+1:)='|'//LINE(10:JSTOP) ENDIF ID=ID+1 ELSE WRITE(6,*)'**** PDBREF-LINE DIMENSION OVERFLOW ***' ENDIF ENDIF ENDDO 420 CALL STRPOS(CDUMMY,ISTART,ISTOP) IF (ID .GT. 0) THEN IF ( (ISTOP+7) .LE. LEN(CDUMMY) ) THEN WRITE(CDUMMY(ISTOP+1:),'(A,I4)')'||',ID ELSE WRITE(6,*)'**** PDBREF-LINE DIMENSION OVERFLOW ***' ENDIF ENDIF DO WHILE (.TRUE.) C SEQUENCES NEXT LINE READ (IN,'(A)',ERR=999,END=900) LINE C END OF ROUTINE IF (INDEX(LINE(:2),'//').NE.0) GOTO 900 C NO MORE TEXT ALLOWED IF (INDEX(LINE(:2),' ').NE.0) THEN CALL STRPOS(LINE,IBEG,IEND) DO J=1,iend CS=LINE (J:J) CALL GETINDEX(CS,TRANSPOS,I) IF (I .NE. 0) THEN NRES=NRES+1 C CHECK FOR OVERFLOW IF (NRES.LE.NDIM) THEN CSQ(NRES)=CS C OVERFLOW ELSE WRITE(IOP,'(A)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***' WRITE(6,'(A,I9)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ=',MAXSQ GOTO 900 ENDIF C LEGAL RESIDUES ENDIF C NEXT RESIDUE ENDDO C NO TEXT BETWEEN THE LINES ENDIF C NEXT LINE ENDDO C------------------------END EMBL-READING------------------------------ C------------------------READ FROM:GCG-FORMAT-------------------------- Cold500 DO WHILE (.TRUE.) Cold READ (IN,'(A124)',END=999)LINE Cold IF (INDEX(LINE(:2),'ID').NE.0) THEN Cold IF (IOP.NE.0)WRITE (IOP,*)LINE Cold GOTO 510 Cold ENDIF Cold ENDDO 500 DO WHILE (.TRUE.) C GET SEQUENCE WHILE READ (IN,'(A)',END=999)LINE IF (IOP.NE.0)WRITE (IOP,*)LINE IF (INDEX(LINE,'..').NE.0)GOTO 520 ENDDO 520 DO WHILE (.TRUE.) C GET THE SEQUENCES READ (IN,'(A)',ERR=999,END=900)LINE CALL STRPOS(LINE,IBEG,IEND) DO J=1,IEND CS=LINE (J:J) C CHECK FOR LEGAL RESIDUES CALL GETINDEX(CS,TRANSPOS,I) IF (I .NE.0) THEN NRES=NRES+1 C CHECK FOR OVERFLOW IF (NRES.LE.NDIM) THEN CSQ(NRES)=CS ELSE WRITE(IOP,'(A)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***' WRITE(6,'(A,I9)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ=',MAXSQ GOTO 900 ENDIF C LEGAL RESIDUE ENDIF C NEXT RESIDUE ENDDO C NEXT LINE ENDDO C---------------------------READ FROM :UWGCG------------------*BA*BEGIN C HEADER 600 READ (IN,'(A)',END=999)LINE IF (INDEX(LINE,'Check').EQ.0) THEN C THERE IS AN EMPTY LINE READ (IN,'(A)',END=999)LINE ENDIF LINE='*'//LINE(1:len(line)-1) C WRITE HEADER IF (IOP.NE.0)WRITE(IOP,*)LINE DO WHILE (.TRUE.) C GET SEQUENCE WHILE READ (IN,'(A)',END=999)LINE C LOOKING FOR A LINE IF (INDEX(LINE(3:50),'Length').NE.0) THEN C WITH 'LENGHT'AND'CHECK' IF (INDEX(LINE(50:124),'Check').NE.0)GOTO 610 ENDIF ENDDO 610 DO WHILE(.TRUE.) C EMPTY LINE READ (IN,'(A)',END=900,ERR=999)LINE READ (IN,'(A)',END=900,ERR=999)LINE CALL STRPOS(LINE,IBEG,IEND) DO J=9,iend CS=LINE(J:J) IF (CS.EQ.'*') GOTO 900 C CHECK FOR LEGAL RESIDUES CALL GETINDEX(CS,TRANSPOS,I) IF (I .NE. 0) THEN NRES=NRES+1 C CHECK FOR OVERFLOW IF (NRES.LE.NDIM) THEN CSQ(NRES)=CS C OVERFLOW ELSE WRITE(IOP,'(A,I9)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ=',MAXSQ WRITE(6,'(A,I9)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ=',MAXSQ GOTO 900 ENDIF C LEGAL RESIDUES ENDIF C NEXT RESIDUE ENDDO C NEXT LINE ENDDO C C---------------------------READ FROM :HSSP---------------------------- 700 READ(IN,'(A)',END=199)LINE IF (INDEX(LINE,'HOMOLOGY').EQ.0) THEN WRITE(6,*)'***GETAASEQ ERROR: HSSP file assumed, but...' WRITE(6,*)' the word /HOMOLOGY/ is missing in first line' RETURN ENDIF DO WHILE(INDEX(LINE,'NOTATION ').EQ.0) READ(IN,'(A)',END=199)LINE IF (INDEX(LINE,'HEADER').NE.0) THEN IF (IOP.NE.0)WRITE(IOP,*)LINE(12:) ELSE IF (INDEX(LINE,'COMPND').NE.0) THEN IF (IOP.NE.0)WRITE(IOP,*)LINE COMPND=LINE(12:200) ELSE IF (INDEX(LINE,'SEQLENGTH ').NE.0) THEN call read_int_from_string(LINE(12:),iseqlen) ENDIF ENDDO DO WHILE(INDEX(LINE,'## ALIGNMENTS').EQ.0) READ(IN,'(A)',END=199)LINE ENDDO READ(IN,'(A)',END=199)LINE NPICK=0 DO IRES=1,ISEQLEN READ(IN,'(7X,A5,A1,1X,A1,2X,A1,18X,I3)',END=799) + CR(1:5),CH,CS,CC,IACC C Res line counter. Note: NRES = # of res passed NRESLINE=NRESLINE+1 LKEEP=.FALSE. C CONVERT SS-BRIDGES TO 'C'.... IF (INDEX(LOWER,CS).NE.0) CS='C' IF (NRES.LT.NDIM) THEN C incr.chains IF (LCHAINBREAK(CS,NRESLINE)) THEN NCHAIN=NCHAIN+1 ELSE IF (KCHAIN.EQ.NCHAIN)LKEEP=.TRUE. ENDIF C KCHAIN=0 => all chains IF (KCHAIN.EQ.0)LKEEP=.TRUE. C if chains are identified by filename IF (CHAINMODE.EQ.'NUMBER') THEN IF (LCHAIN(NCHAIN)) THEN C if the first chain wanted is not the first chain in DSSP-file, skip C the first position ('!') IF (NPICK.EQ.0) THEN IF (LCHAINBREAK(CS,NRESLINE)) THEN LKEEP=.FALSE. ENDIF ELSE LKEEP=.TRUE. ENDIF NPICK=1 ELSE LKEEP=.FALSE. ENDIF ELSE IF (CHAINMODE.EQ.'CHARACTER') THEN LKEEP=.FALSE. IF (LCHAINBREAK(CS,NRESLINE)) THEN IF (NPICK.EQ.0) THEN LKEEP=.FALSE. ELSE LKEEP=.TRUE. ENDIF ELSE CALL LOWTOUP(CH,1) DO JCHAIN=1,NSELECT IF (CHAINID(JCHAIN).EQ.CH) THEN LKEEP=.TRUE. NPICK=1 ENDIF ENDDO IF (.NOT.LKEEP) THEN IF (CSQ(NRES).EQ.'!')NRES=NRES-1 ENDIF ENDIF ENDIF C keep only the kth chain IF (LKEEP) THEN C CHECK FOR LEGAL RESIDUES CALL GETINDEX(CS,TRANSPOS,I) IF (I .NE. 0) THEN CAUTION: INCREMENT ONLY OF LEGAL AA OR - OR ! NRES=NRES+1 CRESID(NRES)=CR(1:5)//CH CSQ(NRES)=CS KACC(NRES)=IACC CCHAIN=CH STRUC(NRES)=CC C ILLegal RESIDUES OR LEGAL PUNCTATION ENDIF C CHAINS WANTED ENDIF C DIMENSION OVERFLOW ELSE WRITE(IOP,'(A)')'*** ERROR: DIMENSION OVERFLOW MAXSQ ***' WRITE(6,'(A,I9)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ=',MAXSQ GOTO 900 ENDIF C NEXT LINE ENDDO IF (NRESLINE .EQ. ISEQLEN)GOTO 900 C--------------HSSP read error ----------------------------------- 799 WRITE(6,*)'***GETSEQ: incomplete HSSP file ' NRES=0 NCHAIN=0 CALL STRPOS(FILENAME,I,LENFILNAM) WRITE(6,*) 'FILE: ',FILENAME(1:LENFILNAM) CLOSE(IN) RETURN C----------------------READ FILE ERROR--------------------------------- 999 WRITE (*,*)'****GETSEQ:INCOMPLETE FILE ',FILENAME(1:LENFILNAM) NRES=0 CLOSE (IN) RETURN C--------------------------------------------------------------*BA*END C---all formats: -----------FINISHED READING----------------------- 900 CLOSE(IN) IF (LDSSP .OR. LHSSP) THEN LACCZERO=.TRUE. DO I=1,NRES IF (KACC(I).NE.0) THEN LACCZERO=.FALSE. GOTO 910 ENDIF ENDDO 910 IF (LACCZERO) THEN WRITE(6,*)'*******************************************' WRITE(6,*)'* WARNING: all accessibility values are 0 *' WRITE(6,*)'*******************************************' IF (IOP.NE.0) THEN WRITE(IOP,'(A)')'***************************************' WRITE(IOP,'(A)')'* WARNING: accessibility values are 0 *' WRITE(IOP,'(A)')'***************************************' ENDIF ENDIF ENDIF C TRUNCATE IF NEEDED TRUNCATED=(NRES.GE.NDIM) NREAD=NRES NRES=MIN(NDIM,NRES) IF (TRUNCATED) THEN WRITE(6,*)'TRUNCATED TO ',NDIM,' RESIDUES' WRITE(6,*)'**** INCREASE DIMENSION ****' ENDIF C PRINT SEQ AND STRUC IF (IOP.NE.0) then WRITE(IOP,*)'LENGTH ',NRES IF (TRUNCATED)WRITE(IOP,*)'**** TRUNCATED FROM ',NREAD C some machines have problems with list directed I/O !! RS 94 c DO N=0,NRES/100 c N1=1+N*100 c N2=min(nres,100+N*100) c WRITE(IOP,*)(CSQ(I),I=N1,N2) c IF (LDSSP)WRITE(IOP,*)(STRUC(I),I=N1,N2) c ENDDO c WRITE(IOP,*)' ' ENDIF RETURN END C END GETSEQ C...................................................................... C...................................................................... C SUB GETSEQPROF SUBROUTINE GETSEQPROF(CSEQ,TRANS,IRES,NOCC,SEQPROF,MAXRES,MAXAA) C RS 89 C counts frequencies of amino acids C 'B' and 'Z' are assigned as well to the acid as to the amide form C with respect to their occurence in EMBL/SWISSPROT 13.0 IMPLICIT NONE INTEGER IRES,MAXRES,MAXAA CHARACTER*(*) TRANS,CSEQ INTEGER NOCC(*) INTEGER SEQPROF(MAXRES,MAXAA) REAL BTOD,BTON,ZTOE,ZTOQ C INTEGER I,J C================ BTOD=0.521 BTON=0.439 ZTOE=0.623 ZTOQ=0.41 C lower case character CALL LOWTOUP(CSEQ,1) IF (INDEX('BZ',CSEQ).EQ.0) THEN I=INDEX(TRANS(1:MAXAA),CSEQ) IF (I.EQ.0 .OR. I .GT. MAXAA) THEN WRITE(6,*)' GETSEQPROF: unknown residue symbol: ',cseq RETURN ELSE SEQPROF(IRES,I)=SEQPROF(IRES,I)+1 NOCC(IRES)=NOCC(IRES)+1 ENDIF ELSE IF (CSEQ.EQ.'B') THEN CD WRITE(6,*)' GETSEQPROF: convert B' I=INDEX(TRANS,'D') J=INDEX(TRANS,'N') SEQPROF(IRES,I)=NINT( SEQPROF(IRES,I)+BTOD) SEQPROF(IRES,J)=NINT( SEQPROF(IRES,J)+BTON) NOCC(IRES)=NOCC(IRES)+1 ELSE IF (CSEQ.EQ.'Z') THEN CD WRITE(6,*)' GETSEQPROF: convert Z' I=INDEX(TRANS,'E') J=INDEX(TRANS,'Q') SEQPROF(IRES,I)=NINT(SEQPROF(IRES,I)+ZTOE) SEQPROF(IRES,J)=NINT(SEQPROF(IRES,J)+ZTOQ) NOCC(IRES)=NOCC(IRES)+1 ENDIF RETURN END C END GETSEQPROF C...................................................................... C...................................................................... C SUB GETSIMMETRIC SUBROUTINE GETSIMMETRIC(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2,NIOSTATES_2, + CSTRSTATES,CIOSTATES, + IORANGE,KSIM,SIMFILE,SIMMETRIC) IMPLICIT NONE C import INTEGER NTRANS CHARACTER*(*) TRANS INTEGER MAXSTRSTATES,MAXIOSTATES INTEGER KSIM CHARACTER*(*) SIMFILE c export INTEGER NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2,NIOSTATES_2 REAL IORANGE(MAXSTRSTATES,MAXIOSTATES) CHARACTER*(*) CSTRSTATES,CIOSTATES REAL SIMMETRIC(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) c internal INTEGER I,J,K,L,I1,I2,J1,J2,ITRANS,IBEG,IEND INTEGER NSTR,NIO,ISTR1,IO1,ISTR2,IO2 INTEGER MATRIXPOS CHARACTER CSTR,CIO,LINE*250 CHARACTER*250 TESTSTRING CHARACTER*30 CTRANS LOGICAL LERROR C====================================================================== C init C====================================================================== MATRIXPOS=22 I= (NTRANS * NTRANS) * (MAXSTRSTATES * MAXSTRSTATES) * + (MAXIOSTATES * MAXIOSTATES) CALL INIT_REAL_ARRAY(1,I,SIMMETRIC,0.0) C accessibility cut to 200% = take all I= MAXSTRSTATES * MAXIOSTATES CALL INIT_REAL_ARRAY(1,I,IORANGE,200.0) TESTSTRING=' ' LINE=' ' CSTRSTATES=' ' CIOSTATES=' ' ITRANS=0 NSTRSTATES_1=1 NIOSTATES_1=1 NSTRSTATES_2=1 NIOSTATES_2=1 NSTR=0 NIO=0 c----------------------------------------------------------------------- TESTSTRING='AA STR I/O V L I M '// + 'F W Y G A P S T C '// + 'H R K Q E N D B Z' WRITE(6,'(a,a)')' GETSIMMATRIX open metric: ',simfile(1:50) CALL OPEN_FILE(KSIM,SIMFILE,'READONLY,OLD',LERROR) IF (LERROR)GOTO 99 C---------------------------------------------------------------------- DO WHILE(INDEX(LINE,TESTSTRING).EQ.0) READ(KSIM,'(A)',END=99)LINE IF (INDEX(LINE,'STRUCTURE-STATES_1:') .NE. 0) THEN I=INDEX(LINE,':')+1 CALL STRPOS(LINE,IBEG,IEND) CALL READ_INT_FROM_STRING(LINE(I:IEND),NSTRSTATES_1) ELSE IF (INDEX(LINE,'STRUCTURE-STATES_2:') .NE. 0) THEN I=INDEX(LINE,':')+1 CALL STRPOS(LINE,IBEG,IEND) CALL READ_INT_FROM_STRING(LINE(I:IEND),NSTRSTATES_2) ELSE IF (INDEX(LINE,'I/O-STATES_1:') .NE. 0) THEN I=INDEX(LINE,':')+1 CALL STRPOS(LINE,IBEG,IEND) CALL READ_INT_FROM_STRING(LINE(I:IEND),NIOSTATES_1) ELSE IF (INDEX(LINE,'I/O-STATES_2:') .NE. 0) THEN I=INDEX(LINE,':')+1 CALL STRPOS(LINE,IBEG,IEND) CALL READ_INT_FROM_STRING(LINE(I:IEND),NIOSTATES_2) ELSE IF (INDEX(LINE,'DSSP-STRUCTURE') .NE. 0) THEN DO I=1,NSTRSTATES_1 DO J=1,NIOSTATES_1 READ(KSIM,'(A)')LINE READ(LINE,'(4X,A1,13X,A1)')CSTR,CIO K=INDEX(CSTRSTATES,CSTR) IF (K.EQ.0) THEN NSTR=NSTR+1 K=NSTR IF (NSTR .GT. MAXSTRSTATES) THEN WRITE(6,*)'*** ERROR: struct-states overflow' STOP ENDIF CALL STRPOS(CSTRSTATES,IBEG,IEND) IF (IEND+1 .GT. LEN(CSTRSTATES)) THEN WRITE(6,*) + '*** ERROR: CSTRSTATES string too short' STOP ENDIF WRITE(CSTRSTATES(IEND+1:IEND+1),'(A1)')CSTR ENDIF L=INDEX(CIOSTATES,CIO) IF (L.EQ.0) THEN NIO=NIO+1 L=NIO IF (NIO .GT. MAXIOSTATES) THEN WRITE(6,*)'*** ERROR: I/O-states overflow' STOP ENDIF CALL STRPOS(CIOSTATES,IBEG,IEND) IF (IEND+1 .GT. LEN(CSTRSTATES)) THEN WRITE(6,*) + '*** ERROR: CIOSTATES string too short' STOP ENDIF WRITE(CIOSTATES(IEND+1:IEND+1),'(A1)')CIO ENDIF READ(LINE,'(26X,F3.0)')IORANGE(K,L) ENDDO ENDDO ENDIF ENDDO C---------------------------------------------------------------------- WRITE(6,*)' STRUCTURE-STATES_1: ',cstrstates,nstrstates_1 WRITE(6,*)' I/O-STATES_1 : ',ciostates,niostates_1 WRITE(6,*)' STRUCTURE-STATES_2: ',cstrstates,nstrstates_2 WRITE(6,*)' I/O-STATES_2 : ',ciostates,niostates_2 IF (NSTRSTATES_1 .EQ. 1)NSTR=1 IF (NIOSTATES_1 .EQ. 1)NIO=1 IF (NSTR .NE. NSTRSTATES_1 .OR. NIO .NE. NIOSTATES_1 ) THEN WRITE(6,*)'*** ERROR: number of structure-states .ne. NSTR' WRITE(6,*)' OR number of I/O-states .ne. NIO' STOP ENDIF C---------------------------------------------------------------------- DO WHILE(.TRUE.) ITRANS=ITRANS+1 DO ISTR1=1,NSTRSTATES_1 DO IO1=1,NIOSTATES_1 DO ISTR2=1,NSTRSTATES_2 DO IO2=1,NIOSTATES_2 READ(KSIM,'(A)',END=11)LINE I1=INDEX(CSTRSTATES,LINE(5:5)) J1=INDEX(CIOSTATES,LINE(8:8)) I2=INDEX(CSTRSTATES,LINE(6:6)) J2=INDEX(CIOSTATES,LINE(9:9)) IF (I1.EQ.0.OR.I2.EQ.0.OR.J1.EQ.0.OR.J2.EQ.0) THEN IF (I1.EQ.0)I1=1 IF (J1.EQ.0)J1=1 IF (I2.EQ.0)I2=1 IF (J2.EQ.0)J2=1 ENDIF READ(LINE,'(1X,A1,7X,22(1X,F5.2))') + CTRANS(ITRANS:ITRANS), + (SIMMETRIC(ITRANS,K,I1,J1,I2,J2), + K=1,MATRIXPOS) ENDDO ENDDO ENDDO ENDDO ENDDO 11 CLOSE(KSIM) ITRANS=ITRANS-1 C======================================================================= C reset value for chain breaks etc... C add 'X' ITRANS=ITRANS+1 CTRANS(ITRANS:ITRANS)='X' I=INDEX(TRANS,'X') DO J=1,NTRANS DO ISTR1=1,NSTRSTATES_1 DO IO1=1,NIOSTATES_1 DO ISTR2=1,NSTRSTATES_2 DO IO2=1,NIOSTATES_2 SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO c add '!' ITRANS=ITRANS+1 CTRANS(ITRANS:ITRANS)='!' I=INDEX(TRANS,'!') DO J=1,NTRANS DO ISTR1=1,NSTRSTATES_1 DO IO1=1,NIOSTATES_1 DO ISTR2=1,NSTRSTATES_2 DO IO2=1,NIOSTATES_2 SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO c add '-' ITRANS=ITRANS+1 CTRANS(ITRANS:ITRANS)='-' I=INDEX(TRANS,'-') DO J=1,NTRANS DO ISTR1=1,NSTRSTATES_1 DO IO1=1,NIOSTATES_1 DO ISTR2=1,NSTRSTATES_2 DO IO2=1,NIOSTATES_2 SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO c add '.' ITRANS=ITRANS+1 CTRANS(ITRANS:ITRANS)='.' I=INDEX(TRANS,'.') DO J=1,NTRANS DO ISTR1=1,NSTRSTATES_1 DO IO1=1,NIOSTATES_1 DO ISTR2=1,NSTRSTATES_2 DO IO2=1,NIOSTATES_2 SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO C---------------------------------------------------------------------- C check input order of amino acids C======================================================================= IF (TRANS(1:NTRANS) .NE. CTRANS(1:ITRANS)) THEN WRITE(6,*)' *** ERROR: CTRANS from metric-file and TRANS'// + ' are not the same' WRITE(6,*)'GETSIMMATRIX: ',ctrans,itrans WRITE(6,*)'GETSIMMATRIX: ',trans,ntrans STOP ENDIF C======================================================================= C debug C======================================================================= c do istr1=1,nstrstates_1 c do io1=1,niostates_1 c do istr2=1,nstrstates_2 c do io2=1,niostates_2 c WRITE(6,*)(simmetric(1,j,istr1,io1,istr2,io2),j=1,26) c enddo c enddo c enddo c enddo C======================================================================= RETURN C======================================================================= C unknown metric or read error C======================================================================= 99 CLOSE(KSIM) WRITE(6,'(a)') + '** ERROR reading metric in GETSIMMATRIX **' STOP END C END GETSIMMETRIC C...................................................................... C...................................................................... C SUB GETSWISSBASE SUBROUTINE GETSWISSBASE (KUNIT,MAXRES,KLOG,NRES,CSEQ,NAME, + COMPND,ACCESSION,CPDBREF,LENDFILE) c implicit none INTEGER KUNIT,MAXRES,KLOG,NRES CHARACTER*(*) CSEQ,NAME,COMPND,ACCESSION,CPDBREF LOGICAL LENDFILE CHARACTER*500 LOGSTRING c internal INTEGER LINELEN PARAMETER (LINELEN= 200) CHARACTER LINE*(LINELEN) INTEGER NID,ISTART,ISTOP,JSTART,JSTOP,I,J C====================================================================== LENDFILE=.FALSE. NID=0 NRES=0 NAME=' ' COMPND=' ' ACCESSION=' ' CPDBREF=' ' CSEQ=' ' LINE=' ' ISTOP=0 JSTOP=0 C===================================================================== DO WHILE (.TRUE.) READ(KUNIT,'(A)',END=900,ERR=900)LINE C identifier IF ( LINE(1:2) .EQ. 'ID' ) THEN NAME(1:)=LINE(6:17) c accession number ELSE IF ( LINE(1:2) .EQ. 'AC' ) THEN I=INDEX(LINE,';')-1 ACCESSION(1:)=LINE(6:I) c name ELSE IF ( LINE(1:2) .EQ. 'DE' ) THEN COMPND=' ' COMPND(1:200)=LINE(6:) GOTO 410 ENDIF ENDDO c search for sequence 410 READ(KUNIT,'(A)',END=999)LINE IF ( LINE(1:2) .EQ. 'SQ' ) THEN GOTO 420 C STORE LATEST BROOKHAVEN-POINTER IN CPDBREF ELSE IF ( LINE(1:2) .EQ. 'DR' .AND. + INDEX(LINE,'PDB;') .NE. 0) THEN CALL STRPOS(CPDBREF,ISTART,ISTOP) CALL STRPOS(LINE,JSTART,JSTOP) IF (LINE(JSTOP:JSTOP) .EQ. '.')JSTOP=JSTOP-1 IF (ISTOP+JSTOP .LE. LEN(CPDBREF) ) THEN IF (NID .LE. 0) THEN CPDBREF(ISTOP+1:)=LINE(11:JSTOP) ELSE CPDBREF(ISTOP+1:)='|'//LINE(11:JSTOP) ENDIF NID=NID+1 c else c WRITE(6,*)'**** PDBREF-LINE DIMENSION OVERFLOW ***' ENDIF ENDIF GOTO 410 420 IF (NID .GT. 0) THEN CALL STRPOS(CPDBREF,ISTART,ISTOP) IF ( (ISTOP+6) .LE. LEN(CPDBREF) ) THEN WRITE(CPDBREF(ISTOP+1:),'(A,I4)')'||',NID ELSE WRITE(6,*)'**** PDBREF-LINE DIMENSION OVERFLOW ***' ENDIF ENDIF c sequences starts in next line 430 READ(KUNIT,'(A)',ERR=999,END=999) LINE c end of database file reached ? IF ( LINE(1:2) .EQ. '//' ) RETURN c call strpos(line,istart,istop) DO ISTART=LINELEN,1,-1 IF (LINE(ISTART:ISTART).NE.' ') THEN ISTOP=ISTART GOTO 440 ENDIF ENDDO 440 DO J=1,ISTOP IF ( LINE(J:J) .NE. ' ' .AND. NRES+1 .LE. MAXRES) THEN NRES=NRES+1 CSEQ(NRES:NRES)=LINE(J:J) ELSE IF (NRES+1 .GT. MAXRES ) THEN WRITE(6,'(A)')'** DIMENSION OVERFLOW MAXSQ ***' GOTO 910 ENDIF ENDDO GOTO 430 C===================================================================== C END of SWISSPROT reached C===================================================================== 900 LENDFILE=.TRUE. NRES=0 RETURN C===================================================================== C TRUNCATE IF NEEDED C===================================================================== 910 WRITE(LOGSTRING,'(A,I8,A)')'TRUNCATED TO ',MAXRES, + ' RESIDUES: INCREASE DIMENSION ' c call log_file(klog,logstring,1) RETURN C====================================================================== 999 WRITE(LOGSTRING,'(A)') + '*** ERROR READING SWISSPROT, SET NRES=0 AND RETURN' c call log_file(klog,logstring,1) NRES=0 RETURN END C END GETSWISSBASE C...................................................................... C...................................................................... C SUB GET_SWISS_ENTRY SUBROUTINE GET_SWISS_ENTRY(MAXSQ,KUNIT,LBINARY,NRES,NAME, + COMPOUND,ACCESSION,PDBREF,SEQ,LEND) IMPLICIT NONE C IMPORT INTEGER MAXSQ,KUNIT LOGICAL LBINARY,LEND C EXPORT CHARACTER*(*) SEQ,NAME,COMPOUND,ACCESSION,PDBREF INTEGER NRES C INTERNAL INTEGER NSIZE,NSIZE2 PARAMETER (NSIZE= 12) PARAMETER (NSIZE2= 200) C====================================================================== LEND=.FALSE. C===================================================================== IF (LBINARY) THEN READ(KUNIT,END=900,ERR=900)NRES,NAME(1:NSIZE), + ACCESSION(1:NSIZE),PDBREF(1:NSIZE), + COMPOUND(1:NSIZE2) READ(KUNIT,END=900,ERR=999)SEQ(1:NRES) ELSE READ(KUNIT,'(I6,A,A,A,A,A)',END=900,ERR=999)NRES, + NAME(1:NSIZE),ACCESSION(1:NSIZE),PDBREF(1:NSIZE), + COMPOUND(1:NSIZE2),SEQ c read(kunit,'(i6,a,a,a,a,a)',end=900,err=999)nres,name, c + ACCESSION,pdbref, c + compound,seq ENDIF c truncate if needed IF (NRES .GT. MAXSQ) THEN c WRITE(6,*)' SEQ CUT TO MAXSQ: ',nres,MAXSQ NRES=MAXSQ CALL FLUSH_UNIT(6) ENDIF RETURN C====================================================================== 900 LEND=.TRUE. NRES=0 SEQ=' ' NAME=' ' ACCESSION=' ' PDBREF=' ' COMPOUND=' ' RETURN 999 WRITE(6,*)' ERROR in get_swiss_entry ',name,nres CALL FLUSH_UNIT(6) STOP END C END GET_SWISS_ENTRY C...................................................................... C...................................................................... ***** ------------------------------------------------------------------ ***** SUB GETTOKEN ***** ------------------------------------------------------------------ C---- C---- NAME : GETTOKEN C---- ARG : 1 CSTRING(1:LEN) = string of length LEN C---- ARG : 2 LEN = length of string C---- ARG : 3 ITOKEN = number of string to matcho C---- ARG : 4 FIRSTPOS = position where CSTRING matches C---- ARG : the ITOKEN nth STRING (non blank) C---- ARG : 5 CTOKEN = returns ITOKEN nth string that matched C---- ARG : that matched in CSTRING C---- DES : Builds up the ITOKEN nth string in the string CSTRING C---- DES : that is not having any blank. The first position of C---- DES : this string (FIRSTPOS) and the string (CTOKEN) are returned C---- DES : if no match: returns 0 (i.e. never matched) C---- *----------------------------------------------------------------------* C SUB GETTOKEN SUBROUTINE GETTOKEN(CSTRING,LEN,ITOKEN,FIRSTPOS,CTOKEN) IMPLICIT NONE C Import INTEGER LEN,ITOKEN CHARACTER*(*) CSTRING C Export INTEGER FIRSTPOS CHARACTER*(*) CTOKEN C Internal INTEGER IPOS,THISTOKEN,TPOS LOGICAL FINISHED,INSIDE ******------------------------------*-----------------------------****** C---- C---- initialise C---- CTOKEN= ' ' TPOS= 0 FINISHED= .FALSE. IF ( CSTRING(1:1) .EQ. ' ' ) THEN THISTOKEN= 0 INSIDE= .FALSE. ELSE THISTOKEN= 1 INSIDE= .TRUE. FIRSTPOS= 1 IF ( THISTOKEN .EQ. ITOKEN ) THEN TPOS= TPOS + 1 CTOKEN(TPOS:TPOS)= CSTRING(1:1) ENDIF ENDIF C---- C---- loop over string C---- IPOS = 2 DO WHILE ((IPOS .LE. LEN) .AND. (.NOT. FINISHED) ) IF ( CSTRING(IPOS:IPOS) .EQ. ' ' .OR. 1 IPOS .EQ. LEN ) THEN IF ( INSIDE ) THEN INSIDE = .FALSE. IF ( THISTOKEN .EQ. ITOKEN ) FINISHED = .TRUE. ENDIF ELSE IF ( .NOT. INSIDE ) THEN INSIDE = .TRUE. FIRSTPOS = IPOS THISTOKEN = THISTOKEN + 1 ENDIF IF ( THISTOKEN .EQ. ITOKEN ) THEN TPOS = TPOS + 1 CTOKEN(TPOS:TPOS) = CSTRING(IPOS:IPOS) ENDIF ENDIF IPOS = IPOS + 1 ENDDO IF ( .NOT. FINISHED ) FIRSTPOS = 0 RETURN END C END GETTOKEN C...................................................................... C...................................................................... C SUB HSSPHEADER SUBROUTINE HSSPHEADER(KHSSP,HSSPFILE,HSSPLINE,BRKID,CDATE, + DATABASE,CPARAMETER,NPARALINE,ISOSIGFILE,ISAFE,LFORMULA, + HEAD,COMP,SOURCE,AUTHOR,LRES,NCHAIN,KCHAIN,CHAINREMARK, + NALIGN) IMPLICIT NONE C import CHARACTER*(*) HSSPLINE,DATABASE,CPARAMETER(*), + HSSPFILE,ISOSIGFILE,CDATE INTEGER KHSSP,NPARALINE,ISAFE,LRES,NCHAIN,KCHAIN, + NALIGN LOGICAL LFORMULA,LERROR C Attributes of DSSP-file CHARACTER*(*) HEAD,COMP,SOURCE,AUTHOR,BRKID,CHAINREMARK C internal INTEGER I,LEN,ISTART,ISTOP C---- -------------------------------------------------- C---- open HSSP-file and write header C---- -------------------------------------------------- CALL OPEN_FILE(KHSSP,HSSPFILE,'NEW',LERROR) CALL STRPOS(HSSPLINE,ISTART,ISTOP) WRITE(KHSSP,'(A)')HSSPLINE(1:ISTOP) CALL STRPOS(BRKID,ISTART,ISTOP) WRITE(KHSSP,'(A,A)')'PDBID ',BRKID(ISTART:ISTOP) WRITE(KHSSP,'(A,A)')'DATE file generated on ',CDATE CALL STRPOS(DATABASE,ISTART,ISTOP) WRITE(KHSSP,'(A)')DATABASE(1:ISTOP) DO I=1,NPARALINE CALL STRPOS(CPARAMETER(I),ISTART,ISTOP) WRITE(KHSSP,'(A,A)')'PARAMETER ',CPARAMETER(I)(ISTART:ISTOP) ENDDO C---- which formula used for filtering? IF (LFORMULA) THEN IF (ISAFE.EQ.0) THEN WRITE(KHSSP,'(A,A)')'THRESHOLD ', + ' according to: t(L)=290.15 * L ** -0.562' ELSE IF (ISAFE.GT.0) THEN WRITE(KHSSP,'(A,A,I3)')'THRESHOLD ', + ' according to: t(L)=(290.15 * L ** -0.562) +',isafe ELSE IF (ISAFE.LT.0) THEN WRITE(KHSSP,'(A,A,I3)')'THRESHOLD ', + ' according to: t(L)=(290.15 * L ** -0.562) ',isafe ENDIF C---- no FORMULA filtering ELSE CALL STRPOS(ISOSIGFILE,ISTART,ISTOP) WRITE(KHSSP,'(A,A)')'THRESHOLD according to: ', + ISOSIGFILE(ISTART:ISTOP) ENDIF WRITE(KHSSP,'(A,A)')'REFERENCE ',' Sander C., Schneider R.'// + ' : Database of homology-derived protein structures.'// + ' Proteins, 9:56-68 (1991).' CALL STRPOS(HEAD,ISTART,ISTOP) WRITE(KHSSP,'(A,A)') 'HEADER ',HEAD(1:ISTOP) CALL STRPOS(COMP,ISTART,ISTOP) WRITE(KHSSP,'(A,A)') 'COMPND ',COMP(1:ISTOP) CALL STRPOS(SOURCE,ISTART,ISTOP) WRITE(KHSSP,'(A,A)') 'SOURCE ',SOURCE(1:ISTOP) CALL STRPOS(AUTHOR,ISTART,ISTOP) WRITE(KHSSP,'(A,A)') 'AUTHOR ',AUTHOR(1:ISTOP) WRITE(KHSSP,'(A,I4)') 'SEQLENGTH ',LRES CALL STRPOS(BRKID,ISTART,ISTOP) WRITE(KHSSP,'(A,I4,A,A,A)')'NCHAIN ',NCHAIN, + ' chain(s) in ',brkid(istart:istop),' data set' c WRITE(6,*)'chainremark: ',chainremark IF (CHAINREMARK .NE. ' ') THEN CALL STRPOS(CHAINREMARK,ISTART,ISTOP) WRITE(KHSSP,'(A,I4,A,A)')'KCHAIN ',KCHAIN, + ' chain(s) used here ; chain(s) : ',chainremark(1:istop) ENDIF WRITE(KHSSP,'(A,I4)') 'NALIGN ',NALIGN C---- C---- NOTATION part C---- WRITE(KHSSP,'(A)')'NOTATION : ID: EMBL/SWISSPROT identifier'// + ' of the aligned (homologous) protein' WRITE(KHSSP,'(A)')'NOTATION : STRID: if the 3-D structure of'// + ' the aligned protein is known, then STRID is the Protein'// + ' Data Bank identifier as taken' WRITE(KHSSP,'(A)')'NOTATION : from the database'// + ' reference or DR-line of the EMBL/SWISSPROT entry' WRITE(KHSSP,'(A)')'NOTATION : %IDE: percentage of residue'// + ' identity of the alignment' WRITE(KHSSP,'(A)')'NOTATION : %SIM (%WSIM): '// + ' (weighted) similarity of the alignment' WRITE(KHSSP,'(A)')'NOTATION : IFIR/ILAS: first and last resid'// + 'ue of the alignment in the test sequence' WRITE(KHSSP,'(A)')'NOTATION : JFIR/JLAS: first and last resid'// + 'ue of the alignment in the alignend protein' WRITE(KHSSP,'(A)')'NOTATION : LALI: length of the alignment'// + ' excluding insertions and deletions' WRITE(KHSSP,'(A)')'NOTATION : NGAP: number of insertions'// + ' and deletions in the alignment' WRITE(KHSSP,'(A)')'NOTATION : LGAP: total length of all'// + ' insertions and deletions' WRITE(KHSSP,'(A)')'NOTATION : LSEQ2: length of the entire'// + ' sequence of the aligned protein' WRITE(KHSSP,'(A)')'NOTATION : ACCESSION: SwissProt accession'// + ' number' WRITE(KHSSP,'(A)')'NOTATION : PROTEIN: one-line description'// + ' of aligned protein' WRITE(KHSSP,'(A)')'NOTATION : SeqNo,PDBNo,AA,STRUCTURE,BP1,'// + 'BP2,ACC: sequential and PDB residue numbers, amino acid '// + '(lower case = Cys), secondary' WRITE(KHSSP,'(A)')'NOTATION : structure, bridge '// + 'partners, solvent exposure as in DSSP (Kabsch and Sander,'// + ' Biopolymers 22, 2577-2637(1983)' WRITE(KHSSP,'(A)')'NOTATION : VAR: sequence variability on'// + ' a scale of 0-100 as derived from the NALIGN alignments' WRITE(KHSSP,'(A)')'NOTATION : pair of lower case characters'// + ' (AvaK) in the alignend sequence bracket a point of'// + ' INSERTION IN THIS sequence' WRITE(KHSSP,'(A)')'NOTATION : dots (....) in the alignend'// + ' SEQUENCE INDICATE POINTS of deletion in this sequence' WRITE(KHSSP,'(A)')'NOTATION : SEQUENCE PROFILE: relative '// + 'frequency of an amino acid type at each position. Asx'// + ' and Glx are in their' WRITE(KHSSP,'(A)')'NOTATION : acid/amide'// + ' form in proportion to their database frequencies' WRITE(KHSSP,'(A)')'NOTATION : NOCC: number of aligned sequenc'// + 'es spanning this position (including the test sequence)' WRITE(KHSSP,'(A)')'NOTATION : NDEL: number of sequences with'// + ' a deletion in the test protein at this position' WRITE(KHSSP,'(A)')'NOTATION : NINS: number of sequences with'// + ' an insertion in the test protein at this position' WRITE(KHSSP,'(A)')'NOTATION : ENTROPY: entropy measure of'// + ' sequence variability at this position' WRITE(KHSSP,'(A)')'NOTATION : RELENT: relative entropy, i.e. '// + ' entropy normalized to the range 0-100' WRITE(KHSSP,'(a)')'NOTATION : WEIGHT: conservation weight' WRITE(KHSSP,*) RETURN END C END HSSPHEADER C...................................................................... C...................................................................... C SUB INIT_CHAR_ARRAY SUBROUTINE INIT_CHAR_ARRAY(IBEG,IEND,CARRAY,SYMBOL) IMPLICIT NONE INTEGER I,IBEG,IEND CHARACTER*(*) CARRAY DIMENSION CARRAY(IBEG:IEND) CHARACTER*(*) SYMBOL DO I=IBEG,IEND CARRAY(I)=SYMBOL ENDDO RETURN END C END INIT_CHAR_ARRAY C...................................................................... C...................................................................... C SUB INIT_REAL_ARRAY SUBROUTINE INIT_REAL_ARRAY(IBEG,IEND,ARRAY,VALUE) IMPLICIT NONE INTEGER I,IBEG,IEND REAL ARRAY DIMENSION ARRAY(IBEG:IEND) REAL VALUE DO I=IBEG,IEND ARRAY(I)=VALUE ENDDO RETURN END C END INIT_REAL_ARRAY C...................................................................... C...................................................................... C SUB INIT_INT_ARRAY SUBROUTINE INIT_INT_ARRAY(IBEG,IEND,ARRAY,VALUE) IMPLICIT NONE INTEGER I,IBEG,IEND INTEGER ARRAY DIMENSION ARRAY(IBEG:IEND) INTEGER VALUE DO I=IBEG,IEND ARRAY(I)=VALUE ENDDO RETURN END C END INIT_INT_ARRAY C...................................................................... C...................................................................... C SUB INIT_INT2_ARRAY SUBROUTINE INIT_INT2_ARRAY(IBEG,IEND,ARRAY,VALUE) IMPLICIT NONE INTEGER I,IBEG,IEND INTEGER*2 ARRAY DIMENSION ARRAY(IBEG:IEND) INTEGER VALUE DO I=IBEG,IEND ARRAY(I)=VALUE ENDDO RETURN END C END INIT_INT2_ARRAY C...................................................................... C...................................................................... C SUB INT_TO_SEQ SUBROUTINE INT_TO_SEQ(LSEQ,SEQ,NRES,CTRANS,INDELMARK,ENDMARK) C reverse SEQ_TO_INTEGER C DSSP SS bridges (lower case) are lost (converted to 'C' from seqtoint) C converts amino acid integers to string of amino acid characters C uses translation table CHARACTER CTRANS IMPLICIT NONE C import INTEGER NRES,LSEQ(*) INTEGER INDELMARK,ENDMARK CHARACTER*(*) CTRANS c export CHARACTER*1 SEQ(*) C internal INTEGER I C DO I=1,NRES IF (LSEQ(I) .EQ. 0) THEN WRITE(6,*)'** unknown res or chain separator in INT_TO_SEQ' ENDIF IF (LSEQ(I) .EQ. INDELMARK) THEN SEQ(I)='.' ELSE IF (LSEQ(I) .EQ. ENDMARK) THEN SEQ(I)='<' ELSE SEQ(I)=CTRANS(LSEQ(I):LSEQ(I)) ENDIF ENDDO RETURN END C END INT_TO_SEQ C...................................................................... C...................................................................... C INT_TO_STRCLASS SUBROUTINE INT_TO_STRCLASS(MAXSTRSTATES,MAXALSQ,NRES,LSTRUC, + STR_CLASSES,INDELMARK,ENDMARK,STRUC) IMPLICIT NONE INTEGER MAXSTRSTATES,MAXALSQ INTEGER NRES,LSTRUC(*),INDELMARK,ENDMARK C---- br 99.03: watch hard_coded here, see maxhom.param CHARACTER*10 STR_CLASSES(MAXSTRSTATES) C---- --> REASON: the following produces warnings on SGI C CHARACTER*(*) STR_CLASSES(MAXSTRSTATES) CHARACTER STRUC(MAXALSQ) c internal INTEGER I c======================================================================= DO I=1,NRES IF (LSTRUC(I) .EQ. INDELMARK) THEN STRUC(I)='.' ELSE IF (LSTRUC(I) .EQ. ENDMARK) THEN STRUC(I)='<' ELSE STRUC(I)=STR_CLASSES(LSTRUC(I))(1:1) ENDIF ENDDO RETURN END C END INT_TO_STRCLASS C...................................................................... C...................................................................... C SUB INTERPRET_LINE SUBROUTINE INTERPRET_LINE(LINE,MAXFIELD, + MACROLINE, CFIELD, CSTRING, CALFANUMERIC, + CALFAMIXED,CWORD,NFIELD,NSTRING,NALFANUMERIC, + NNUMBER, NREAL, NINTEGER,NPOSITIVE, NNEGATIVE, + NWORD, NALFAMIXED,IINTEGER,IPOSITIVE, + INEGATIVE,XNUMBER, XREAL,IFIELD_POS) IMPLICIT NONE c INCLUDE 'interpret_line' C input CHARACTER*(*) LINE INTEGER MAXFIELD CHARACTER*(*) MACROLINE, CFIELD(*), + CSTRING(*), CALFANUMERIC(*), + CALFAMIXED(*), CWORD(*) INTEGER NFIELD,NSTRING,NALFANUMERIC,NNUMBER,NREAL,NINTEGER INTEGER NPOSITIVE, NNEGATIVE, NWORD, NALFAMIXED INTEGER IINTEGER(*),IPOSITIVE(*), + INEGATIVE(*) REAL XNUMBER(*), XREAL(*) C POINTERS TO BEG AND END OF EACH FIELD INTEGER IFIELD_POS(2,*) C LOCAL LOGICAL LALFANUMERIC,LNUMBER,LREAL,LWORD INTEGER ID,I,ISTARTLINE,IENDLINE,IBEG,IEND LOGICAL LCONTINUE C interprets an input line C-------example---------------------------------------------------- C input: CA Q 110 CB W -203 5.5 C output: MACROLINE='LLCLLNR' C NFIELD=7 ; NSTRING=0 ; NALFANUMERIC=7 ; NNUMBER=3 ; NWORD=4 C NALFAMIXED=0 ; NREAL=1 ; NINTEGER=2 ; NPOSITIVE=1 ; NNEGATIVE=1 C CFIELD(2)='Q' ; XNUMBER(2)=-203. ; CSTRING(3)='CB' C IINTEGER(2)=-203 etc. C-----------hierarchy of field types-------- = macroline symbol---- C like 4PTI.COOR String = contains non-alfanumeric, like @#$%^&* C like CA5 Alfamixed = mixed letters and numbers C like TRP Letters only = word C like -.5E+5 Real number C

like 16 positive integer C like -16 Negative integer or 0 C C field = (alfanumeric,other ASCII) (filterted by ASCII-filter) C alfanumeric = (number, word, alfa-mixed) C number = (integer,real) C integer = (positive, negative) C C STRING C ALFANUMERIC C NUMBER C REAL C INTEGER C POSITIVE C NEGATIVE C WORD C ALFAMIXED C C macrosymbol is designed such that the whole world partitions into C S A L R P N, i.e. macrosymbol of a field is the lowest valid type C C---------------------------------------------------------------------- C step0: preliminaries NFIELD=0 MACROLINE=' ' NSTRING=0 NALFANUMERIC=0 NNUMBER=0 NREAL=0 NINTEGER=0 NPOSITIVE=0 NNEGATIVE=0 NWORD=0 NALFAMIXED=0 DO ID=1,MAXFIELD CFIELD(ID)=' ' CSTRING(ID)=' ' CALFANUMERIC(ID)=' ' XNUMBER(ID)=0.0 XREAL(ID)=0.0 IINTEGER(ID)=0 IPOSITIVE(ID)=0 INEGATIVE(ID)=0 CWORD(ID)=' ' CALFAMIXED(ID)=' ' DO I=1,2 IFIELD_POS(I,ID)=0 ENDDO ENDDO CALL ASCIIFILTER(LINE) C step1: find beg and end of each field CALL STRPOS(LINE,ISTARTLINE,IENDLINE) NFIELD=1 IFIELD_POS(1,NFIELD)=ISTARTLINE DO I=ISTARTLINE,IENDLINE-1 C " x" starts field IF (LINE(I:I) .EQ. ' ' .AND. LINE(I+1:I+1) .NE. ' ') THEN NFIELD=NFIELD+1 IF ( NFIELD .GT. MAXFIELD) THEN WRITE(6,*)'*** ERROR IN INTERPRETLINE: MAXFIELD OVERFLOW' NFIELD=MAXFIELD ENDIF IFIELD_POS(1,NFIELD)=I+1 ENDIF C "x " ends field IF (LINE(I:I) .NE. ' ' .AND. LINE(I+1:I+1) .EQ. ' ') THEN IFIELD_POS(2,NFIELD)=I ENDIF ENDDO IFIELD_POS(2,NFIELD)=IENDLINE C step3: process each field C----------------------------------------------------------------------- C NSTRING C NALFANUMERIC C NNUMBER C NREAL C NINTEGER C NPOSITIVE C NNEGATIVE C NWORD C NALFAMIXED C----------------------------------------------------------------------- LCONTINUE=.TRUE. DO ID=1,NFIELD C BR 2007/08/22 avoid GOTO IF (LCONTINUE) THEN C step 3.1: extract string i CFIELD(ID)=LINE(IFIELD_POS(1,ID):IFIELD_POS(2,ID)) CALL STRPOS(CFIELD(ID),IBEG,IEND) C step 3.2: determine type of field, store field, store macrosymbol C .not. lalfanumeric CALL IS_ALFANUMERIC(CFIELD(ID),LALFANUMERIC) IF (.NOT. LALFANUMERIC) THEN NSTRING=NSTRING+1 CSTRING(NSTRING)=CFIELD(ID) MACROLINE(ID:ID)='S' ELSE C lnumber / lword / mixed NALFANUMERIC=NALFANUMERIC+1 CALFANUMERIC(NALFANUMERIC)=CFIELD(ID) CALL IS_NUMBER(CFIELD(ID),LNUMBER) IF (LNUMBER) THEN NNUMBER=NNUMBER+1 CALL IS_REAL(CFIELD(ID),LREAL) C real / integer IF (LREAL) THEN NREAL=NREAL+1 CALL READ_REAL(CFIELD(ID),XREAL(NREAL)) XNUMBER(NNUMBER)=XREAL(NREAL) MACROLINE(ID:ID)='R' ELSE CALL READ_REAL_FROM_STRING(CFIELD(ID)(IBEG:IEND), + XNUMBER(NNUMBER) ) NINTEGER=NINTEGER+1 CALL READ_INT_FROM_STRING(CFIELD(ID)(IBEG:IEND), + IINTEGER(NINTEGER) ) IF (IINTEGER(NINTEGER).GT.0) THEN NPOSITIVE=NPOSITIVE+1 IPOSITIVE(NPOSITIVE)=IINTEGER(NINTEGER) MACROLINE(ID:ID)='P' ELSE NNEGATIVE=NNEGATIVE +1 INEGATIVE(NNEGATIVE )=IINTEGER(NINTEGER) MACROLINE(ID:ID)='N' ENDIF ENDIF ELSE CALL IS_WORD(CFIELD(ID),LWORD) IF (LWORD) THEN NWORD=NWORD+1 CWORD(NWORD)=CFIELD(ID) MACROLINE(ID:ID)='L' ELSE NALFAMIXED=NALFAMIXED+1 CALFAMIXED(NALFAMIXED)=CFIELD(ID) MACROLINE(ID:ID)='A' ENDIF ENDIF ENDIF LCONTINUE=.FALSE. ENDIF ENDDO RETURN END C END INTERPRET_LINE C...................................................................... C...................................................................... C SUB INTTOSTR SUBROUTINE INTTOSTR(NRES,LSTR,CSTR,LDSSP) IMPLICIT NONE INTEGER NRES,LSTR(*) CHARACTER CSTR(*) LOGICAL LDSSP C internal INTEGER I CHARACTER*25 STRSTATES STRSTATES=' LTCSltcsEBAPMebapmHGIhgi' IF (LDSSP) THEN DO I=1,NRES IF (LSTR(I) .EQ. 99) THEN CSTR(I)='.' ELSE IF (LSTR(I) .EQ. 999) THEN CSTR(I)='<' ELSE CSTR(I)=STRSTATES(LSTR(I):LSTR(I)) ENDIF ENDDO ELSE DO I=1,NRES CSTR(I)='U' ENDDO ENDIF RETURN END C END INTTOSTR C...................................................................... C...................................................................... C SUB LEFTADJUST(STRING,NDIM,NLEN) SUBROUTINE LEFTADJUST(STRING,NDIM,NLEN) C...left-adjust of a string IMPLICIT NONE CHARACTER*(*) STRING INTEGER NDIM, NLEN, l,il C...find position of first non-blank IF (NDIM .LT. 1 .OR. NLEN .LT. 1) RETURN IF (NDIM .gt. 1)STOP' update routine leftadjust' L=1 DO WHILE(STRING(L:L) .EQ. ' ' .AND. L .LT. NLEN) L=L+1 ENDDO IF (L .GT. 1) THEN C..L is position of first non-blank STRING(1:NLEN-L+1)=STRING(L:NLEN) C.C..fill rest with blanks up to NLEN DO IL=NLEN-L+2,NLEN STRING(IL:IL)=' ' ENDDO ENDIF c DO I=1,NDIM c L=1 c DO WHILE(STRINGS(I)(L:L).EQ.' '.AND.L.LT.NLEN) c L=L+1 c ENDDO c IF (L.GT.1) THEN C..L is position of first non-blank c STRINGS(I)(1:NLEN-L+1)=STRINGS(I)(L:NLEN) C.C..fill rest with blanks up to NLEN c DO IL=NLEN-L+2,NLEN c STRINGS(I)(IL:IL)=' ' c ENDDO c ENDIF c ENDDO RETURN END C END LEFTADJUST C...................................................................... C...................................................................... C SUB IS_INTEGER SUBROUTINE IS_INTEGER(STRING,LINTEGER) C LINTEGER = .TRUE. if first field of STRING is an INTEGER C LINTEGER = first non-blank byte is + or - or a digit, .AND. C all subsequent byte are digits, until blank. IMPLICIT NONE C import CHARACTER*(*) STRING C export LOGICAL LINTEGER C local CHARACTER DIGITS*10, SIGNED*12 INTEGER IBEG,IEND,K SIGNED='+-0123456789' DIGITS='0123456789' LINTEGER=.TRUE. CALL STRPOS(STRING,IBEG,IEND) K=IBEG IF (INDEX(SIGNED,STRING(K:K)).EQ.0) THEN LINTEGER=.FALSE. RETURN ENDIF K=K+1 DO WHILE( K .LE. IEND) IF (INDEX(DIGITS,STRING(K:K)).EQ.0) THEN LINTEGER=.FALSE. RETURN ENDIF K=K+1 ENDDO RETURN END C END IS_INTEGER C...................................................................... C...................................................................... C SUB IS_REAL SUBROUTINE IS_REAL(STRING,LREAL) C LREAL = .TRUE. if STRING is a real number C LREAL = integer / . / integer / E or e / integer C import IMPLICIT NONE CHARACTER*(*) STRING C export LOGICAL LREAL C local CHARACTER*15 REALSYMBOL LOGICAL LINTEGER INTEGER IBEG,IEND,K,IEXP,JEXP,IPOS,IDOT C REALSYMBOL='0123456789.-+Ee' LREAL=.TRUE. C not just an integer CALL IS_INTEGER(STRING,LINTEGER) IF (LINTEGER) THEN LREAL=.FALSE. RETURN ENDIF CALL STRPOS(STRING,IBEG,IEND) DO K=IBEG,IEND IF (INDEX(REALSYMBOL,STRING(K:K)).EQ. 0) THEN LREAL=.FALSE. RETURN ENDIF ENDDO C LREAL = integer / . / integer / E or e / integer IDOT=INDEX(STRING,'.') C we want one '.' IF (IDOT .EQ. 0) THEN LREAL=.FALSE. RETURN ENDIF C the part before the '.' must be an integer IF (IDOT .NE. 1) THEN CALL IS_INTEGER(STRING(1:IDOT-1),LINTEGER) ELSE C means: .345 LINTEGER=.TRUE. ENDIF IF (LINTEGER) THEN IEXP=INDEX(STRING(IDOT+1:),'E') JEXP=INDEX(STRING(IDOT+1:),'e') C if no exponent is specified only an integer after the '.' is allowed IF (IEXP .EQ.0 .AND. JEXP .EQ. 0) THEN CALL IS_INTEGER(STRING(IDOT+1:),LINTEGER) IF (.NOT. LINTEGER) THEN LREAL=.FALSE. RETURN ENDIF C exponent must be an integer ELSE IPOS=MAX(IEXP,JEXP) CALL IS_INTEGER(STRING(IDOT+1+IPOS:),LINTEGER) IF (.NOT. LINTEGER) THEN LREAL=.FALSE. RETURN ENDIF ENDIF ENDIF RETURN END C END IS_REAL C...................................................................... C...................................................................... C SUB IS_NUMBER SUBROUTINE IS_NUMBER(STRING,LNUMBER) C LNUMBER=.TRUE. if STRING is a real or integer CHARACTER*(*) STRING LOGICAL LNUMBER, LINTEGER, LREAL CALL IS_REAL(STRING,LREAL) CALL IS_INTEGER(STRING,LINTEGER) LNUMBER= LREAL .OR. LINTEGER RETURN END C END IS_NUMBER C...................................................................... C...................................................................... C SUB IS_ALFANUMERIC SUBROUTINE IS_ALFANUMERIC(STRING,LALFANUMERIC) C LALFANUMERIC=.TRUE. if STRING is alfanumeric C LALFANUMERIC = contains only letters and digits and number C punctuation (.+-) IMPLICIT NONE C import CHARACTER*(*) STRING C export LOGICAL LALFANUMERIC C local CHARACTER ALFANUMERIC*65 INTEGER IBEG,IEND,K C init ALFANUMERIC='ABCDEFGHIJKLMNOPQRSTUVWXYZ'// + 'abcdefghijklmnopqrstuvwxyz0123456789+-.' LALFANUMERIC=.TRUE. CALL STRPOS(STRING,IBEG,IEND) K=IBEG DO WHILE(K .LT. IEND) IF ( INDEX(ALFANUMERIC,STRING(K:K)) .EQ.0 ) THEN LALFANUMERIC=.FALSE. RETURN ENDIF K=K+1 ENDDO RETURN END C END IS_ALFANUMERIC C...................................................................... C...................................................................... C SUB IS_WORD SUBROUTINE IS_WORD(STRING,LWORD) C LWORD=.TRUE. if STRING is pure alfa. C LWORD = contains only letters IMPLICIT NONE C import CHARACTER*(*) STRING C export LOGICAL LWORD C local CHARACTER ALFA*52 INTEGER IBEG,IEND,K C init ALFA='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz' LWORD=.TRUE. C CALL STRPOS(STRING,IBEG,IEND) K=IBEG DO WHILE(K .LT. IEND) IF ( INDEX(ALFA,STRING(K:K)) .EQ.0 ) THEN LWORD=.FALSE. RETURN ENDIF K=K+1 ENDDO RETURN END C END IS_WORD C...................................................................... C...................................................................... C SUB LOG_FILE SUBROUTINE LOG_FILE(KLOG,STRING,IFLAG) C iflag =0 ===> only in file C iflag =1 ===> only std-out C iflag =2 ===> both (file and std-out) C IMPLICIT NONE INTEGER KLOG,IFLAG,IBEG,IEND,ILINE,I INTEGER ICUTBEGIN(20),ICUTEND(20) CHARACTER*(*) STRING CALL STRPOS(STRING,IBEG,IEND) ILINE=1 ICUTBEGIN(ILINE)=1 ICUTEND(ILINE)=IEND DO I=1,IEND-1 IF (STRING(I:I+1).EQ.'/n') THEN ILINE=ILINE+1 ICUTBEGIN(ILINE)=I+2 ICUTEND(ILINE-1)=I-1 ICUTEND(ILINE)=IEND ENDIF ENDDO DO I=1,ILINE IF (IFLAG .EQ. 0) THEN WRITE(KLOG,'(A)')STRING(ICUTBEGIN(I):ICUTEND(I)) ELSE IF (IFLAG .EQ. 1) THEN WRITE(6,*)STRING(ICUTBEGIN(I):ICUTEND(I)) CALL FLUSH_UNIT(6) ELSE IF (IFLAG .EQ. 2) THEN WRITE(KLOG,'(A)')STRING(ICUTBEGIN(I):ICUTEND(I)) WRITE(6,*)STRING(ICUTBEGIN(I):ICUTEND(I)) CALL FLUSH_UNIT(6) ENDIF ENDDO RETURN END C END LOG_FILE C...................................................................... C...................................................................... C SUB LOWER_TO_CYS SUBROUTINE LOWER_TO_CYS(SEQ,NRES) C import CHARACTER*(*) SEQ INTEGER NRES DO I=1,NRES IF ( (SEQ(I:I) .GE. 'a') .AND. (SEQ(I:I) .LE. 'z') ) THEN SEQ(I:I)='C' ENDIF ENDDO END C END LOWER_TO_CYS C...................................................................... C...................................................................... C SUB LOWTOUP SUBROUTINE LOWTOUP(STRING,LENGTH) C LOWTOUP.......CONVERTS STRING......CHRIS SANDER JULY 1983 C changed by RS (speed up) CHARACTER*(*) STRING INTEGER LENGTH CX CHARACTER UPPER*26, LOWER*26, STRING*(*) CX DATA UPPER/'ABCDEFGHIJKLMNOPQRSTUVWXYZ'/ CX DATA LOWER/'abcdefghijklmnopqrstuvwxyz'/ C DO I=1,LENGTH IF (STRING(I:I) .GE. 'a' .AND. STRING(I:I) .LE. 'z') THEN STRING(I:I)=CHAR( ICHAR(STRING(I:I))-32 ) CX K=INDEX(LOWER,STRING(I:I)) CX IF (K.NE.0) STRING(I:I)=UPPER(K:K) ENDIF ENDDO RETURN END C END LOWTOUP C...................................................................... C...................................................................... C SUB MAKE_FORMAT_INT SUBROUTINE MAKE_FORMAT_INT(ILEN,CFORMAT) INTEGER ILEN CHARACTER*(*) CFORMAT CHARACTER*20 CTEMP,CINT CFORMAT=' ' CINT=' ' WRITE(CINT,'(I20)')ILEN CALL CONCAT_STRINGS('(I',CINT,CTEMP) CALL CONCAT_STRINGS(CTEMP,')',CFORMAT) RETURN END C END MAKE_FORMAT_INT C...................................................................... C...................................................................... C SUB MARKALI SUBROUTINE MARKALI(S1,S2,N,AGREE,C) C marks equalitites between S1 and S2 with C in string AGREE c implicit none CHARACTER*(*) S1,S2(*),AGREE(*),C CHARACTER CTEST INTEGER N,I,IAGR IF (N .EQ. 0) THEN WRITE(6,*)'*** N=0 IN MARKALI' RETURN ENDIF IAGR=0 DO I=1,N CTEST=S2(I) c convert lower case letter of sequence 2 CALL LOWTOUP(CTEST,1) IF (S1(I:I) .EQ. CTEST) THEN AGREE(I)=C IAGR=IAGR+1 ELSE AGREE(I)=' ' ENDIF ENDDO RETURN END C END MARKALI C...................................................................... C...................................................................... C SUB MSFCHECKSEQ SUBROUTINE MSFCHECKSEQ(SEQCHECK,NSEQ,MSFCHECK) C IMPORT INTEGER NSEQ INTEGER SEQCHECK(NSEQ) C INTERNAL INTEGER CHECKTMP, I C EXPORT INTEGER MSFCHECK CHECKTMP = 0 DO I = 1, NSEQ CHECKTMP = CHECKTMP + SEQCHECK(I) ENDDO MSFCHECK = MOD(CHECKTMP, 10 000) RETURN END C END MSFCHECKSEQ C...................................................................... C...................................................................... C SUB OPEN_SW_DATA_FILE SUBROUTINE OPEN_SW_DATA_FILE(KUNIT,LBINARY,IFILE,DATA,PATH,HOST) C import CHARACTER*(*) DATA,PATH,HOST INTEGER KUNIT,IFILE LOGICAL LBINARY C internal CHARACTER*100 TEMPNAME,LINE LOGICAL LERROR CALL CONCAT_INT_STRING(IFILE,DATA,LINE) CALL CONCAT_STRINGS(PATH,LINE,TEMPNAME) IF ( HOST .NE. ' ' ) THEN CALL STRPOS(HOST,IBEG,IEND) IF ( INDEX(HOST(IBEG:IEND),'unknownHost') .LE. 0 ) THEN c WRITE(6,*)'host:',host,":",tempname c host(iend+1:iend+1)=':' CALL CONCAT_STRINGS(HOST,TEMPNAME,LINE) TEMPNAME(1:)=LINE(1:) ENDIF ENDIF c WRITE(6,*)'file: ',tempname(1:60) CALL FLUSH_UNIT(6) IF (LBINARY) THEN CAUTION RECL !!!! CALL OPEN_FILE(KUNIT,TEMPNAME, + 'OLD,READONLY,UNFORMATTED,RECL=500000',lerror) ELSE CALL OPEN_FILE(KUNIT,TEMPNAME,'OLD,READONLY',LERROR) ENDIF IF (LERROR) THEN WRITE(6,*)'ERROR: open file : ',tempname CALL FLUSH_UNIT(6) STOP ENDIF RETURN END C END OPEN_SW_DATA_FILE C...................................................................... C...................................................................... C SUB PREPARE_INSERTIONS SUBROUTINE PREPARE_INSERTIONS(MAXRES,MAXALIGNS, 1 NRES,NALIGN,IFIR,ILAS,INSNUMBER,INSALI,INSLEN, 2 INSAP,MAXLEN,INSLIST_POINTER,TOTALINSLEN,ERROR) C 21.6.93 C 18.11. : AliseqEnvironment -> prepare_insertions; C ........ return pointers to sublists of single alignments in ReadHSSP C ........ arrays ( 0 if there is no sublist ) ; C ........ + the maximal length of an insertion starting at any position IMPLICIT NONE C Import INTEGER MAXRES,MAXALIGNS INTEGER NRES,NALIGN INTEGER IFIR(*), ILAS(*) INTEGER INSNUMBER,INSALI(*),INSLEN(*) INTEGER INSAP(*) C Export INTEGER*2 TOTALINSLEN(MAXRES) INTEGER*2 MAXLEN(MAXRES), INSLIST_POINTER(MAXALIGNS) LOGICAL ERROR C Internal INTEGER*2 INT2_TEMP INTEGER ALINO INTEGER IAP,IINS,TIL IF ( NALIGN .GT. MAXALIGNS ) THEN WRITE(6,'(1X,A)') 1 'MAXALIGNS overflow in prepare_insertions!' ERROR = .TRUE. RETURN ENDIF IF ( NRES .GT. MAXRES ) THEN WRITE(6,'(1X,A)') 'MAXRES overflow in prepare_insertions !' ERROR = .TRUE. RETURN ENDIF CALL INIT_INT2_ARRAY(1,NRES,MAXLEN,0) CALL INIT_INT2_ARRAY(1,NALIGN,INSLIST_POINTER,0) ALINO = INSALI(1) INSLIST_POINTER(ALINO) = 1 MAXLEN(INSAP(1)) = INSLEN(1) DO IINS = 2,INSNUMBER IF ( ALINO .NE. INSALI(IINS) ) THEN ALINO = INSALI(IINS) INSLIST_POINTER(ALINO) = IINS ENDIF C NOTE: CONVERSION FROM INT4 IN INT2 INT2_TEMP = INSLEN(IINS) MAXLEN(INSAP(IINS))= + MAX(MAXLEN(INSAP(IINS)),INT2_TEMP) ENDDO TIL = 0 DO IAP = 1,NRES IF ( MAXLEN(IAP) .GT. 0 ) TIL = TIL + MAXLEN(IAP) TOTALINSLEN(IAP) = TIL ENDDO RETURN END C END PREPARE_INSERTIONS C...................................................................... C...................................................................... C SUB PUNISHGAP SUBROUTINE PUNISHGAP(NRES,LDSSP,STRUC,GAPOPEN,PUNISH) C====================================================================== C INDELs in secondary structure segments C---------------------------------------------------------------------- C if INDELS in secondary structure are NOT allowed (if DSSP-file(s)) C set gap-open(IPOS , SEQuence 1/SEQuence 2) in secondary structure segments C to a high value. C BUT NOT for the first and last residue in a segment C LLLLLHHHHHHHHHHLLLLLLLLL C ______^^^^^^^^__________ C punish C C definition of struture class is: unknown 'U' = 0 C ' TCLStclss' = 1 C 'EBAPMebapm' = 2 C 'HGIhgiiiii' = 3 C CAUTION: IF THE ASSIGNMENT OF STRUC CLASS IS CHANGED IN STRUCCLASS C ======== CHANGE IT ALSO HERE c======================================================================= IMPLICIT NONE c import INTEGER NRES CHARACTER STRUC(*) LOGICAL LDSSP REAL PUNISH C CHANGED REAL GAPOPEN(*) C INTERNAL INTEGER I,ICLASS1,ICLASS2,ICLASS3 CHARACTER C C IF (LDSSP) THEN DO I=2,NRES-1 CALL SECSTRUC_TO_3_STATE(STRUC(I-1),C,ICLASS1) CALL SECSTRUC_TO_3_STATE(STRUC(I ),C,ICLASS2) CALL SECSTRUC_TO_3_STATE(STRUC(I+1),C,ICLASS3) IF (ICLASS1.GT.1 .AND. ICLASS2.GT.1 .AND. ICLASS3.GT.1) THEN GAPOPEN(I)=PUNISH ENDIF ENDDO ENDIF RETURN END C END PUNISHGAP C...................................................................... C...................................................................... C SUB PUNISH_GAP SUBROUTINE PUNISH_GAP(NRES,STRUC,STRSTATES,PUNISH,GAPOPEN) C====================================================================== C INDELs in secondary structure segments C---------------------------------------------------------------------- C if INDELS in secondary structure are NOT allowed (passed in strstates) C set gap-open(IPOS , SEQuence 1/SEQuence 2) in secondary str segments C to a high value. C BUT NOT for the first and last residue in a segment C LLLLLHHHHHHHHHHLLLLLLLLL C ______^^^^^^^^__________ C punish C c======================================================================= IMPLICIT NONE C IMPORT INTEGER NRES CHARACTER*(*) STRUC(*),STRSTATES REAL PUNISH C CHANGED REAL GAPOPEN(*) C INTERNAL INTEGER I,IBEG,IEND C CALL STRPOS(STRSTATES,IBEG,IEND) DO I=2,NRES-1 IF (INDEX(STRSTATES(IBEG:IEND),STRUC(I)) .NE. 0) THEN IF (STRUC(I).EQ.STRUC(I-1).AND.STRUC(I).EQ.STRUC(I+1)) THEN GAPOPEN(I)=PUNISH ENDIF ENDIF ENDDO RETURN END C END PUNISH_GAP C...................................................................... C...................................................................... C SUB PUTHEADER SUBROUTINE PUTHEADER(KPLOT,CSQ_1,CSQ_2,STRUC_1,STRUC_2, + N1,N2,NAME_1,NAME_2) IMPLICIT NONE INTEGER KPLOT,N1,N2 CHARACTER*(*) NAME_1,NAME_2 CHARACTER*(*) CSQ_1,CSQ_2 CHARACTER*1 STRUC_1(*),STRUC_2(*) C internal INTEGER LINELEN,I,J,ISTOP,M CHARACTER*200 CTEMP C init CTEMP=' ' LINELEN=LEN(CTEMP)-1 write(kplot,*) '/number of residues in protein 1:' write(kplot,'(i10)')n1 write(kplot,*) '/number of residues in protein 2:' write(kplot,'(i10)')n2 write(kplot,*) ' ' write(kplot,*) '/file name for protein 1:' write(kplot,*)name_1 write(kplot,*)'/file name for protein 2:' write(kplot,*)name_2 write(kplot,*)' ' write(kplot,*)'/SEQUENCE 1:' DO I=1,N1,LINELEN J=1 ISTOP=I+LINELEN IF (ISTOP.GT.N1)ISTOP=N1 DO M=I,ISTOP WRITE(CTEMP(J:J),'(A)')CSQ_1(M:M) J=J+1 ENDDO WRITE(KPLOT,'(A)')CTEMP(:J-1) ENDDO WRITE(KPLOT,*)' ' WRITE(KPLOT,*)'/SEQUENCE 2:' DO I=1,N2,LINELEN J=1 ISTOP=I+LINELEN IF (ISTOP.GT.N2)ISTOP=N2 DO M=I,ISTOP WRITE(CTEMP(J:J),'(A)')CSQ_2(M:M) J=J+1 ENDDO WRITE(KPLOT,'(A)')CTEMP(:J-1) ENDDO WRITE(KPLOT,*)' ' WRITE(KPLOT,*) '/SECSTRUC 1:' DO I=1,N1,LINELEN J=1 ISTOP=I+LINELEN IF (ISTOP.GT.N1)ISTOP=N1 DO M=I,ISTOP WRITE(CTEMP(J:J),'(A)')STRUC_1(M) J=J+1 ENDDO WRITE(KPLOT,'(A)')CTEMP(:J-1) ENDDO WRITE(KPLOT,*)' ' WRITE(KPLOT,*) '/SECSTRUC 2:' DO I=1,N2,LINELEN J=1 ISTOP=I+LINELEN IF (ISTOP.GT.N2)ISTOP=N2 DO M=I,ISTOP WRITE(CTEMP(J:J),'(A)')STRUC_2(M) J=J+1 ENDDO WRITE(KPLOT,'(A)')CTEMP(:J-1) ENDDO WRITE(KPLOT,*) ' ' RETURN END C END PUTHEADER C...................................................................... C...................................................................... C SUB READ_BRK SUBROUTINE READ_BRK(KIN,INFILE,CHAINS,CTRANS,RLEN,NRES, 1 COMPND,SEQ,PDBNO,TRUNCATED,ERROR) CAUTION ctrans2 and seq3 are character strings here but C arrays in s1tos3 and s3tos1 ======>> BUG c RS dec. 94 C 14.5.93 CHEADER OXIDOREDUCTASE (SUPEROXIDE ACCEPTOR) 25-MAR-80 2SOD 2SOD CCOMPND CU,ZN SUPEROXIDE DISMUTASE (E.C.1.15.1.1) 2SOD C .. CATOM 4 N ALA O 1 -20.479 24.715 -21.334 1.00 16.16 2SOD CATOM 5 CA ALA O 1 -19.117 24.539 -21.395 1.00 15.65 2SOD C1..4......11.14..1820....26....32....3840....4648....54 IMPLICIT NONE C IMPORT INTEGER KIN,RLEN INTEGER PDBNO(*) CHARACTER*(*) CHAINS, CTRANS, INFILE C EXPORT CHARACTER*(*) COMPND,SEQ LOGICAL TRUNCATED,ERROR C INTERNAL INTEGER MAXRES_LOC,NTRANS_LOC,LINELEN PARAMETER (MAXRES_LOC= 9999) C PARAMETER (MAXRES_LOC= 10000) C PARAMETER (MAXRES_LOC= 30011) PARAMETER (NTRANS_LOC= 25) PARAMETER (LINELEN= 1000) INTEGER ICHAIN, JCHAIN, ISTART, ISTOP, IPOS, + NRES, N, NREAD, NTRANS CHARACTER*1 C, CHAIN CHARACTER*(3*MAXRES_LOC) SEQ3 CHARACTER*10 NUMBERS CHARACTER*(LINELEN) LINE CHARACTER*(3*NTRANS_LOC) CTRANS3 *----------------------------------------------------------------------* WRITE(6,*)' STOP UPDATE READ_BRK' C$$$ ERROR = .FALSE. c$$$ c$$$C try to open outfile; return if unsuccessful c$$$ call open_file(kin,infile,'old,readonly',error) c$$$C error messages are alredy issued by OPEN_FILE c$$$ if ( error ) return c$$$ c$$$ if ( linelen .lt. rlen ) then c$$$ WRITE(6,'(1x,a)') c$$$ 1 ' *** record length of input file too big ***' c$$$ goto 1 c$$$ endif c$$$ c$$$ error = .false. c$$$ numbers = '0123456789' c$$$ call strpos(ctrans,istart,istop) c$$$ ntrans = istop-istart+1 c$$$ call s1tos3(ctrans3,ctrans,ntrans) c$$$ read(kin,'(a)',err=1,end=2) line c$$$ compnd = line(7:) c$$$ c$$$ nres = 0 c$$$ ichain = 1 c$$$ jchain = 1 c$$$ seq3 = ' ' c$$$ call strpos(chains,istart,istop) c$$$ call gettoken(chains,len(chains),1,ipos,chain) c$$$ do while ( ipos .le. istop ) c$$$ do while ( line(1:4) .ne. 'ATOM' ) c$$$ read(kin,'(a)',err=1,end=2) line c$$$ enddo c$$$ c = line(22:22) c$$$ if ( index(numbers,chain ) .ne. 0 ) then c$$$ read(chain,'(i1)') n c$$$ if ( n .eq. ichain ) then c$$$ call read_brkchain(kin,nres,ctrans3,rlen,line,seq3, c$$$ 1 pdbno,nread,truncated,error) c$$$ nres = nres + nread c$$$ ichain = ichain + 1 c$$$ jchain = jchain + 1 c$$$ read(kin,'(a)',err=1,end=2) line c$$$ else c$$$ call skip_brkchain(kin,rlen,line,error) c$$$ ichain = ichain + 1 c$$$ read(kin,'(a)',err=1,end=2) line c$$$ endif c$$$ else c$$$ if ( c .eq. chain ) then c$$$ call read_brkchain(kin,nres,ctrans3,rlen,line,seq3, c$$$ 1 pdbno,nread,truncated,error) c$$$ nres = nres + nread c$$$ ichain = ichain + 1 c$$$ jchain = jchain + 1 c$$$ read(kin,'(a)',err=1,end=2) line c$$$ else c$$$ call skip_brkchain(kin,rlen,line,error) c$$$ ichain = ichain + 1 c$$$ read(kin,'(a)',err=1,end=2) line c$$$ endif c$$$ endif c$$$ call strpos(chains,istart,istop) c$$$ call gettoken(chains,len(chains),jchain,ipos,chain) c$$$ enddo c$$$ c$$$ goto 2 c$$$ 1 error = .true. c$$$ WRITE(6,'(a)') ' ** error reading BRK file **' c$$$ 2 continue c$$$ call s3tos1(seq3,seq,nres) c$$$ c$$$ close(kin) RETURN END C END READ_BRK C...................................................................... C...................................................................... C SUB READ_BRKCHAIN SUBROUTINE READ_BRKCHAIN(KIN,SEQPOS,CTRANS,RLEN,FIRSTLINE,SEQ, 1 PDBNO,NREAD,TRUNCATED,ERROR) C 15.5.93 CATOM 4 N ALA O 1 -20.479 24.715 -21.334 1.00 16.16 2SOD 232 CATOM 5 CA ALA O 1 -19.117 24.539 -21.395 1.00 15.65 2SOD 233 C SPECIAL CASES : CATOM 404 CA AASP 50 7.731 6.227 13.395 0.67 10.85 6PTI C1..4......11.14..1820....26....32....3840....4648....54 IMPLICIT NONE C Import INTEGER KIN,RLEN C .. the read pointer of kin is expected to point to the next line C .. TO BE INTERPRETED INTEGER SEQPOS C .. may be alread partially filled. last occupied position is "seqpos" INTEGER PDBNO(*) CHARACTER*(*) SEQ CHARACTER*(*) CTRANS, FIRSTLINE C EXPORT INTEGER NREAD LOGICAL TRUNCATED,ERROR C .. and "seq", with "nread" more symbols; "pdbno" C with "nread" more entries C Internal INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER IPOS CHARACTER*3 C3 CHARACTER*4 CNUMBER CHARACTER*(LINELEN) LINE IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. NREAD = 0 CNUMBER = ' ' IPOS = SEQPOS LINE = FIRSTLINE DO WHILE ( LINE(1:3) .NE. 'TER' .AND. 1 .NOT. TRUNCATED ) IF ( LINE(23:26) .NE. CNUMBER ) THEN CNUMBER = LINE(23:26) C3 = LINE(18:20) IF ( INDEX(CTRANS,C3) .NE. 0 ) THEN TRUNCATED = ( SEQPOS+NREAD+1 .GT. LEN(SEQ)/3 ) IF ( .NOT. TRUNCATED ) THEN IPOS = 3*(SEQPOS+NREAD) SEQ(IPOS+1:IPOS+3) = C3 NREAD = NREAD + 1 READ (CNUMBER,'(I4)') PDBNO(SEQPOS+NREAD) ENDIF ENDIF ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') ' ** error reading BRK file **' 2 CONTINUE RETURN END C END READ_BRKCHAIN C...................................................................... C...................................................................... C SUB READ_DSSPCHAIN SUBROUTINE READ_DSSPCHAIN(KIN,SEQPOS,CTRANS,RLEN,FIRSTLINE,SEQ, 1 STRUC,ACC,PDBNO,NREAD,LACCZERO,TRUNCATED,ERROR) C 18.5.93 C 1 1 O A 0 0 81 0, 0.0 149,-0.2 0, 0.0 104,-0.1 0.000 360.0 360.0 360.0 164.6 -19.1 24.5 -21.4 IMPLICIT NONE C IMPORT INTEGER KIN, SEQPOS, RLEN CHARACTER*(*) CTRANS, FIRSTLINE C EXPORT INTEGER NREAD,PDBNO(*), ACC(*) CHARACTER*(*) SEQ, STRUC LOGICAL LACCZERO,TRUNCATED,ERROR C INTERNAL INTEGER NASCII,LINELEN PARAMETER (NASCII= 256) PARAMETER (LINELEN= 1000) INTEGER LOWERPOS(NASCII),I CHARACTER*1 C CHARACTER*26 LOWER CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. C USED TO CONVERT LOWER CASE CHARACTERS FROM THE DSSP-SEQ TO 'C' (CYS) LOWER='abcdefghijklmnopqrstuvwxyz' CALL GETPOS(LOWER,LOWERPOS,NASCII) NREAD = 0 LINE = FIRSTLINE DO WHILE ( LINE(14:14) .NE. '!' .AND. 1 .NOT. TRUNCATED ) C = LINE(14:14) CALL GETINDEX(C,LOWERPOS,I) IF ( I.NE.0 ) C = 'C' IF ( INDEX(CTRANS,C) .NE. 0 ) THEN TRUNCATED = ( SEQPOS+NREAD+1 .GT. LEN(SEQ) ) IF ( .NOT. TRUNCATED ) THEN NREAD = NREAD + 1 SEQ(SEQPOS+NREAD:SEQPOS+NREAD) = C STRUC(SEQPOS+NREAD:SEQPOS+NREAD) = LINE(17:17) READ(LINE(6:10),'(I5)') PDBNO(SEQPOS+NREAD) READ(LINE(35:38),'(I4)') ACC(SEQPOS+NREAD) LACCZERO = LACCZERO .AND. (ACC(SEQPOS+NREAD) .EQ. 0) ENDIF ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO NREAD = NREAD + 1 SEQ(SEQPOS+NREAD:SEQPOS+NREAD) = '!' STRUC(SEQPOS+NREAD:SEQPOS+NREAD) = ' ' GOTO 2 1 ERROR = .TRUE. WRITE(6,'(A)') ' ** ERROR READING DSSP FILE **' 2 CONTINUE RETURN END C END READ_DSSPCHAIN C...................................................................... C...................................................................... C SUB READ_EMBL SUBROUTINE READ_EMBL(KIN,INFILE,CTRANS,RLEN,NRES,COMPND, 1 ACCESSION,PDB,SEQ,TRUNCATED,ERROR) C 14.5.93 CDE test.pep from: 1 to: 13 CDE test.pep CSQ SEQUEN C AAAAAAAAAA AAA C// IMPLICIT NONE C IMPORT INTEGER KIN,RLEN CHARACTER*(*) CTRANS,INFILE C EXPORT INTEGER NRES CHARACTER*(*) COMPND,ACCESSION,PDB, SEQ LOGICAL TRUNCATED,ERROR C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER I,ID, ISTART, ISTOP, JSTART, JSTOP, IPOS CHARACTER*1 C CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* ERROR = .FALSE. JSTOP=0 C try to open outfile; return if unsuccessful CALL OPEN_FILE(KIN,INFILE,'old,readonly',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. ID = 0 PDB = ' ' READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( LINE(1:2) .NE. 'SQ' ) CALL STRPOS(LINE,ISTART,ISTOP) IF ( INDEX(LINE(1:2), 'AC') .NE. 0 ) THEN I = INDEX(LINE,';') - 1 ACCESSION = LINE(6:I) ELSE IF ( INDEX(LINE(1:2), 'DE') .NE. 0 ) THEN COMPND = LINE(6:200) ELSE IF ( INDEX(LINE(1:9), 'DR PDB;') .NE. 0 ) THEN C .PDB-DATABASE POINTER CALL STRPOS(PDB,ISTART,ISTOP) CALL STRPOS(LINE,JSTART,JSTOP) IF (LINE(JSTOP:JSTOP) .EQ. '.')JSTOP=JSTOP-1 C I = LEN(PDB) IF ( ISTOP+JSTOP-10 .LE. LEN(PDB)) THEN IF ( ID .LE. 0 ) THEN PDB(ISTOP+1:) = LINE(11:JSTOP) ELSE PDB(ISTOP+1:) = '|' // LINE(11:JSTOP) ENDIF ID = ID + 1 ELSE WRITE(6,*)'**** PDBREF-LINE DIMENSION OVERFLOW ***' ENDIF ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO CALL STRPOS(PDB,ISTART,ISTOP) IF ( ID .GT. 0 ) THEN IF ( ISTOP+7 .LE. LEN(PDB) ) THEN WRITE(PDB(ISTOP+1:),'(A,I4)') '||', ID ELSE WRITE(6,*)'**** PDBREF-LINE DIMENSION OVERFLOW ***' ENDIF ENDIF NRES = 0 READ(KIN,'(A)',ERR=1,END=2) LINE C SEQUENCE DO WHILE ( INDEX(LINE(1:2),'//') .EQ. 0 .AND. 1 .NOT. TRUNCATED ) CALL STRPOS(LINE,ISTART,ISTOP) DO IPOS = ISTART,ISTOP C = LINE(IPOS:IPOS) CALL LOWTOUP(C,1) IF ( INDEX(CTRANS,C) .NE. 0 ) THEN TRUNCATED = ( NRES+1 .GT. LEN(SEQ) ) IF ( .NOT. TRUNCATED ) THEN NRES = NRES + 1 SEQ(NRES:NRES) = C ENDIF ENDIF ENDDO READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') ' ** error reading EMBL/SWISSPROT file **' 2 CONTINUE CLOSE(KIN) RETURN END C END READ_EMBL C...................................................................... C...................................................................... C SUB READ_FASTA SUBROUTINE READ_FASTA(KIN,INFILE,CTRANS,RLEN,NRES, 1 ACCESSION,COMPND,SEQ,TRUNCATED,ERROR) C 11.4.96 C>test blablabla C A A A A A A A A A A A A A IMPLICIT NONE C IMPORT INTEGER KIN, RLEN CHARACTER*(*) CTRANS,INFILE C EXPORT INTEGER NRES CHARACTER*(*) ACCESSION,COMPND, SEQ LOGICAL TRUNCATED,ERROR C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER IPOS, ISTART, ISTOP CHARACTER*1 C CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* ERROR = .FALSE. ISTOP=0 C TRY TO OPEN OUTFILE; RETURN IF UNSUCCESSFUL CALL OPEN_FILE(KIN,INFILE,'old,readonly',error) C error messages are already issued by OPEN_FILE IF ( ERROR ) RETURN IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( LINE(1:1) .NE. '>' ) READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO CALL STRPOS(LINE,ISTART,ISTOP) ISTART=INDEX(LINE,' ') IF (ISTART .GT. 2 .AND. ISTART .LT. ISTOP) THEN ACCESSION(1:LEN(ACCESSION))=LINE(2:ISTART-1) COMPND = LINE(ISTART+1:ISTOP) ELSE ACCESSION(1:LEN(ACCESSION))=LINE(2:) COMPND=ACCESSION ENDIF NRES = 0 READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( .NOT. TRUNCATED ) CALL STRPOS(LINE,ISTART,ISTOP) IF ( ISTOP .NE. 0 ) THEN DO IPOS = ISTART,ISTOP C = LINE(IPOS:IPOS) CALL LOWTOUP(C,1) IF ( INDEX(CTRANS,C) .NE. 0 ) THEN TRUNCATED = ( NRES+1 .GT. LEN(SEQ) ) IF ( .NOT. TRUNCATED ) THEN NRES = NRES + 1 SEQ(NRES:NRES) = C ENDIF ELSE IF (C .EQ. '*') THEN GOTO 2 ENDIF ENDDO ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') ' ** error reading FASTA file **' 2 CONTINUE CLOSE(KIN) RETURN END C END READ_FASTA C...................................................................... C...................................................................... C SUB READ_GCG SUBROUTINE READ_GCG(KIN,INFILE,CTRANS,RLEN,NRES,COMPND, 1 SEQ,TRUNCATED,ERROR) C 14.5.93 C C Test.Pep Length: 13 May 10, 1993 10:48 Type: N Check: 5915 .. C C 1 AAAAAAAAAA AAA C IMPLICIT NONE C IMPORT INTEGER KIN,RLEN CHARACTER*(*) CTRANS,INFILE C EXPORT INTEGER NRES CHARACTER*(*) COMPND, SEQ LOGICAL TRUNCATED,ERROR C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER IPOS,JPOS, ISTART,JSTART,JSTOP, ISTOP CHARACTER*1 C CHARACTER*10 CTOKEN CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KIN,INFILE,'old,readonly',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. C data start after a line ending with '..' READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( INDEX(LINE,'..') .EQ. 0 ) READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO C first word of '..' line CALL GETTOKEN(LINE,LINELEN,1,IPOS,COMPND) C sequence part NRES = 0 READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( .TRUE. ) CALL STRPOS(LINE,ISTART,ISTOP) IF ( ISTOP .GT. 0 ) THEN C .. FIRST WORD IS A NUMBER CALL GETTOKEN(LINE,LINELEN,1,JPOS,CTOKEN) CALL STRPOS(CTOKEN,JSTART,JSTOP) DO IPOS = JPOS+JSTOP-JSTART+1, ISTOP C = LINE(IPOS:IPOS) CALL LOWTOUP(C,1) IF ( INDEX(CTRANS,C) .NE. 0 ) THEN TRUNCATED = ( NRES+1 .GT. LEN(SEQ) ) IF ( .NOT. TRUNCATED ) THEN NRES = NRES + 1 SEQ(NRES:NRES) = C ENDIF ENDIF ENDDO ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO 1 ERROR = .TRUE. WRITE(6,'(a)') ' ** error reading GCG file **' 2 CONTINUE CLOSE(KIN) RETURN END C END READ_GCG C...................................................................... C...................................................................... C SUB READ_HSSPCHAIN SUBROUTINE READ_HSSPCHAIN(KIN,SEQPOS,CTRANS,RLEN,FIRSTLINE,SEQ, 1 STRUC,ACC,PDBNO,NREAD,LACCZERO,TRUNCATED,ERROR) C 18.5.93 C 1 1 O A 0 0 81 11 13 AAAAAAAA S A IMPLICIT NONE C IMPORT INTEGER KIN,RLEN CHARACTER*(*) CTRANS, FIRSTLINE C EXPORT INTEGER NREAD, SEQPOS INTEGER PDBNO(*), ACC(*) CHARACTER*(*) SEQ, STRUC LOGICAL LACCZERO,TRUNCATED,ERROR C INTERNAL INTEGER NASCII,LINELEN PARAMETER (NASCII= 256) PARAMETER (LINELEN= 1000) INTEGER LOWERPOS(NASCII) INTEGER I CHARACTER*1 C CHARACTER*26 LOWER CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. C used to convert lower case characters from the DSSP-seq to 'C' (Cys) LOWER='abcdefghijklmnopqrstuvwxyz' CALL GETPOS(LOWER,LOWERPOS,NASCII) NREAD = 0 LINE = FIRSTLINE DO WHILE ( LINE(15:15) .NE. '!' .AND. 1 LINE(1:2) .NE. '##' .AND. 2 .NOT. TRUNCATED ) C = LINE(15:15) CALL GETINDEX(C,LOWERPOS,I) IF ( I.NE.0 ) C = 'C' IF ( INDEX(CTRANS,C) .NE. 0 ) THEN TRUNCATED = ( SEQPOS+NREAD+1 .GT. LEN(SEQ) ) IF ( .NOT. TRUNCATED ) THEN NREAD = NREAD + 1 SEQ(SEQPOS+NREAD:SEQPOS+NREAD) = C STRUC(SEQPOS+NREAD:SEQPOS+NREAD) = LINE(18:18) READ(LINE(7:11),'(I5)') PDBNO(SEQPOS+NREAD) READ(LINE(36:39),'(I4)') ACC(SEQPOS+NREAD) LACCZERO = LACCZERO .AND. (ACC(SEQPOS+NREAD) .EQ. 0 ) ENDIF ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO NREAD = NREAD + 1 SEQ(SEQPOS+NREAD:SEQPOS+NREAD) = '!' STRUC(SEQPOS+NREAD:SEQPOS+NREAD) = ' ' GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') '*** ERROR READ_HSSPCHAIN reading HSSP file' 2 CONTINUE RETURN END C END READ_HSSPCHAIN C...................................................................... C...................................................................... C SUB READ_INT_FROM_STRING SUBROUTINE READ_INT_FROM_STRING(CSTRING,INUMBER) C import CHARACTER*(*) CSTRING C export INTEGER INUMBER C internal CHARACTER*100 CFORMAT,CTEMP CHARACTER*12 CNUMBER CNUMBER='-=0123456789' CFORMAT=' ' INUMBER=0 CALL STRPOS(CSTRING,ISTART,ISTOP) ITOTAL=ISTOP-ISTART+1 DO I=ISTART,ISTOP J=INDEX(CNUMBER,CSTRING(I:I)) IF ( J .LE. 0) THEN ITOTAL=I-ISTART WRITE(6,*)' *** NOT AN INTEGER:',CSTRING(ISTART:ISTOP) ENDIF ENDDO CALL CONCAT_STRING_INT('(I',ITOTAL,CTEMP) CALL CONCAT_STRINGS(CTEMP,')',CFORMAT) READ(CSTRING(ISTART:ISTOP),CFORMAT)INUMBER RETURN END C END READ_INT_FROM_STRING C...................................................................... C...................................................................... C SUB READ_MSF SUBROUTINE READ_MSF(KUNIT,FILENAME,MAXALIGNS,MAXCORE, 1 ALISEQ,ALIPOINTER,IFIR,ILAS,JFIR,JLAS,TYPE, 2 SEQNAMES,WEIGHT,SEQCHECK,MSFCHECK,NRES_ALI,NALIGN, 3 ERROR) C Implicit None C IMPORT INTEGER MAXALIGNS, MAXCORE INTEGER KUNIT CHARACTER*(*) FILENAME C EXPORT INTEGER NALIGN INTEGER ALIPOINTER(MAXALIGNS) INTEGER NRES_ALI INTEGER MSFCHECK INTEGER IFIR(MAXALIGNS), ILAS(MAXALIGNS) INTEGER JFIR(MAXALIGNS), JLAS(MAXALIGNS) C 'P' = PROTEIN SEQUENCES, 'N' = NUCLEOTIDE SEQ CHARACTER*1 TYPE CHARACTER*(*) SEQNAMES(MAXALIGNS) CHARACTER ALISEQ(MAXCORE) REAL WEIGHT(MAXALIGNS) INTEGER SEQCHECK(MAXALIGNS) LOGICAL ERROR C INTERNAL INTEGER CODELEN_LOC,MAXALIGNS_LOC, MAXRES_LOC,LINELEN PARAMETER (CODELEN_LOC= 14) PARAMETER (MAXALIGNS_LOC= 8765) C PARAMETER (MAXALIGNS_LOC= 19999) PARAMETER (MAXRES_LOC= 9999) C PARAMETER (MAXRES_LOC= 10000) C PARAMETER (MAXRES_LOC= 30011) PARAMETER (LINELEN= 200) INTEGER TESTCHECK,I,IPOS,ISEQ,NPROT_THIS,ISTART,ISTOP, + IBEG,ITMP,ILEN,DIFF,CFREE,FPOS, + ISTART2,ISTOP2 INTEGER LASTOCCUPIED(MAXALIGNS_LOC),NRES(MAXALIGNS_LOC), + NSEQLINES(MAXALIGNS_LOC) CHARACTER CGAPCHAR CHARACTER*200 ERRORMESSAGE,CTOKEN,CTOKEN_ORIGINAL C---- br 99.03: watch when changing this: hard_coded in GETARRAYINDEX CHARACTER*200 SEQNAMES_UPPER(MAXALIGNS_LOC) + CHARACTER*(CODELEN_LOC) CNAME CHARACTER*(LINELEN) LINE, TMPSTRING,TMPSEQ CHARACTER*(MAXRES_LOC) STRAND CHARACTER*20 CFORMAT LOGICAL INSIDE(MAXALIGNS_LOC) LOGICAL INGAP(MAXALIGNS_LOC) LOGICAL NO_ENDGAPS LOGICAL LCHECK, LTYPE, LNRES_ALI LOGICAL NEXT_IS_NRES_ALI, NEXT_IS_CHECK, NEXT_IS_TYPE LOGICAL NEXT_IS_NAME, NEXT_IS_LEN, NEXT_IS_SEQCHECK LOGICAL NEXT_IS_WEIGHT *----------------------------------------------------------------------* C REFORMAT of: *.Frag C C Nfi.Msf MSF: 594 Type: P February 17, 1992 14:37 Check: 1709 .. C C Name: Cnfi02 Len: 594 Check: 7754 Weight: 1.00 C Name: Cnfi03 Len: 594 Check: 4932 Weight: 1.00 C C// C C 1 50 CCnfi02 MMYSPICLTQ DEFHPFIEAL LPHVRAIAYT WFNLQARKRK YFKKHEKRMS CCnfi03 MMYSPICLTQ DEFHPFIEAL LPHVRAIAYT WFNLQARKRK YFKKHEKRMS CGAPCHAR = '.' ERROR = .FALSE. CALL STRPOS(FILENAME,ISTART,ISTOP) ERRORMESSAGE = ' open error for file: ' // 1 FILENAME(MAX(ISTART,1):MAX(1,ISTOP)) CALL OPEN_FILE(KUNIT,FILENAME,'old,readonly',error) IF ( ERROR ) GOTO 99 C READ MSF IDENTIFICATION LINE C Nfi.Msf MSF: 594 Type: P February 17, 1992 14:37 Check: 1709 .. ERROR = .TRUE. ERRORMESSAGE = ' MSF identification line missing !! ' READ(KUNIT,'(A)',END = 99) LINE DO WHILE ( INDEX(LINE,'MSF: ') .EQ. 0 ) READ(KUNIT,'(A)',END = 99) LINE ENDDO LNRES_ALI = .FALSE. LCHECK = .FALSE. LTYPE = .FALSE. NEXT_IS_NRES_ALI = .FALSE. NEXT_IS_CHECK = .FALSE. NEXT_IS_TYPE = .FALSE. C DUMMY VALUE FOR "POSITION OF START OF NEXT WORD" FPOS = -1 C ITH WORD I = 1 CALL GETTOKEN(LINE,LINELEN,I,FPOS,CTOKEN) DO WHILE ( FPOS .NE. 0 ) CALL STRPOS(CTOKEN,ISTART,ISTOP) CALL LOWTOUP(CTOKEN, LEN(CTOKEN)) IF ( NEXT_IS_NRES_ALI ) THEN NEXT_IS_NRES_ALI = .FALSE. CALL MAKE_FORMAT_INT(ISTOP-ISTART+1,CFORMAT) READ(CTOKEN(ISTART:ISTOP),CFORMAT) NRES_ALI ELSE IF ( NEXT_IS_TYPE ) THEN TYPE = CTOKEN(ISTART:ISTOP) NEXT_IS_TYPE = .FALSE. ELSE IF ( NEXT_IS_CHECK ) THEN CALL MAKE_FORMAT_INT(ISTOP-ISTART+1,CFORMAT) READ(CTOKEN(ISTART:ISTOP),CFORMAT) MSFCHECK NEXT_IS_CHECK = .FALSE. ENDIF IF ( CTOKEN(ISTART:ISTOP) .EQ. 'MSF:' ) THEN LNRES_ALI = .TRUE. NEXT_IS_NRES_ALI = .TRUE. ELSE IF ( CTOKEN(ISTART:ISTOP) .EQ. 'TYPE:' ) THEN LTYPE = .TRUE. NEXT_IS_TYPE = .TRUE. ELSE IF ( CTOKEN(ISTART:ISTOP) .EQ. 'CHECK:' ) THEN LCHECK = .TRUE. NEXT_IS_CHECK = .TRUE. ENDIF I = I + 1 CALL GETTOKEN(LINE,LINELEN,I,FPOS,CTOKEN) ENDDO IF ( .NOT. LNRES_ALI ) THEN ERROR = .TRUE. ERRORMESSAGE = ' MSF identification line missing !! ' GOTO 99 ENDIF IF ( .NOT. LTYPE ) THEN ERROR = .TRUE. ERRORMESSAGE = ' Type identifier missing !! ' GOTO 99 ENDIF IF ( .NOT. LCHECK ) THEN ERROR = .TRUE. ERRORMESSAGE = ' CHECKSUM MISSING !! ' GOTO 99 ENDIF C READ SEQUENCE DESCRIPTION SECTION READ(KUNIT,'(A)',END = 99) LINE C Name: Cnfi02 Len: 594 Check: 7754 Weight: 1.00 ERROR = .TRUE. ERRORMESSAGE = ' Sequence description section missing !! ' DO WHILE ( INDEX(LINE,'Name: ') .EQ. 0 ) READ(KUNIT,'(A)',END = 99) LINE ENDDO ERROR = .FALSE. ERRORMESSAGE = ' Alignment missing !! ' C>>> NALIGN = 0 DO WHILE ( INDEX(LINE,'Name: ') .NE. 0 ) NALIGN = NALIGN + 1 IF ( NALIGN .GT. MAXALIGNS .OR. 1 NALIGN .GT. MAXALIGNS_LOC) THEN ERROR = .TRUE. ERRORMESSAGE = ' MAXALIGNS overflow in read_msf !' GOTO 99 ENDIF NEXT_IS_NAME = .FALSE. NEXT_IS_LEN = .FALSE. NEXT_IS_SEQCHECK = .FALSE. NEXT_IS_WEIGHT = .FALSE. C DUMMY VALUE FOR "POSITION OF START OF NEXT WORD" FPOS = -1 C ith word I = 1 CALL GETTOKEN(LINE,LINELEN,I,FPOS,CTOKEN_ORIGINAL) CTOKEN=CTOKEN_ORIGINAL DO WHILE ( FPOS .NE. 0 ) CALL STRPOS(CTOKEN,ISTART,ISTOP) CALL STRPOS(CTOKEN_ORIGINAL,ISTART2,ISTOP2) CALL LOWTOUP(CTOKEN,LEN(CTOKEN)) IF ( NEXT_IS_NAME ) THEN NEXT_IS_NAME = .FALSE. SEQNAMES_UPPER(NALIGN)= CTOKEN(ISTART:ISTOP) SEQNAMES(NALIGN)= CTOKEN_ORIGINAL(ISTART2:ISTOP2) ELSE IF ( NEXT_IS_LEN ) THEN NEXT_IS_LEN = .FALSE. CALL MAKE_FORMAT_INT(ISTOP-ISTART+1,CFORMAT) READ(CTOKEN(ISTART:ISTOP),CFORMAT) ILEN NRES_ALI = MAX(NRES_ALI,ILEN) ELSE IF ( NEXT_IS_SEQCHECK ) THEN NEXT_IS_SEQCHECK = .FALSE. CALL MAKE_FORMAT_INT(ISTOP-ISTART+1,CFORMAT) READ(CTOKEN(ISTART:ISTOP),CFORMAT) SEQCHECK(NALIGN) ELSE IF ( NEXT_IS_WEIGHT ) THEN NEXT_IS_WEIGHT = .FALSE. READ(CTOKEN(ISTART:ISTOP),*) WEIGHT(NALIGN) ENDIF IF ( CTOKEN(ISTART:ISTOP) .EQ. 'NAME:' ) THEN NEXT_IS_NAME = .TRUE. ELSE IF ( CTOKEN(ISTART:ISTOP) .EQ. 'LEN:' ) THEN NEXT_IS_LEN = .TRUE. ELSE IF ( CTOKEN(ISTART:ISTOP) .EQ. 'CHECK:' ) THEN NEXT_IS_SEQCHECK = .TRUE. ELSE IF ( CTOKEN(ISTART:ISTOP) .EQ. 'WEIGHT:' ) THEN NEXT_IS_WEIGHT = .TRUE. ENDIF I = I + 1 CALL GETTOKEN(LINE,LINELEN,I,FPOS,CTOKEN_ORIGINAL) CTOKEN=CTOKEN_ORIGINAL ENDDO READ(KUNIT,'(A)',END = 99) LINE ENDDO ERROR = .FALSE. CALL MSFCHECKSEQ(SEQCHECK,NALIGN,TESTCHECK) IF ( TESTCHECK .NE. MSFCHECK ) THEN C ERROR = .TRUE. ERRORMESSAGE = 1 ' Total checksum incompatible with single checksums !!' WRITE(6,'(A)') ERRORMESSAGE c goto 99 ENDIF C SEARCH FOR "//" DIVIDER ERROR = .TRUE. ERRORMESSAGE = ' No proper MSFfile: divider missing !! ' DO WHILE ( INDEX(LINE,'//' ) .EQ. 0 ) READ(KUNIT,'(A)',END=99) LINE ENDDO ERROR = .FALSE. C READ MULTIPLE ALIGNMENT C 1 50 CCnfi02 MMYSPICLTQ DEFHPFIEAL LPHVRAIAYT WFNLQARKRK YFKKHEKRMS C initialize DO ISEQ = 1, NALIGN NSEQLINES(ISEQ)= 0 NRES(ISEQ)= 0 LASTOCCUPIED(ISEQ)= 0 INSIDE(ISEQ)= .FALSE. C TEMPORARY assignment! IF ( ISEQ .EQ. 1 ) THEN ALIPOINTER(ISEQ) = 1 ELSE ALIPOINTER(ISEQ) = ALIPOINTER(ISEQ-1)+NRES_ALI+1 ENDIF JFIR(ISEQ) = 1 JLAS(ISEQ) = 0 ENDDO ERROR = .TRUE. ERRORMESSAGE = ' ALIGNMENT MISSING !! ' C---- first line of alignment READ(KUNIT,'(A)',END=99) LINE C---- -------------------------------------------------- C---- now loop over all blocks C---- end if overflow of some array, or file read C---- LINELEN= maximal length of a line read C---- C---- C---- C---- -------------------------------------------------- ERROR = .FALSE. DO WHILE ( .TRUE. ) C------- get the first non-blank string in the line read (CNAME) C------- note: this is the protein name CALL GETTOKEN(LINE,LINELEN,1,FPOS,CNAME) CALL LOWTOUP(CNAME, LEN(CNAME) ) C------- get the number of the protein with that name (CNAME) C------- out: NPROT_THIS=number of protein with name CNAME C------- NPROT_THIS=0 if none matched! CALL GETARRAYINDEX(SEQNAMES_UPPER,CNAME,NALIGN,NPROT_THIS) C------ one of the names found IF ( NPROT_THIS .GT. 0 ) THEN NSEQLINES(NPROT_THIS)=NSEQLINES(NPROT_THIS)+1 C---------- get the second non-blank string in the line read (TMPSEQ) C---------- note: this is the sequence CALL GETTOKEN(LINE,LINELEN,2,IBEG,TMPSEQ) CALL LOWTOUP(LINE,LEN(LINE)) C---- C---- loop over all characters of line read C---- DO IPOS=IBEG,LINELEN C------------- if current residue neither ' ' nor TAB IF ( LINE(IPOS:IPOS) .NE. ' ' .AND. 1 LINE(IPOS:IPOS) .NE. CHAR(0) ) THEN C---------------- count up protein length NRES(NPROT_THIS)=NRES(NPROT_THIS) + 1 IF ( NRES(NPROT_THIS) .GT. NRES_ALI ) THEN WRITE(6,'(A)') 1 '*** ERROR in read_msf : SEQUENCE LENGTH EXCEEDS ' // 2 'ALIGNMENT LENGTH GIVEN IN HEADER !!! ***' WRITE(6,*)'*** line=',LINE(1:LEN(LINE)) WRITE(6,*)'*** this=',NRES(NPROT_THIS), + ' > ',NRES_ALI,' (NRES_ALI)' STOP ENDIF C---------------- is gap IF ( LINE(IPOS:IPOS) .EQ. CGAPCHAR ) THEN INGAP(NPROT_THIS)= .TRUE. IF ( INSIDE(NPROT_THIS) ) THEN ITMP=ALIPOINTER(NPROT_THIS)+NRES(NPROT_THIS)-1 ALISEQ(ITMP)=CGAPCHAR ENDIF C---------------- is NOT gap ELSE INGAP(NPROT_THIS) = .FALSE. LASTOCCUPIED(NPROT_THIS) = NRES(NPROT_THIS) IF ( .NOT. INSIDE(NPROT_THIS) ) THEN INSIDE(NPROT_THIS) = .TRUE. IFIR(NPROT_THIS) = NRES(NPROT_THIS) ENDIF JLAS(NPROT_THIS) = JLAS(NPROT_THIS) + 1 ALISEQ(ALIPOINTER(NPROT_THIS)+NRES(NPROT_THIS)-1)= + LINE(IPOS:IPOS) ENDIF ENDIF ENDDO ENDIF C else do nothing - blank or scale line READ(KUNIT,'(A)',END=99) LINE ENDDO 99 CONTINUE IF ( .NOT. ERROR ) THEN DO ISEQ=2,NALIGN IF (NSEQLINES(ISEQ) .NE. NSEQLINES(1)) THEN ERROR= .TRUE. ERRORMESSAGE = 1 ' Inconsistent sequence names !!' STOP ENDIF ENDDO ENDIF IF ( ERROR ) THEN WRITE(6,'(A)') ERRORMESSAGE RETURN ENDIF NO_ENDGAPS = .TRUE. DO ISEQ = 1,NALIGN NO_ENDGAPS = NO_ENDGAPS .AND. ( .NOT. INGAP(ISEQ)) ILAS(ISEQ) = LASTOCCUPIED(ISEQ) ENDDO C delete n- and c-terminal gaps from aliseq; C set ifir and ilas accordingly; C set pointers to alignments C 1.6.94 : C truncate NRES_ALI to be the last position occupied in at least one C ........ one of the sequences ! DIFF = 0 CFREE = NRES_ALI + 1 IPOS = 1 DO ISEQ = 1,NALIGN ALIPOINTER(ISEQ) = IPOS DIFF = DIFF + IFIR(ISEQ) - 1 I = IFIR(ISEQ) DO WHILE ( I .LE. ILAS(ISEQ) ) if (ipos .ge. maxcore) then write(6,*)'xx ipos=',ipos,' maxcore=',maxcore stop end if ALISEQ(IPOS) = ALISEQ(IPOS+DIFF) I = I + 1 IPOS = IPOS + 1 ENDDO ALISEQ(IPOS) = '/' IPOS = IPOS + 1 DIFF = DIFF + ( NRES_ALI - ILAS(ISEQ) ) C SMALLEST DISTANCE OF LAST OCCUPIED POSITION TO LAST ALIGNMENT POSITION C .. SHOULD BE ZERO, IF AT LEAST ONE SEQUENCE EXTENDS TO THE VERY END CFREE = MIN(CFREE,(NRES_ALI - ILAS(ISEQ)) ) ENDDO IF ( CFREE .GT. 0 ) THEN WRITE(6,'(1X,A)') 1 ' *** WARNING : empty c-terminal positions truncated ***' NRES_ALI = NRES_ALI - CFREE ENDIF ERROR = .FALSE. DO ISEQ = 1, NALIGN STRAND = ' ' CALL GET_SEQ_FROM_ALISEQ(ALISEQ,IFIR,ILAS,ALIPOINTER, 1 NRES_ALI,ISEQ,STRAND,NREAD, 2 ERROR ) IF ( NO_ENDGAPS ) THEN CALL CHECKSEQ(STRAND,1,ILAS(ISEQ),TESTCHECK) ELSE CALL CHECKSEQ(STRAND,1,NRES_ALI,TESTCHECK) ENDIF IF ( TESTCHECK .NE. SEQCHECK(ISEQ) ) THEN C ERROR = .TRUE. CALL STRPOS(SEQNAMES_UPPER(ISEQ),ISTART,ISTOP) ERRORMESSAGE = 1 ' checksum of sequence '//seqnames(iseq)(istart:istop)// 2 ' is not the same as checksum given in the header !' C goto 99 ENDIF ENDDO CLOSE(KUNIT) RETURN END C END READ_MSF C...................................................................... C...................................................................... C SUB READ_PIR SUBROUTINE READ_PIR(KIN,INFILE,CTRANS,RLEN,NRES,ACCESSION, 1 COMPND,SEQ,TRUNCATED,ERROR) C 14.5.93 C>P1; test Ctest.pir ( test.pep from: 1 to: 13 ) C A A A A A A A A A A A A A * IMPLICIT NONE C IMPORT INTEGER KIN, RLEN CHARACTER*(*) CTRANS,INFILE C EXPORT INTEGER NRES CHARACTER*(*) ACCESSION,COMPND, SEQ LOGICAL TRUNCATED,ERROR C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER IPOS, ISTART, ISTOP C INTEGER JSTART, JSTOP CHARACTER*1 C CHARACTER*(LINELEN) LINE c logical empty *----------------------------------------------------------------------* ERROR = .FALSE. ISTOP=0 C try to open outfile; return if unsuccessful CALL OPEN_FILE(KIN,INFILE,'old,readonly',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( LINE(1:1) .NE. '>' ) READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO ISTOP=INDEX(LINE,' ')-1 ACCESSION(1:LEN(ACCESSION))=LINE(2:ISTOP) c istart=index(line,'|')+1 c if ( istart .gt. 1) then c istop=index(line(istart:),'|')-1 c if ( istop .gt. 0) then c ACCESSION(1:len(ACCESSION))=line(istart:istart+istop-1) c else c ACCESSION(1:len(ACCESSION))=line(istart:) c endif c else c ACCESSION(1:len(ACCESSION))=line(2:) c ENDIF C ?? one comment line ?? always ?? READ(KIN,'(A)',ERR=1,END=2) LINE CALL STRPOS(LINE,ISTART,ISTOP) IF ( ISTOP .GT. 0 ) THEN COMPND = LINE(ISTART:ISTOP) ELSE COMPND = ' ' ENDIF c if ( empty ) then c call strpos(line,istart,istop) c if ( istop .gt. 0 ) then c empty = .false. c compnd = line(istart:istop) c endif c endif c if ( empty ) then c call strpos(infile,istart,istop) c compnd = infile(istart:istop) c endif NRES = 0 READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( .NOT. TRUNCATED ) CALL STRPOS(LINE,ISTART,ISTOP) IF ( ISTOP .NE. 0 ) THEN DO IPOS = ISTART,ISTOP C = LINE(IPOS:IPOS) CALL LOWTOUP(C,1) IF ( INDEX(CTRANS,C) .NE. 0 ) THEN TRUNCATED = ( NRES+1 .GT. LEN(SEQ) ) IF ( .NOT. TRUNCATED ) THEN NRES = NRES + 1 SEQ(NRES:NRES) = C ENDIF ELSE IF (C .EQ. '*') THEN GOTO 2 ENDIF ENDDO ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') ' ** error reading PIR file **' 2 CONTINUE CLOSE(KIN) RETURN END C END READ_PIR C...................................................................... C...................................................................... C SUB READ_REAL SUBROUTINE READ_REAL(STRING,XREAL) CHARACTER*(*) STRING REAL XREAL INTEGER EXPONENT IEXP=INDEX(STRING,'E') JEXP=INDEX(STRING,'E') CALL STRPOS(STRING,IBEG,IEND) IF (IEXP .EQ.0 .AND. JEXP .EQ. 0) THEN CALL READ_REAL_FROM_STRING(STRING(IBEG:IEND),XREAL) ELSE IPOS=MAX(IEXP,JEXP) CALL READ_INT_FROM_STRING(STRING(IPOS+1:IEND),EXPONENT) CALL READ_REAL_FROM_STRING(STRING(IBEG:IPOS-1),XREAL) XEXPONENT=FLOAT(EXPONENT) XREAL=XREAL * (10.0**XEXPONENT) ENDIF RETURN END C END READ_REAL C...................................................................... C...................................................................... C SUB READ_REAL_FROM_STRING SUBROUTINE READ_REAL_FROM_STRING(CSTRING,XNUMBER) C import CHARACTER*(*) CSTRING C export REAL XNUMBER C internal CHARACTER*100 CFORMAT,CTEMP INTEGER IPOS XNUMBER=0.0 CALL STRPOS(CSTRING,ISTART,ISTOP) ITOTAL=ISTOP-ISTART+1 IAFTER=0 IPOS=INDEX(CSTRING,'.') IF (IPOS .GT. 0) THEN IAFTER=ISTOP-IPOS ENDIF CALL CONCAT_STRING_INT('(F',ITOTAL,CTEMP) CALL CONCAT_STRINGS(CTEMP,'.',CFORMAT) CALL CONCAT_STRING_INT(CFORMAT,IAFTER,CTEMP) CALL CONCAT_STRINGS(CTEMP,')',CFORMAT) READ(CSTRING(ISTART:ISTOP),CFORMAT)XNUMBER RETURN END C END READ_REAL_FROM_STRING C...................................................................... C...................................................................... C SUB READ_SEQ_FROM_DSSP SUBROUTINE READ_SEQ_FROM_DSSP(KIN,INFILE,CHAINS,CTRANS,RLEN, 1 SEQ,STRUC,ACC,PDBNO,COMPND,NRES, 2 LACCZERO,TRUNCATED,ERROR) C 18. Dec 96, hackedihack, fixed a problem with dssp files with more C than 9 chains, C NOTE: this whole routine is bullsh...; RS 96 C 14.5.93 Ulrike Goebel C 1 1 O A 0 0 81 0, 0.0 149,-0.2 0, 0.0 104,-0.1 IMPLICIT NONE C IMPORT INTEGER KIN,RLEN INTEGER PDBNO(*), ACC(*) CHARACTER*(*) INFILE C EXPORT INTEGER NRES CHARACTER*(*) CHAINS CHARACTER*(*) COMPND, CTRANS, SEQ, STRUC LOGICAL LACCZERO,TRUNCATED,ERROR C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER N,ISTART,ISTOP,IPOS,JPOS,ICHAIN,JCHAIN,NREAD CHARACTER*1 C CHARACTER*(LINELEN) LINE CHARACTER*1000 NUMBERS CHARACTER*1000 TCHAINS,T2CHAINS C ebi version 02.98 (and not: tchains, t2chains) C CHARACTER*10 NUMBERS ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KIN,INFILE,'old,readonly',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF C not in ebi version 02.98 TCHAINS=' ' T2CHAINS=' ' C end ebi version 02.98 LACCZERO = .TRUE. NUMBERS = '01234567891011121314151617181920'// 1 '21222324252627282930313233343536'// 2 '37383940414243444546474849505152'// 3 '53545556575859606162636465666768'// 4 '69707172737475767778798081828384'// 5 '858687888990919293949596979899100' C in ebi version 02.98 C NUMBERS = '0123456789' READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( LINE(3:12) .NE. '# RESIDUE' ) IF (LINE(1:6) .EQ. 'COMPND' ) THEN CALL STRPOS(LINE,ISTART,ISTOP) COMPND = LINE(7:MIN(200,ISTOP)) ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO C .. read pointer is now on first data line READ(KIN,'(A)',ERR=1,END=2) LINE NRES = 0 ICHAIN = 1 JCHAIN = 1 CALL STRPOS(CHAINS,ISTART,ISTOP) TCHAINS(1:)=CHAINS(ISTART:ISTOP) IPOS=1 JPOS=INDEX(TCHAINS,' ')-1 C last 3 lines not in ebi version 02.98, instead: C CALL GETTOKEN(CHAINS,LEN(CHAINS),1,IPOS,CHAIN) DO WHILE ( IPOS .LE. ISTOP ) C = LINE(12:12) IF ( INDEX(NUMBERS,TCHAINS(1:JPOS) ) .NE. 0 ) THEN READ(TCHAINS(1:JPOS),'(I2)') N C 2 lines in ebi version 02.98, instead: C IF ( INDEX(NUMBERS,CHAIN ) .NE. 0 ) THEN C READ(CHAIN,'(I1)') N IF ( N .EQ. ICHAIN ) THEN CALL READ_DSSPCHAIN(KIN,NRES,CTRANS,RLEN,LINE,SEQ, 1 STRUC,ACC,PDBNO,NREAD,LACCZERO, 2 TRUNCATED,ERROR) NRES = NRES + NREAD ICHAIN = ICHAIN + 1 JCHAIN = JCHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ELSE CALL SKIP_DSSPCHAIN(KIN,RLEN,LINE,ERROR) ICHAIN = ICHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ENDIF ELSE IF ( C .EQ. CHAINS .OR. CHAINS .EQ. ' ') THEN C line in ebi version 02.98, instead: C IF ( C .EQ. CHAIN .OR. CHAINS .EQ. ' ') THEN CALL READ_DSSPCHAIN(KIN,NRES,CTRANS,RLEN,LINE,SEQ, 1 STRUC,ACC,PDBNO,NREAD,LACCZERO, 2 TRUNCATED,ERROR) NRES = NRES + NREAD ICHAIN = ICHAIN + 1 JCHAIN = JCHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ELSE CALL SKIP_DSSPCHAIN(KIN,RLEN,LINE,ERROR) ICHAIN = ICHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ENDIF ENDIF T2CHAINS(1:)=TCHAINS(JPOS+1:) CALL STRPOS(T2CHAINS,ISTART,ISTOP) TCHAINS(1:)=T2CHAINS(ISTART:ISTOP) JPOS=INDEX(TCHAINS,' ')-1 C last 4 lines in ebi version 02.98, instead: C CALL STRPOS(CHAINS,ISTART,ISTOP) C CALL GETTOKEN(CHAINS,LEN(CHAINS),JCHAIN,IPOS,CHAIN) ENDDO IF ( SEQ(NRES:NRES) .EQ. '!' ) THEN SEQ(NRES:NRES) = ' ' STRUC(NRES:NRES) = ' ' NRES = NRES - 1 ENDIF GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') '*** ERROR reading DSSP file (READ_SEQ_FROM_DSSP)' 2 CONTINUE CLOSE(KIN) RETURN END C END READ_SEQ_FROM_DSSP C...................................................................... C...................................................................... C SUB READ_SEQ_FROM_HSSP SUBROUTINE READ_SEQ_FROM_HSSP(KIN,INFILE,CHAINS,CTRANS,RLEN,SEQ, 1 STRUC,ACC,PDBNO,COMPND,NRES,LACCZERO,TRUNCATED,ERROR ) C 14.5.93 C 1 1 O A 0 0 81 11 13 AAAAAAAA IMPLICIT NONE C IMPORT INTEGER KIN, RLEN CHARACTER*(*) CHAINS CHARACTER*(*) INFILE C EXPORT INTEGER NRES INTEGER PDBNO(*), ACC(*) CHARACTER*(*) COMPND, CTRANS, SEQ, STRUC LOGICAL LACCZERO,TRUNCATED,ERROR C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER N,ISTART,ISTOP,IPOS,ICHAIN,JCHAIN,NREAD CHARACTER*1 C,CHAIN CHARACTER*(LINELEN) LINE CHARACTER*10 NUMBERS *----------------------------------------------------------------------* ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KIN,INFILE,'old,readonly',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. LACCZERO = .TRUE. NUMBERS = '0123456789' READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( LINE(1:13) .NE. '## ALIGNMENTS' ) IF (LINE(1:6) .EQ. 'COMPND' ) THEN CALL STRPOS(LINE,ISTART,ISTOP) COMPND = LINE(7:MIN(200,ISTOP)) ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO C .. skip 1 line READ(KIN,'(A)',ERR=1,END=2) LINE C .. read pointer is now on first data line READ(KIN,'(A)',ERR=1,END=2) LINE NRES = 0 ICHAIN = 1 JCHAIN = 1 CALL STRPOS(CHAINS,ISTART,ISTOP) CALL GETTOKEN(CHAINS,LEN(CHAINS),1,IPOS,CHAIN) DO WHILE ( IPOS .LE. ISTOP ) C = LINE(13:13) IF ( INDEX(NUMBERS,CHAIN ) .NE. 0 ) THEN READ(CHAIN,'(I1)') N IF ( N .EQ. ICHAIN ) THEN CALL READ_HSSPCHAIN(KIN,NRES,CTRANS,RLEN,LINE,SEQ, 1 STRUC,ACC,PDBNO,NREAD,LACCZERO, 2 TRUNCATED,ERROR) NRES = NRES + NREAD ICHAIN = ICHAIN + 1 JCHAIN = JCHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ELSE CALL SKIP_HSSPCHAIN(KIN,RLEN,LINE,ERROR) ICHAIN = ICHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ENDIF ELSE IF ( C .EQ. CHAIN ) THEN CALL READ_HSSPCHAIN(KIN,NRES,CTRANS,RLEN,LINE,SEQ, 1 STRUC,ACC,PDBNO,NREAD,LACCZERO, 2 TRUNCATED,ERROR) NRES = NRES + NREAD ICHAIN = ICHAIN + 1 JCHAIN = JCHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ELSE CALL SKIP_HSSPCHAIN(KIN,RLEN,LINE,ERROR) ICHAIN = ICHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ENDIF ENDIF CALL STRPOS(CHAINS,ISTART,ISTOP) CALL GETTOKEN(CHAINS,LEN(CHAINS),JCHAIN,IPOS,CHAIN) ENDDO IF ( SEQ(NRES:NRES) .EQ. '!' ) THEN SEQ(NRES:NRES) = ' ' STRUC(NRES:NRES) = ' ' NRES = NRES - 1 ENDIF GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') '*** ERROR reading HSSP file (read_seq_from_hssp)' 2 CONTINUE CLOSE(KIN) RETURN END C END READ_SEQ_FROM_HSSP C...................................................................... C...................................................................... C SUB READ_STAR SUBROUTINE READ_STAR(KIN,INFILE,CTRANS,RLEN,NRES,SEQ, 1 TRUNCATED,ERROR) C 7.12.93 C*test.star ( test.pep from: 1 to: 13 ) C A A A A A A A A A A A A A IMPLICIT NONE C IMPORT INTEGER KIN, RLEN CHARACTER*(*) CTRANS,INFILE C EXPORT INTEGER NRES CHARACTER*(*) SEQ LOGICAL TRUNCATED,ERROR C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER IPOS, ISTART, ISTOP C INTEGER JSTART, JSTOP CHARACTER*1 C CHARACTER*(LINELEN) LINE C LOGICAL EMPTY *----------------------------------------------------------------------* ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KIN,INFILE,'old,readonly',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF NRES = 0 READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( .NOT. TRUNCATED ) IF ( LINE(1:1) .NE. '*' ) THEN CALL STRPOS(LINE,ISTART,ISTOP) IF ( ISTOP .NE. 0 ) THEN DO IPOS = ISTART,ISTOP C = LINE(IPOS:IPOS) CALL LOWTOUP(C,1) IF ( INDEX(CTRANS,C) .NE. 0 ) THEN TRUNCATED = ( NRES+1 .GT. LEN(SEQ) ) IF ( .NOT. TRUNCATED ) THEN NRES = NRES + 1 SEQ(NRES:NRES) = C ENDIF ENDIF ENDDO ENDIF ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(A)') ' ** ERROR READING STAR FILE **' 2 CONTINUE CLOSE(KIN) RETURN END C END READ_STAR C...................................................................... C...................................................................... C SUB READHSSP SUBROUTINE READHSSP(IUNIT,HSSPFILE,ERROR,MAXRES,MAXALIGNS, + MAXCORE,MAXINS,MAXINSBUFFER,PDBID,HEADER,COMPOUND, + SOURCE,AUTHOR,SEQLENGTH,NCHAIN,KCHAIN,CHAINREMARK, + NALIGN,EXCLUDEFLAG,EMBLID,STRID,IDE,SIM,IFIR,ILAS, + JFIR,JLAS,LALI,NGAP,LGAP,LENSEQ,ACCESSION,PROTNAME, + PDBNO,PDBSEQ,CHAINID,SECSTR,COLS,SHEETLABEL,BP1,BP2, + ACC,NOCC,VAR,ALISEQ,ALIPOINTER,SEQPROF,NDEL,NINS, + ENTROPY,RELENT,CONSWEIGHT,INSNUMBER,INSALI, + INSPOINTER,INSLEN,INSBEG_1,INSBEG_2,INSBUFFER, + LCONSERV,LHSSP_LONG_ID) C C Reinhard Schneider 1989, BIOcomputing EMBL, D-6900 Heidelberg, FRG C please report any bug, e-mail (INTERNET): C schneider@EMBL-Heidelberg.DE C or sander@EMBL-Heidelberg.DE C======================================================================= C INCREASE THE NUMBER OF FOLLOWING THREE PARAMETER IN THE CALLING C PROGRAM IF NECESSARY C======================================================================= C maxaligns = maximal number of alignments in a HSSP-file C maxres= maximal number of residues in a PDB-protein C maxcore= maximal space for storing the alignments C======================================================================= C C maxaa= 20 amino acids C nblocksize= number of alignments in one line C pdbid= Brookhaven Data Bank identifier C header,compound,source,author= informations about the PDB-protein C pdbseq= amino acid sequence of the PDB-protein C chainid= chain identifier (chain A etc.) C secstr= DSSP secondary structure summary C bp1,bp2= beta-bridge partner C cols= DSSP hydrogen bonding patterns for turns and helices, C geometrical bend, chirality, one character name of beta-ladder C and of beta-sheet C sheetlabel= chain identifier of beta bridge partner C seqlength= number of amino acids in the PDB-protein C pdbno= residue number as in PDB file C nchain= number of different chains in pdbid.DSSP data set C kchain= number of chains used in HSSP data set C nalign= number of alignments C acc= solvated residue surface area in A**2 C emblid= EMBL/SWISSPROT identifier of the alignend protein C strid= if the 3-D structure of this protein is known, then strid C (structure ID)is the Protein Data Bank identifier as taken C from the EMBL/SWISSPROT entry C protname= one line description of alignend protein C aliseq= sequential storage for the alignments C alipointer= points to the beginning of alignment X ( 1>= X <=nalign ) C ifir,ilas= first and last position of the alignment in the test C protein C jfir,jlas= first and last position of the alignment in the alignend C protein C lali= length of the alignment excluding insertions and deletions C ngap= number of insertions and deletions in the alignment C lgap= total length of all insertions and deletions C lenseq= length of the entire sequence of the alignend protein C ide= percentage of residue identity of the alignment C var= sequence variability as derived from the nalign alignments C seqprof= relative frequency for each of the 20 amino acids C nocc= number of alignend sequences spanning this position (including C the test sequence C ndel= number of sequences with a deletion in the test protein at this C position C nins= number of sequences with an insertion in the test protein at C this position C entropy= entropy measure of sequence variability at this position C relent= relative entropy (entropy normalized to the range 0-100) C consweight= conservation weight C======================================================================= IMPLICIT NONE INTEGER MAXALIGNS,MAXRES,MAXCORE,MAXINS,MAXAA,NBLOCKSIZE INTEGER MAXINSBUFFER PARAMETER (MAXAA= 20) PARAMETER (NBLOCKSIZE= 70) C============================ import ================================== CHARACTER HSSPFILE*(*) INTEGER IUNIT LOGICAL ERROR C ATTRIBUTES OF SEQUENCE WITH KNOWN STRUCTURE CHARACTER*(*) PDBID,HEADER,COMPOUND,SOURCE,AUTHOR CHARACTER PDBSEQ(MAXRES),CHAINID(MAXRES),SECSTR(MAXRES) C.......LENGHT*7 CHARACTER*(*) COLS(MAXRES),CHAINREMARK CHARACTER SHEETLABEL(MAXRES) INTEGER SEQLENGTH,PDBNO(MAXRES),NCHAIN,KCHAIN,NALIGN INTEGER BP1(MAXRES),BP2(MAXRES),ACC(MAXRES) C ATTRIBUTES OF ALIGNEND SEQUENCES CHARACTER*(*) EMBLID(MAXALIGNS),STRID(MAXALIGNS) CHARACTER*(*) ACCESSION(MAXALIGNS),PROTNAME(MAXALIGNS) CHARACTER ALISEQ(MAXCORE) CHARACTER EXCLUDEFLAG(MAXALIGNS) INTEGER ALIPOINTER(MAXALIGNS), + IFIR(MAXALIGNS),ILAS(MAXALIGNS),JFIR(MAXALIGNS), + JLAS(MAXALIGNS),LALI(MAXALIGNS),NGAP(MAXALIGNS), + LGAP(MAXALIGNS),LENSEQ(MAXALIGNS) REAL IDE(MAXALIGNS),SIM(MAXALIGNS) C ATTRIBUTES OF PROFILE INTEGER VAR(MAXRES),SEQPROF(MAXRES,MAXAA),RELENT(MAXRES), + NOCC(MAXRES),NDEL(MAXRES),NINS(MAXRES), + INSNUMBER,INSALI(MAXINS),INSPOINTER(MAXINS), + INSLEN(MAXINS),INSBEG_1(MAXINS),INSBEG_2(MAXINS) REAL ENTROPY(MAXRES),CONSWEIGHT(MAXRES) CHARACTER INSBUFFER(MAXINSBUFFER) LOGICAL LCONSERV,LHSSP_LONG_ID C======================================================================= C internal INTEGER MAXALIGNS_LOC PARAMETER (MAXALIGNS_LOC= 8765) C PARAMETER (MAXALIGNS_LOC= 19999) CHARACTER CTEMP*(NBLOCKSIZE),TEMPNAME*200 CHARACTER*200 LINE CHARACTER CHAINSELECT LOGICAL LCHAIN INTEGER ICHAINBEG,ICHAINEND,NALIGNORG, + I,J,K,IPOS,ILEN,NRES,IRES,NBLOCK,IALIGN,IBLOCK, + IALI,IBEG,IEND,IPOINTER(300000),IPOINT,IINS C---- ------------------------------------------------------------ C---- initialise C---- ------------------------------------------------------------ C ORDER OF AMINO ACID SYMBOLS IN THE HSSP SEQUENCE PROFILE BLOCK C PROFILESEQ='VLIMFWYGAPSTCHRKQEND' ERROR=.FALSE. NALIGN=0 CHAINREMARK=' ' CHAINSELECT=' ' DO I=1,MAXINSBUFFER INSBUFFER(I)=' ' ENDDO DO I=1,MAXALIGNS_LOC IPOINTER(I)=0 ENDDO LCHAIN=.FALSE. LHSSP_LONG_ID = .FALSE. TEMPNAME(1:)=HSSPFILE I=INDEX(TEMPNAME,'_!_') J=INDEX(TEMPNAME,'hssp_') IF (I.NE.0) THEN TEMPNAME(1:)=HSSPFILE(1:I-1) LCHAIN=.TRUE. READ(HSSPFILE(I+3:),'(A1)')CHAINSELECT WRITE(6,*)'*** ReadHSSP: extract the chain: ',chainselect ELSE IF (J.NE.0) THEN TEMPNAME(1:)=HSSPFILE(1:J+3) LCHAIN=.TRUE. READ(HSSPFILE(J+5:),'(A1)')CHAINSELECT WRITE(6,*)'*** ReadHSSP: extract the chain: ',chainselect ENDIF CALL OPEN_FILE(IUNIT,TEMPNAME,'old,readonly',error) IF (ERROR) THEN WRITE(6,'(A,A)')'*** ERROR READHSSP failed opening file:', + TEMPNAME GOTO 99 END IF READ(IUNIT,'(A)',ERR=99)LINE C check if it is a HSSP-file and get the release number for format flags IF (LINE(1:4).NE.'HSSP') THEN WRITE(6,'(A)')' ERROR: is not a HSSP-file' ERROR=.TRUE. RETURN ENDIF C read in PDBID etc. DO WHILE(LINE(1:6).NE.'PDBID') READ(IUNIT,'(A)',ERR=99)LINE ENDDO READ(LINE,'(11X,A)',ERR=99)PDBID DO WHILE(LINE(1:6).NE.'HEADER') READ(IUNIT,'(A)',ERR=99)LINE IF (INDEX(LINE,'LONG-ID').NE.0) THEN IF (INDEX(LINE,'YES').NE.0) THEN LHSSP_LONG_ID = .TRUE. ENDIF ENDIF ENDDO READ(LINE ,'(11X,A)',ERR=99)HEADER READ(IUNIT,'(11X,A)',ERR=99)COMPOUND READ(IUNIT,'(11X,A)',ERR=99)SOURCE READ(IUNIT,'(11X,A)',ERR=99)AUTHOR READ(IUNIT,'(11X,I4)',ERR=99)SEQLENGTH READ(IUNIT,'(11X,I4)',ERR=99)NCHAIN IF (CHAINSELECT .NE. ' ')NCHAIN=1 KCHAIN=NCHAIN READ(IUNIT,'(A)',ERR=99)LINE IF (INDEX(LINE,'KCHAIN').NE.0) THEN READ(LINE,'(11X,I4,A)',ERR=99)KCHAIN,CHAINREMARK READ(IUNIT,'(11X,I4)',ERR=99)NALIGNORG ELSE READ(LINE,'(11X,I4)',ERR=99)NALIGNORG ENDIF C if HSSP-file contains no alignments return IF (NALIGNORG.EQ.0) THEN WRITE(6,'(A)')'*** HSSP-file contains no alignments ***' CLOSE(IUNIT) c error=.true. RETURN ENDIF C parameter overflow handling IF (NALIGNORG.GT.MAXALIGNS) THEN WRITE(6,'(A)')'*** HSSP-file contains too many alignments **' WRITE(6,'(A)')'*** INCREASE MAXALIGNS IN COMMOM BLOCK ***' CLOSE(IUNIT) ERROR=.TRUE. RETURN ENDIF IF (NALIGNORG .GT. MAXALIGNS_LOC) THEN WRITE(6,*)'*** READHSSP: MAXALIGNS overflow, increase to >', + NALIGNORG STOP ENDIF IF (SEQLENGTH+KCHAIN-1.GT.MAXRES) THEN WRITE(6,'(A)')'*** PDB-sequence in HSSP-file too long ***' WRITE(6,'(A)')'*** INCREASE MAXRES ***' WRITE(6,'(A,I6,A,I6)') + 'need: ',seqlength+kchain-1,' limit is: ',maxres CLOSE(IUNIT) ERROR=.TRUE. RETURN ENDIF C number of sequence positions is number of residues + number of chains C chain break is indicated by a '!' NRES=SEQLENGTH+KCHAIN-1 ICHAINBEG=1 ICHAINEND=NRES IF (LCHAIN) THEN C search for ALIGNMENT-block DO WHILE (LINE(1:13).NE.'## ALIGNMENTS') READ(IUNIT,'(A)',ERR=99)LINE ENDDO READ(IUNIT,'(A)',ERR=99)LINE ICHAINBEG=0 ICHAINEND=0 C read till end ; some PDB-chains have DSSP-chain breaks !! DO I=1,NRES READ(IUNIT,'(7X,I4,1X,A1)',ERR=99)PDBNO(I),CHAINID(I) IF (CHAINID(I) .EQ. CHAINSELECT) THEN IF (ICHAINBEG .EQ. 0)ICHAINBEG=I ICHAINEND=I ENDIF ENDDO C increment chain number for artificial chain breaks DO I=ICHAINBEG,ICHAINEND IF (CHAINID(I) .NE. CHAINSELECT)NCHAIN=NCHAIN+1 ENDDO REWIND(IUNIT) SEQLENGTH=ICHAINEND-ICHAINBEG+1 NRES=SEQLENGTH+NCHAIN-1 ENDIF C search for the PROTEINS block LINE=' ' DO WHILE(LINE(1:11).NE.'## PROTEINS') READ(IUNIT,'(A)',ERR=99)LINE ENDDO READ(IUNIT,'(A)',ERR=99)LINE LCONSERV=.FALSE. IF (INDEX(LINE,'%WSIM').NE.0)LCONSERV=.TRUE. C read data about the alignments IALIGN=1 DO I=1,NALIGNORG IF ( LHSSP_LONG_ID) THEN READ(IUNIT,50,ERR=99) + EXCLUDEFLAG(IALIGN),EMBLID(IALIGN)(1:),STRID(IALIGN), + IDE(IALIGN),SIM(IALIGN),IFIR(IALIGN),ILAS(IALIGN), + JFIR(IALIGN),JLAS(IALIGN),LALI(IALIGN),NGAP(IALIGN), + LGAP(IALIGN),LENSEQ(IALIGN),ACCESSION(IALIGN), + PROTNAME(IALIGN) ELSE READ(IUNIT,100,ERR=99) + EXCLUDEFLAG(IALIGN),EMBLID(IALIGN)(1:),STRID(IALIGN), + IDE(IALIGN),SIM(IALIGN),IFIR(IALIGN),ILAS(IALIGN), + JFIR(IALIGN),JLAS(IALIGN),LALI(IALIGN),NGAP(IALIGN), + LGAP(IALIGN),LENSEQ(IALIGN),ACCESSION(IALIGN), + PROTNAME(IALIGN) ENDIF IF (IFIR(IALIGN) .GE. ICHAINBEG .AND. + ILAS(IALIGN) .LE. ICHAINEND) THEN IPOINTER(I)=IALIGN IALIGN=IALIGN+1 ELSE WRITE(6,*)'READHSSP INFO: skip alignment: ',i ENDIF ENDDO 50 FORMAT (5X,A1,2X,A40,A6,1X,F5.2,1X,F5.2,8(1X,I4),2X,A10,1X,A) 100 FORMAT (5X,A1,2X,A12,A6,1X,F5.2,1X,F5.2,8(1X,I4),2X,A10,1X,A) NALIGN=IALIGN-1 WRITE(6,*)'--- number of alignments: ',nalign WRITE(6,*)'--- PROTEINS block done' C init pointer ; aliseq contains the alignments (amino acid symbols) C stored in the following way ; '/' separates alignments C alignment(x) is stored from: C aliseq(alipointer(x)) to aliseq(ilas(x)-ifir(x)) C aliseq(1........46/48.........60/62....) C | | | C | | | C pointer pointer pointer C ali 1 ali 2 ali 3 C C C init pointer IPOS=1 DO I=1,NALIGN IF (IPOS.GE.MAXCORE) THEN WRITE(6,'(A)')'*** ERROR: INCREASE MAXCORE ***' STOP ENDIF ALIPOINTER(I)=IPOS ILEN=ILAS(I)-IFIR(I)+1 IPOS=IPOS+ILEN ALISEQ(IPOS)='/' IPOS=IPOS+1 ENDDO IF (NALIGN .LT. MAXALIGNS) THEN ALIPOINTER(NALIGN+1)=IPOS+1 ENDIF C number of ALIGNMENTS-blocks IF (MOD(FLOAT(NALIGNORG),FLOAT(NBLOCKSIZE)).EQ. 0.0) THEN NBLOCK=NALIGNORG/NBLOCKSIZE ELSE NBLOCK=NALIGNORG/NBLOCKSIZE+1 ENDIF C search for ALIGNMENT-block DO WHILE (LINE(1:13).NE.'## ALIGNMENTS') READ(IUNIT,'(A)',ERR=99)LINE ENDDO READ(IUNIT,'(A)',ERR=99)LINE C loop over ALIGNMENTS blocks C read in pdbno, chainid, secstr etc. IALIGN=0 IALI=0 DO IBLOCK=1,NBLOCK IRES=1 DO I=1,NRES READ(IUNIT,200,ERR=99) + PDBNO(IRES),CHAINID(IRES),PDBSEQ(IRES),SECSTR(IRES), + COLS(IRES),BP1(IRES),BP2(IRES),SHEETLABEL(IRES), + ACC(IRES),NOCC(IRES),VAR(IRES),CTEMP 200 FORMAT(7X,I4,2(1X,A1),2X,A1,1X,A7,2(I4),A1,I4,2(1X,I4),2X,A) C fill up aliseq IF (I .GE. ICHAINBEG .AND. I .LE. ICHAINEND) THEN IRES=IRES+1 IF (PDBSEQ(I) .NE. '!') THEN CALL STRPOS(CTEMP,IBEG,IEND) DO IPOS=MAX(IBEG,1),MIN(NBLOCKSIZE,IEND) IALI=IALIGN+IPOS IF (CTEMP(IPOS:IPOS) .NE. ' '.AND. + IPOINTER(IALI) .GT. 0) THEN IF (IPOINTER(IALI) .LE. 0 ) THEN WRITE(6,*)'*** READHSSP: ipointer=', + ipointer(iali), + 'iali,ialign,ipos=',iali,ialign,ipos ENDIF J=ALIPOINTER(IPOINTER(IALI)) + + (I-IFIR(IPOINTER(IALI))) ALISEQ(J)=CTEMP(IPOS:IPOS) ENDIF ENDDO ENDIF ENDIF ENDDO IALIGN=IALIGN+NBLOCKSIZE DO K=1,2 READ(IUNIT,'(A)',ERR=99)LINE ENDDO ENDDO WRITE(6,*)' ALIGNMENTS block done' C read in sequence profile, entropy etc. IRES=1 DO I=1,NRES READ(IUNIT,300,ERR=99)(SEQPROF(IRES,K),K=1,MAXAA), + NOCC(IRES),NDEL(IRES),NINS(IRES),ENTROPY(IRES), + RELENT(IRES),CONSWEIGHT(IRES) IF (I .GE. ICHAINBEG .AND. I .LE. ICHAINEND) THEN IRES=IRES+1 ENDIF ENDDO 300 FORMAT(12X,20(I4),1X,3(1X,I4),1X,F7.3,3X,I4,2X,F4.2) WRITE(6,*)' PROFILE block done' IF (LCHAIN) THEN DO I=1,NALIGN IFIR(I)=IFIR(I)-ICHAINBEG+1 ILAS(I)=ILAS(I)-ICHAINBEG+1 ENDDO ENDIF C read the insertion list COLD check if next line (last line in a HSSP-file) contains a '//' READ(IUNIT,'(A)',ERR=99)LINE IF (INDEX (LINE,'## INSERTION') .NE. 0) THEN READ(IUNIT,'(A)',ERR=99)LINE READ(IUNIT,'(A)',ERR=99)LINE IINS=0 IPOINT=1 DO WHILE (LINE(1:2) .NE. '//') CALL STRPOS(LINE,IBEG,IEND) IF (LINE(6:6) .NE. '+') THEN IF (IINS+1 .GT. MAXINS) THEN WRITE(6,*)'*** ERROR: MAXINS OVERFLOW, INCREASE !' GOTO 99 ENDIF IINS=IINS+1 INSPOINTER(IINS)=IPOINT READ(LINE,'(4(I6))')INSALI(IINS),INSBEG_1(IINS), + INSBEG_2(IINS),INSLEN(IINS) IF (IPOINT + INSLEN(IINS)+3 .GT. MAXINSBUFFER) THEN WRITE(6,*) + '*** ERROR: MAXINSBUFFER overflow, increase !' GOTO 99 c else c insbuffer(ipoint:)=line(26:iend) c ipoint=ipoint+inslen(iins)+3 ENDIF c else c call strpos(insbuffer,ipos,jpos) c insbuffer(jpos+1:)=line(26:iend) ENDIF c changed DO I=26,IEND INSBUFFER(IPOINT)=LINE(I:I) IPOINT=IPOINT+1 ENDDO c end change READ(IUNIT,'(A)',ERR=99)LINE ENDDO WRITE(6,*)' INSERTION list done' INSNUMBER=IINS ELSE IF (LINE(1:2) .NE. '//') THEN WRITE(6,'(A,A)')'*** READHSSP: missing line "//"' GOTO 99 ENDIF CLOSE(IUNIT) CALL STRPOS(HSSPFILE,IBEG,IEND) WRITE(6,'(A,A,A)')' ReadHSSP: ',HSSPFILE(IBEG:IEND),' OK' RETURN 99 WRITE(6,'(A,A)')'*** ERROR in READHSSP reading: ',HSSPFILE ERROR=.TRUE. NALIGN=0 SEQLENGTH=0 RETURN END C END READHSSP C...................................................................... C...................................................................... C SUB READPROFILE SUBROUTINE READPROFILE(KPROF,PROFILENAME,MAXRES,NTRANS,TRANS, + LDSSP,NRES,NCHAIN,HSSPID,HEADER,COMPOUND,SOURCE, + AUTHOR,SMIN,SMAX,MAPLOW,MAPHIGH,METRICFILE,PDBNO, + CHAINID,SEQ,STRUC,ACC,COLS,SHEETLABEL,BP1,BP2, + NOCC,GAPOPEN,GAPELONG,CONSWEIGHT,PROFILEMETRIC, + MAXBOX,NBOX,PROFILEBOX) IMPLICIT NONE C order of amino acids INTEGER NTRANS CHARACTER*(*) TRANS LOGICAL LDSSP INTEGER NACID PARAMETER (NACID= 20) INTEGER KPROF,MAXRES,NRES,ACC(MAXRES),BP1(MAXRES), + BP2(MAXRES),NOCC(MAXRES),NCHAIN,PDBNO(MAXRES) REAL PROFILEMETRIC(MAXRES,NTRANS),GAPOPEN(MAXRES), + GAPELONG(MAXRES),CONSWEIGHT(MAXRES), + SMIN,SMAX,MAPLOW,MAPHIGH CHARACTER*(*) HSSPID,HEADER,COMPOUND,SOURCE,AUTHOR,METRICFILE, + PROFILENAME,SEQ(MAXRES),STRUC(MAXRES), + CHAINID(MAXRES) CHARACTER*7 COLS(MAXRES) CHARACTER*1 SHEETLABEL(MAXRES) CHARACTER*300 LINE INTEGER MAXBOX,NBOX,PROFILEBOX(MAXBOX,2) C internal INTEGER I,J,K,IBOX CHARACTER CDIVIDE1,CDIVIDE2 LOGICAL LERROR *----------------------------------------------------------------------* C init LDSSP=.FALSE. LINE=' ' CDIVIDE1=':' CDIVIDE2='-' SMIN=0.0 SMAX=0.0 MAPLOW=0.0 MAPHIGH=0.0 DO I=1,MAXRES PDBNO(I)=0 CHAINID(I)=' ' SEQ(I)=' ' STRUC(I)=' ' COLS(I)=' ' BP1(I)=0 BP2(I)=0 SHEETLABEL(I)=' ' ACC(I)=0 NOCC(I)=0 GAPOPEN(I)=0.0 GAPELONG(I)=0.0 CONSWEIGHT(I)=0.0 DO J=1,NTRANS PROFILEMETRIC(I,J)=0.0 ENDDO ENDDO NBOX=1 DO I=1,MAXBOX PROFILEBOX(I,1)=0 PROFILEBOX(I,2)=0 ENDDO C====================================================================== CALL OPEN_FILE(KPROF,PROFILENAME,'OLD,RECL=2000,readonly', + LERROR) READ(KPROF,'(A)')LINE IF (INDEX(LINE,'-PROFILE').EQ.0) THEN WRITE(6,'(A,A)') + '*** ERROR: file is not a proper MAXHOM-PROFILE: ',profilename STOP ELSE IF (INDEX(LINE,'SECONDARY').NE.0) THEN LDSSP=.TRUE. ENDIF ENDIF C search for keywords C "SMIN" and "SMAX" scale metric C "MAPLOW" and "MAPHIGH" C if MAPLOW and MAPHIGH are specified the profile is rescaled C such that the profile values are mapped between MAPLOW and C MAPHIGH to ingnore outsider values C (fx. MAPHIGH=mean-value + standart-deviation) DO WHILE(INDEX(LINE,'SeqNo PDBNo AA STRUCTURE BP1 BP2').EQ.0) LINE=' ' READ(KPROF,'(A)')LINE c read(kprof,'(a)',end=999)line CALL EXTRACT_STRING(LINE,CDIVIDE1,'ID',HSSPID) CALL EXTRACT_STRING(LINE,CDIVIDE1,'HEADER',HEADER) CALL EXTRACT_STRING(LINE,CDIVIDE1,'COMPOUND',COMPOUND) CALL EXTRACT_STRING(LINE,CDIVIDE1,'SOURCE',SOURCE) CALL EXTRACT_STRING(LINE,CDIVIDE1,'AUTHOR',AUTHOR) CALL EXTRACT_STRING(LINE,CDIVIDE1,'METRIC',METRICFILE) CALL EXTRACT_INTEGER(LINE,CDIVIDE1,'NRES',NRES) CALL EXTRACT_INTEGER(LINE,CDIVIDE1,'NCHAIN',NCHAIN) CALL EXTRACT_INTEGER(LINE,CDIVIDE1,'NBOX',NBOX) CALL EXTRACT_REAL(LINE,CDIVIDE1,'SMIN',SMIN) CALL EXTRACT_REAL(LINE,CDIVIDE1,'SMAX',SMAX) CALL EXTRACT_REAL(LINE,CDIVIDE1,'MAPLOW',MAPLOW) CALL EXTRACT_REAL(LINE,CDIVIDE1,'MAPHIGH',MAPHIGH) ENDDO C read BOX definition IF (NBOX .GT. 1) THEN REWIND(KPROF) LINE=' ' IBOX=0 DO WHILE(INDEX(LINE,'SeqNo PDBNo AA STRUCTURE BP1 BP2').EQ.0) LINE=' ' READ(KPROF,'(A)',END=999)LINE IF (LINE(1:3).EQ.'BOX') THEN IBOX=IBOX+1 CALL EXTRACT_INTEGER_RANGE(LINE,CDIVIDE1,CDIVIDE2, + PROFILEBOX(IBOX,1)) ENDIF ENDDO IF (IBOX .NE. NBOX) THEN WRITE(6,*)' ERROR: number of boxes does not match number'// + ' of box specification' WRITE(6,*)NBOX,IBOX STOP ENDIF ELSE PROFILEBOX(NBOX,1)=1 PROFILEBOX(NBOX,2)=NRES ENDIF LINE=' ' I=0 READ(KPROF,'(A)')LINE DO WHILE(LINE(1:2).NE.'//') I=I+1 IF (I.GT.MAXRES) THEN WRITE(6,'(A)') + ' *** ERROR IN READROFILE: NRES.GT.MAXRES' STOP ENDIF c WRITE(6,*)line c read(line,100,end=999)pdbno(i),chainid(i),seq(i), c + struc(i),cols(i),bp1(i),bp2(i),sheetlabel(i),acc(i), c + nocc(i),gapopen(i),gapelong(i),consweight(i), c + (profilemetric(i,j),j=1,nacid) c read(line,100,err=999,end=999)pdbno(i),chainid(i),seq(i), READ(LINE,100)PDBNO(I),CHAINID(I),SEQ(I), + STRUC(I),COLS(I),BP1(I),BP2(I),SHEETLABEL(I),ACC(I), + NOCC(I),GAPOPEN(I),GAPELONG(I),CONSWEIGHT(I), + (PROFILEMETRIC(I,J),J=1,NACID) 100 FORMAT(6X,1X,I4,1X,A1,1X,A1,2X,A1,1X,A7,2(I4),A1,2(I4,1X), + 2(F6.2),F7.2,20(F8.3)) READ(KPROF,'(A)')LINE ENDDO IF (I .NE. NRES) THEN WRITE(6,*) ' ********************************************' WRITE(6,*) ' FATAL ERROR' WRITE(6,*) ' Heee, number of positions read in is: ',i WRITE(6,*) ' NRES in Header is: ',nres STOP ENDIF CLOSE(KPROF) C add 'B' 'Z' 'X' '!' '-' '.' I=INDEX(TRANS,'N') J=INDEX(TRANS,'B') DO K=1,NRES PROFILEMETRIC(K,J)=PROFILEMETRIC(K,I) ENDDO I=INDEX(TRANS,'Q') J=INDEX(TRANS,'Z') DO K=1,NRES PROFILEMETRIC(K,J)=PROFILEMETRIC(K,I) ENDDO RETURN C read error 999 CLOSE(KPROF) WRITE(6,*)'*** ERROR: read error in MAXHOM-PROFILE' NRES=0 RETURN END C END READROFILE C...................................................................... C...................................................................... C SUB RECEIVE_DATA_FROM_HOST SUBROUTINE RECEIVE_DATA_FROM_HOST(ILINK) C node routine: get all relevant information about sequence 1 and C control flow C IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' INTEGER ILINK C internal CHARACTER CPARSYTEC_BUG*(MAXSQ) INTEGER ISIZE,I INTEGER ILBACKWARD,ILINSERT_2,ILISTOFSEQ_2,ILSHOW_SAMESEQ, + ILSWISSBASE,ILDSSP_1,ILCONSERV_1,ILCONSERV_2, + ILCONSIMPORT,ILALL,ILFORMULA,ILTHRESHOLD, + ILCOMPSTR,ILPASS2,ILTRACE,ILONG_OUT,ILBATCH, + I3WAY,I3WAYDONE,IWARM_START,IBINARY C INTEGER ILMIXED_ARCH, C init logicals ILBACKWARD=0 ILINSERT_2=0 ILISTOFSEQ_2=0 ILSHOW_SAMESEQ=0 ILSWISSBASE=0 ILDSSP_1=0 ILCONSERV_1=0 ILCONSERV_2=0 ILCONSIMPORT=0 ILALL=0 ILFORMULA=0 ILTHRESHOLD=0 ILCOMPSTR=0 ILPASS2=0 ILTRACE=0 ILONG_OUT=0 ILBATCH=0 C ILMIXED_ARCH=0 I3WAY=0 I3WAYDONE=0 IWARM_START=0 LBACKWARD = .FALSE. LINSERT_2 = .FALSE. LISTOFSEQ_2 = .FALSE. LSHOW_SAMESEQ = .FALSE. LSWISSBASE = .FALSE. LDSSP_1 = .FALSE. LCONSERV_1 = .FALSE. LCONSERV_2 = .FALSE. LCONSIMPORT = .FALSE. LALL = .FALSE. LFORMULA = .FALSE. LTHRESHOLD = .FALSE. LCOMPSTR = .FALSE. LPASS2 = .FALSE. LTRACE = .FALSE. LONG_OUT = .FALSE. LBATCH = .FALSE. L3WAY=.FALSE. L3WAYDONE=.FALSE. LWARM_START=.FALSE. LBINARY=.FALSE. C LMIXED_ARCH=.FALSE. C INIT WRITE(6,*)' receive data start 1: ',idproc CALL FLUSH_UNIT(6) MSGTYPE=1 c if (mp_model .eq. 'PARIX') then ; msgtype=idtop ; endif CALL MP_RECEIVE_DATA(MSGTYPE,LINK(ID_HOST)) CALL MP_GET_INT4(MSGTYPE,ILINK,ID_HOST,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,N1,N_ONE) IF (N1 .GT. 0) THEN ISIZE=N1 CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,LSQ_1,ISIZE) CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,LSTRUC_1,ISIZE) CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,LSTRCLASS_1,ISIZE) CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,LACC_1,ISIZE) ISIZE=MAXBREAK CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,IBREAKPOS_1,ISIZE) CALL MP_GET_INT4(MSGTYPE,ILINK,NBREAK_1,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,NBEST,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,IPROFBEG,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,IPROFEND,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,PROFILEMODE,N_ONE) ISIZE=NASCII CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,TRANSPOS,ISIZE) ISIZE=MAXCUTOFFSTEPS CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,ISOLEN,ISIZE) CALL MP_GET_INT4(MSGTYPE,ILINK,NSTEP,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ISAFE,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,NSTRSTATES_1,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,NSTRSTATES_2,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,NIOSTATES_1,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,NIOSTATES_2,N_ONE) ISIZE=N1 CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,PDBNO_1,ISIZE) ISIZE=MAXCUTOFFSTEPS CALL MP_GET_REAL4_ARRAY(MSGTYPE,ILINK,ISOIDE,ISIZE) ISIZE=N1 CALL MP_GET_REAL4_ARRAY(MSGTYPE,ILINK,GAPOPEN_1,ISIZE) CALL MP_GET_REAL4_ARRAY(MSGTYPE,ILINK,GAPELONG_1,ISIZE) CALL MP_GET_REAL4(MSGTYPE,ILINK,OPEN_1,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,ELONG_1,N_ONE) CALL MP_GET_REAL4_ARRAY(MSGTYPE,ILINK,CONSWEIGHT_1,ISIZE) ISIZE=MAXSQ*NTRANS CALL MP_GET_REAL4_ARRAY(MSGTYPE,ILINK,SIMMETRIC_1,ISIZE) IF (PROFILEMODE .EQ. 6) THEN ISIZE= NTRANS * NTRANS * MAXSTRSTATES * MAXIOSTATES * + MAXSTRSTATES*MAXIOSTATES CALL MP_GET_REAL4_ARRAY(MSGTYPE,ILINK,SIMORG,ISIZE) ENDIF CALL MP_GET_REAL4(MSGTYPE,ILINK,FILTER_VAL,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,PUNISH,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,CUTVALUE1,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,CUTVALUE2,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,SMIN,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,SMAX,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,MAPLOW,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,MAPHIGH,N_ONE) ISIZE=MAXSTRSTATES*MAXIOSTATES CALL MP_GET_REAL4_ARRAY(MSGTYPE,ILINK,IORANGE,ISIZE) ISIZE=LEN(MP_MODEL) CALL MP_GET_STRING(MSGTYPE,ILINK,MP_MODEL,ISIZE) ISIZE=LEN(SPLIT_DB_PATH) CALL MP_GET_STRING(MSGTYPE,ILINK,SPLIT_DB_PATH,ISIZE) ISIZE=LEN(SPLIT_DB_DATA) CALL MP_GET_STRING(MSGTYPE,ILINK,SPLIT_DB_DATA,ISIZE) ISIZE=LEN(SWISSPROT_SEQ) CALL MP_GET_STRING(MSGTYPE,ILINK,SWISSPROT_SEQ,ISIZE) ISIZE=LEN(LISTFILE_2) CALL MP_GET_STRING(MSGTYPE,ILINK,LISTFILE_2,ISIZE) ISIZE=LEN(CSQ_1) CALL MP_GET_STRING(MSGTYPE,ILINK,CSQ_1,ISIZE) c isize=MAXSQ c call mp_get_string_array(msgtype,ilink,struc_1,isize) c call mp_get_string_array(msgtype,ilink,chainid_1,isize) ISIZE=N1 CALL MP_GET_STRING(MSGTYPE,ILINK,CPARSYTEC_BUG,ISIZE) DO I=1,N1 STRUC_1(I)=CPARSYTEC_BUG(I:I) ENDDO CALL MP_GET_STRING(MSGTYPE,ILINK,CPARSYTEC_BUG,ISIZE) DO I=1,N1 CHAINID_1(I)=CPARSYTEC_BUG(I:I) ENDDO ISIZE=LEN(OPENWEIGHT_ANSWER) CALL MP_GET_STRING(MSGTYPE,ILINK,OPENWEIGHT_ANSWER,ISIZE) ISIZE=LEN(ELONGWEIGHT_ANSWER) CALL MP_GET_STRING(MSGTYPE,ILINK,ELONGWEIGHT_ANSWER,ISIZE) ISIZE=LEN(SMIN_ANSWER) CALL MP_GET_STRING(MSGTYPE,ILINK,SMIN_ANSWER,ISIZE) ISIZE=LEN(NAME_1) CALL MP_GET_STRING(MSGTYPE,ILINK,NAME_1,ISIZE) ISIZE=LEN(HSSPID_2) CALL MP_GET_STRING(MSGTYPE,ILINK,HSSPID_2,ISIZE) ISIZE=LEN(CSORTMODE) CALL MP_GET_STRING(MSGTYPE,ILINK,CSORTMODE,ISIZE) ISIZE=LEN(METRICFILE) CALL MP_GET_STRING(MSGTYPE,ILINK,METRICFILE,ISIZE) ISIZE=LEN(CURRENT_DIR) CALL MP_GET_STRING(MSGTYPE,ILINK,CURRENT_DIR,ISIZE) ISIZE=LEN(DSSP_PATH) CALL MP_GET_STRING(MSGTYPE,ILINK,DSSP_PATH,ISIZE) ISIZE=LEN(PDBPATH) CALL MP_GET_STRING(MSGTYPE,ILINK,PDBPATH,ISIZE) ISIZE=LEN(PLOTFILE) CALL MP_GET_STRING(MSGTYPE,ILINK,PLOTFILE,ISIZE) ISIZE=LEN(COREPATH) CALL MP_GET_STRING(MSGTYPE,ILINK,COREPATH,ISIZE) ISIZE=LEN(COREFILE) CALL MP_GET_STRING(MSGTYPE,ILINK,COREFILE,ISIZE) ISIZE=LEN(TRANS) CALL MP_GET_STRING(MSGTYPE,ILINK,TRANS,ISIZE) ISIZE=LEN(STRTRANS) CALL MP_GET_STRING(MSGTYPE,ILINK,STRTRANS,ISIZE) ISIZE=LEN(CSTRSTATES) CALL MP_GET_STRING(MSGTYPE,ILINK,CSTRSTATES,ISIZE) ISIZE=LEN(CIOSTATES) CALL MP_GET_STRING(MSGTYPE,ILINK,CIOSTATES,ISIZE) DO I=1,MAXSTRSTATES ISIZE=LEN(STR_CLASSES(I)) CALL MP_GET_STRING(MSGTYPE,ILINK,STR_CLASSES(I),ISIZE) ENDDO C CALL MP_GET_INT4(MSGTYPE,ILINK,ILMIXED_ARCH,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILBACKWARD,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILINSERT_2,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILISTOFSEQ_2,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILSHOW_SAMESEQ,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILSWISSBASE,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILDSSP_1,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILCONSERV_1,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILCONSERV_2,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILCONSIMPORT,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILALL,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILFORMULA,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILTHRESHOLD,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILCOMPSTR,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILPASS2,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILTRACE,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILONG_OUT,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILBATCH,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,I3WAY,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,I3WAYDONE,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,IWARM_START,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,IBINARY,N_ONE) C IF ( ILMIXED_ARCH .EQ. 1 )LMIXED_ARCH = .TRUE. IF ( ILBACKWARD .EQ. 1 )LBACKWARD = .TRUE. IF ( ILINSERT_2 .EQ. 1 )LINSERT_2 = .TRUE. IF ( ILISTOFSEQ_2 .EQ. 1 )LISTOFSEQ_2 = .TRUE. IF ( ILSHOW_SAMESEQ .EQ. 1 )LSHOW_SAMESEQ = .TRUE. IF ( ILSWISSBASE .EQ. 1 )LSWISSBASE = .TRUE. IF ( ILDSSP_1 .EQ. 1 )LDSSP_1 = .TRUE. IF ( ILCONSERV_1 .EQ. 1 )LCONSERV_1 = .TRUE. IF ( ILCONSERV_2 .EQ. 1 )LCONSERV_2 = .TRUE. IF ( ILCONSIMPORT .EQ. 1 )LCONSIMPORT = .TRUE. IF ( ILALL .EQ. 1 )LALL = .TRUE. IF ( ILFORMULA .EQ. 1 )LFORMULA = .TRUE. IF ( ILTHRESHOLD .EQ. 1 )LTHRESHOLD = .TRUE. IF ( ILCOMPSTR .EQ. 1 )LCOMPSTR = .TRUE. IF ( ILPASS2 .EQ. 1 )LPASS2 = .TRUE. IF ( ILTRACE .EQ. 1 )LTRACE = .TRUE. IF ( ILONG_OUT .EQ. 1 )LONG_OUT = .TRUE. IF ( ILBATCH .EQ. 1 )LBATCH = .TRUE. IF ( I3WAY .EQ. 1 )L3WAY = .TRUE. IF ( I3WAYDONE .EQ. 1 )L3WAYDONE = .TRUE. IF ( IWARM_START .EQ. 1 )LWARM_START = .TRUE. IF ( IBINARY .EQ. 1 )LBINARY = .TRUE. ENDIF WRITE(6,*)' receive data OK: ',idproc CALL FLUSH_UNIT(6) RETURN END C END RECEIVE_DATA_FROM_HOST C...................................................................... C...................................................................... C SUB REPORTPIECES SUBROUTINE REPORTPIECES PARAMETER (MXPIECES= 50) COMMON/CPIECE/IPRESPIE(2,2,MXPIECES),NPIECES,NRESPIE(2), + NATMPIE(2) CALL CHECKRANGE(NPIECES,1,MXPIECES,'NPIECES ','REPORTPIEC') C IPRESPIE(1/2, molA/molB, IPIECE) WRITE(6,*)'------- you chose ---------' WRITE(6,'(I10,A10)') NPIECES,' pieces ' WRITE(6,*)'---------------------------' WRITE(6,*)' mol A mol B' WRITE(6,*)' from...to from...to ' WRITE(6,*)'----------------------------' DO IPIECE=1,NPIECES WRITE(6,'(I3,1x,2I5,5X,2I5)') IPIECE, + ( (IPRESPIE(I,M,IPIECE),I=1,2), M=1,2) C FOR IPIECE=1,NPIECES ENDDO WRITE(6,*)'----------------------------' RETURN END C END REPORTPIECES C...................................................................... C...................................................................... C SUB RightADJUST SUBROUTINE RIGHTADJUST(STRING,NDIM,NLEN) C right-adjust of astring CHARACTER*(*) STRING INTEGER NDIM, NLEN, l,il C find position of last non-blank IF (NDIM.LT.1.OR.NLEN.LT.1) RETURN IF (NDIM .gt. 1 ) STOP' update routine rightadjust' L=NLEN DO WHILE(STRING(L:L) .EQ. ' ' .AND. L .GT. 1) L=L-1 ENDDO IF (L .LT. NLEN) THEN C L is position of last non-blank STRING(NLEN-L+1:NLEN)=STRING(1:L) C fill rest with blanks from 1 to NLEN-L DO IL=1,NLEN-L STRING(IL:IL)=' ' ENDDO ENDIF c DO I=1,NDIM ! for each string c L=NLEN c DO WHILE(STRINGS(I)(L:L).EQ.' '.AND.L.GT.1) c L=L-1 c ENDDO c IF (L.LT.NLEN) THEN C L is position of last non-blank c STRINGS(I)(NLEN-L+1:NLEN)=STRINGS(I)(1:L) C fill rest with blanks from 1 to NLEN-L c DO IL=1,NLEN-L c STRINGS(I)(IL:IL)=' ' c ENDDO c ENDIF c ENDDO RETURN END C END RightADJUST C...................................................................... C...................................................................... C SUB S3TOS1 SUBROUTINE S3TOS1(SEQ3,SEQ1,NRES) C TRANSLATES A3 TO A1 AND VICE VERSA. CHRIS SANDER MAY 1983 C INPUT/OUTPUT CHARACTER SEQ3(*)*3,SEQ1(*)*1 INTEGER NRES C LOCAL CHARACTER AA3(24)*3, AA1(24)*1 DATA AA3/'GLY','PRO','ASP','GLU','ALA','ASN','GLN','SER', + 'THR','LYS','ARG','HIS','VAL','ILE','MET','CYS', + 'LEU','PHE','TYR','TRP','ASX','GLX','---','!!!'/ DATA AA1/'G','P','D','E','A','N','Q','S','T','K', + 'R','H','V','I','M','C','L','F','Y','W','B','Z','-','!'/ C 'X' OR 'XYZ' FOR NON-STANDARD OR UNKNOWN AMINO ACID RESIDUES DO I=1,NRES DO J=1,24 CD WRITE(6,*)'S3TOS1: ',SEQ3(I),I,' =?= ',AA3(J),J IF (SEQ3(I).EQ.AA3(J)) THEN SEQ1(I)=AA1(J) GOTO 9 ENDIF ENDDO SEQ1(I)='X' WRITE(6,100) SEQ3(I),SEQ1(I) WRITE(6,*)' legal residues are: ' WRITE(6,*) (AA3(J),J=1,24) 9 CONTINUE ENDDO 100 FORMAT(' UNUSUAL RESIDUE NAME <',A3,'> TRANSLATED TO <',A1,'>') C c ENTRY S1TOS3(SEQ3,SEQ1,NRES) c DO I=1,NRES c DO J=1,24 c IF (SEQ1(I).EQ.AA1(J)) THEN c SEQ3(I)=AA3(J) c GOTO 99 c ENDIF c ENDDO c SEQ3(I)='XYZ' c WRITE(6,100) SEQ1(I),SEQ3(I) c99 CONTINUE c ENDDO RETURN END C END S3TOS1 C...................................................................... C...................................................................... C SUB S1TOS3 SUBROUTINE S1TOS3(SEQ3,SEQ1,NRES) C TRANSLATES A3 TO A1 AND VICE VERSA. CHRIS SANDER MAY 1983 C INPUT/OUTPUT CHARACTER SEQ3(*)*3,SEQ1(*)*1 INTEGER NRES C LOCAL CHARACTER AA3(24)*3, AA1(24)*1 DATA AA3/'GLY','PRO','ASP','GLU','ALA','ASN','GLN','SER', + 'THR','LYS','ARG','HIS','VAL','ILE','MET','CYS', + 'LEU','PHE','TYR','TRP','ASX','GLX','---','!!!'/ DATA AA1/'G','P','D','E','A','N','Q','S','T','K', + 'R','H','V','I','M','C','L','F','Y','W','B','Z','-','!'/ C 'X' OR 'XYZ' FOR NON-STANDARD OR UNKNOWN AMINO ACID RESIDUES DO I=1,NRES DO J=1,24 IF (SEQ1(I).EQ.AA1(J)) THEN SEQ3(I)=AA3(J) GOTO 99 ENDIF ENDDO SEQ3(I)='XYZ' WRITE(6,100) SEQ1(I),SEQ3(I) 99 CONTINUE ENDDO 100 FORMAT(' UNUSUAL RESIDUE NAME <',A3,'> TRANSLATED TO <',A1,'>') RETURN END C END S1TOS3 C...................................................................... C...................................................................... C SUB SCALE_PROFILE_METRIC SUBROUTINE SCALE_PROFILE_METRIC(MAXRES,NTRANS,TRANS, + PROFILEMETRIC,SMIN,SMAX,MAPLOW,MAPHIGH) C======================================================================= C scale profile metric according to SMIN,SMAX,MAPLOW,MAPHIGH C profilemetric is sim(maxres,26) C======================================================================= IMPLICIT NONE INTEGER MAXRES,NTRANS REAL PROFILEMETRIC(MAXRES,NTRANS) REAL SMIN,SMAX,MAPLOW,MAPHIGH CHARACTER*(*) TRANS C internal INTEGER NN,I,J,K,L,M NN=MAXRES*NTRANS C======================================================================= C reset value for chain breaks etc... C add 'X' '!' and "-" J=INDEX(TRANS,'X') K=INDEX(TRANS,'!') L=INDEX(TRANS,'-') M=INDEX(TRANS,'.') IF (J.EQ.0 .OR. K.EQ.0 .OR. L.EQ.0 .or. M.eq. 0) THEN WRITE(6,*)'*** ERROR: "X","!","-" or "." unknown in '// + 'SCALE_PROFILE_METRIC' ENDIF DO I=1,MAXRES PROFILEMETRIC(I,J)=0.0 PROFILEMETRIC(I,K)=0.0 PROFILEMETRIC(I,L)=0.0 PROFILEMETRIC(I,M)=0.0 ENDDO CALL SCALEINTERVAL(PROFILEMETRIC,NN,SMIN,SMAX,MAPLOW,MAPHIGH) C======================================================================= C reset value for chain breaks etc... C add 'X' '!' and "-" J=INDEX(TRANS,'X') K=INDEX(TRANS,'!') L=INDEX(TRANS,'-') M=INDEX(TRANS,'.') IF (J.EQ.0 .OR. K.EQ.0 .OR. L.EQ.0 .or. M.eq. 0) THEN WRITE(6,*)'*** ERROR: "X","!","-" or "." unknown in '// + 'SCALE_PROFILE_METRIC' ENDIF DO I=1,MAXRES PROFILEMETRIC(I,J)=0.0 PROFILEMETRIC(I,K)=-200.0 PROFILEMETRIC(I,L)=0.0 PROFILEMETRIC(I,M)=0.0 ENDDO C======================================================================= C DEBUG: WRITE MATRIX IN OUTPUT-FILE C======================================================================= c OPEN(99,FILE='METRIC_DEBUG.X',STATUS='NEW',RECL=500) c DO I=1,50 c WRITE(99,'(1X,26(F7.2))')(PROFILEMETRIC(I,J),J=1,NTRANS) c ENDDO c CLOSE(99) C======================================================================= RETURN END C END SCALE_PROFILE_METRIC C...................................................................... C...................................................................... C SUB SCALEMETRIC SUBROUTINE SCALEMETRIC(NTRANS,TRANS,MAXSTRSTATES, + MAXIOSTATES,SIMMETRIC,SMIN,SMAX,MAPLOW,MAPHIGH) C======================================================================= C scale matrix according to SMIN,SMAX,MAPLOW,MAPHIGH C======================================================================= IMPLICIT NONE INTEGER NTRANS,MAXSTRSTATES,MAXIOSTATES REAL SIMMETRIC(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) REAL SMIN,SMAX,MAPLOW,MAPHIGH CHARACTER*(*) TRANS C internal INTEGER NN,I,J,istr1,io1,istr2,io2 NN= (NTRANS * NTRANS) * (MAXSTRSTATES * MAXSTRSTATES) * + (MAXIOSTATES * MAXIOSTATES) CALL SCALEINTERVAL(SIMMETRIC,NN,SMIN,SMAX,MAPLOW,MAPHIGH) C======================================================================= C reset value for chain breaks etc... C add 'X' I=INDEX(TRANS,'X') IF (I.EQ.0) THEN WRITE(6,*)'*** ERROR: "X" unknown in SCALEMETRIC' STOP ENDIF DO J=1,NTRANS DO ISTR1=1,MAXSTRSTATES DO IO1=1,MAXIOSTATES DO ISTR2=1,MAXSTRSTATES DO IO2=1,MAXIOSTATES SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO C add '!' I=INDEX(TRANS,'!') IF (I.EQ.0) THEN WRITE(6,*)'*** ERROR: "!" unknown in SCALEMETRIC' STOP ENDIF DO J=1,NTRANS DO ISTR1=1,MAXSTRSTATES DO IO1=1,MAXIOSTATES DO ISTR2=1,MAXSTRSTATES DO IO2=1,MAXIOSTATES SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO C add '-' I=INDEX(TRANS,'-') IF (I.EQ.0) THEN WRITE(6,*)'*** ERROR: "-" unknown in SCALEMETRIC' STOP ENDIF DO J=1,NTRANS DO ISTR1=1,MAXSTRSTATES DO IO1=1,MAXIOSTATES DO ISTR2=1,MAXSTRSTATES DO IO2=1,MAXIOSTATES SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO C add '.' c I=INDEX(TRANS,'.') c IF (I.EQ.0) THEN c WRITE(6,*)'*** ERROR: "." unknown in SCALEMETRIC' c STOP c ENDIF c DO J=1,NTRANS c DO istr1=1,MAXSTRSTATES c DO io1=1,MAXIOSTATES c DO istr2=1,MAXSTRSTATES c DO io2=1,MAXIOSTATES c SIMMETRIC(I,J,istr1,io1,istr2,io2)=0.0 c SIMMETRIC(j,i,istr1,io1,istr2,io2)=0.0 c enddo c enddo c ENDDO c ENDDO c ENDDO C======================================================================= C DEBUG: WRITE MATRIX IN OUTPUT-FILE C======================================================================= c open(99,file='METRIC_DEBUG.X',status='NEW') c istr1=1 c io1=1 c istr2=1 c io2=1 c do i=1,ntrans c do istr1=1,maxstrstates c do io1=1,maxiostates c do istr2=1,maxstrstates c do io2=1,maxiostates c write(99,'(1x,a1,4(1x,i3),5x,26(f6.2))') c + trans(i:i),istr1,io1,istr2,io2, c + (simmetric(i,j,istr1,io1,istr2,io2),j=1,ntrans) c enddo c enddo c enddo c enddo c enddo c close(99) C======================================================================= RETURN END C END SCALEMETRIC C...................................................................... C...................................................................... C SUB SCALEINTERVAL SUBROUTINE SCALEINTERVAL(S,N,SMIN,SMAX,MAPLOW,MAPHIGH) C imported: old values in S(1..N) C maplow and maphigh C target limits SMAX, SMIN C C exported: new values in S(1..N) C internal: SHI, SLO C SHI.........*.........SLO map this interval onto C SMAX...*...SMIN this interval or C MAPLOW MAPHIGH C REAL S(*),MAPLOW,MAPHIGH,SMIN,SMAX,SHI,SLO SHI=-1.0E+10 SLO=1.0E+10 C IF (SMIN.EQ.0.0 .AND. SMAX.EQ.0.0 .AND. + MAPLOW.EQ.0.0 .AND. MAPHIGH.EQ.0.0) THEN WRITE(6,*)' SCALEINTERVAL: NO SCALING ' RETURN ENDIF IF (MAPLOW.EQ.0.0 .AND. MAPHIGH.EQ.0.0) THEN c WRITE(6,*)' SCALEINTERVAL: scale between SMIN/SMAX' DO I=1,N IF (S(I) .GT. SHI)SHI=S(I) IF (S(I) .LT. SLO)SLO=S(I) ENDDO ELSE WRITE(6,*)' SCALEINTERVAL: scale between MAPLOW/MAPHIGH' SHI=MAPHIGH SLO=MAPLOW ENDIF c WRITE(6,*)'high/low: ',shi,slo,n,(SHI-SLO),(SMAX-SMIN)+SMIN DO I=1,N S(I)=((S(I)-SLO)/(SHI-SLO))*(SMAX-SMIN)+SMIN ENDDO c WRITE(6,'(20F5.2)')(S(I),I=1,N) RETURN END C END SCALEINTERVAL C...................................................................... C...................................................................... C SUB SECSTRUC_TO_3_STATE SUBROUTINE SECSTRUC_TO_3_STATE(SECSTRUC,CLASS,ICLASS) C convert DSSP-secondary structure symbol to 3-state (L,H,E) secondary C structure C given SECSTRUC, what is the class number ICLASS and class C representative CLASS ? C undefined states is set CLASS='U', ICLASS=0 C C input CHARACTER SECSTRUC C output CHARACTER CLASS INTEGER ICLASS C internal c INTEGER MAXSTRSTATES c PARAMETER (MAXSTRSTATES=3) CHARACTER*25 STATES c 1234567890123456789012345 STATES='L TCSltcsEBAPMebapmHGIhgi' c DATA STATES/'L TCStclss','EBAPMebapm','HGIhgiiiii'/ c CHARACTER STATES(MAXSTRSTATES)*10 C====================================================================== ICLASS=0 CLASS='U' I=INDEX(STATES,SECSTRUC) IF (I .NE. 0) THEN IF (I .LE. 9) THEN ICLASS=1 CLASS='L' RETURN ELSE IF (I .GE. 10 .AND. I .LE. 19) THEN ICLASS=10 CLASS='E' RETURN ELSE IF (I .GE. 20 .AND. I .LE. 25) THEN ICLASS=20 CLASS='H' RETURN ENDIF ENDIF c DO K=1,MAXSTRSTATES c IF (INDEX(STATES(K),SECSTRUC).NE.0) THEN c ICLASS=K c CLASS=STATES(K)(1:1) c RETURN c ENDIF c ENDDO RETURN END C END SECSTRUC_TO_3_STATE C...................................................................... C...................................................................... C SUB SELECT_PDB_POINTER SUBROUTINE SELECT_PDB_POINTER(KUNIT,DSSP_PATH,PDBIN,PDBOUT) C selects from a string returned from GETSEQ one pdb-pointer for HSSP C the selection is done by a "best-guess": C C 1.) check if there is a valid DSSP-file C if so, take the latest entry in PDB C 2.) if not 1 then check if it is a C-alpha set C 3.) if not 2 then check if it is a model-structure C C INPUT: pdbin C 1INS; 15-JAN-91 | 2INS; 15-JAN-91 | 3INS; 20-OCT-92 || 3 C or 1NSB; PRELIMINARY. C OUTPUT: pdbout C 1INS if "normal" DSSP-file or C 1INS_C if c-alpha only or C 1INS_M if model structure or C 1INS_P if pre-released structure or C 1INS_? if none of the above cases, like the SwissProt pointer C is pointing to a PDB-file which is gone (renamed) in the C current version of PDB OR C if something is wrong with the "normal" DSSP-file IMPLICIT NONE C input: CHARACTER*(*) PDBIN,DSSP_PATH INTEGER KUNIT C output: CHARACTER*(*) PDBOUT C internal INTEGER MAXPOINTER PARAMETER (MAXPOINTER= 200) INTEGER NPOINTER,SORTNUMBER(MAXPOINTER),IHIGH, + IDSSP_FLAG,ISTART,ISTOP,IPOS,JPOS,IPOINTER, + JPOINTER,KPOINTER,NEXTPOS,IMONTH,IYEAR CHARACTER*12 PDBPOINTER(MAXPOINTER) CHARACTER CTEMP*50,FILENAME*200 CHARACTER CMONTH*36 LOGICAL LERROR C used to convert entry date to sort number DATA CMONTH /'JANFEBMARAPRMAIJUNJULAUGSEPOCTNOVDEC'/ C init *----------------------------------------------------------------------* PDBOUT=' ' CTEMP=' ' IF (PDBIN.EQ.' ')RETURN C extract number of pointers IPOS=INDEX(PDBIN,'||') IF (IPOS .NE. 0) THEN CALL STRPOS(PDBIN,ISTART,ISTOP) CALL READ_INT_FROM_STRING(PDBIN(IPOS+2:ISTOP),NPOINTER) ELSE RETURN ENDIF IF (NPOINTER .LE. 0)RETURN C loop over pdb-pointers IPOS=1 IF (NPOINTER .GT. MAXPOINTER) THEN WRITE(6,*)' SELECT_PDB_POINTER: npointer .gt. maxpointer' WRITE(6,*)' set npointer to maxpointer' NPOINTER= MAXPOINTER ENDIF DO IPOINTER=1,NPOINTER SORTNUMBER(IPOINTER)=0 CTEMP=' ' NEXTPOS=INDEX(PDBIN(IPOS:),'|')+IPOS-1 CTEMP(1:)=PDBIN(IPOS:NEXTPOS-1) JPOS=INDEX(CTEMP,';') PDBPOINTER(IPOINTER)=CTEMP(1:JPOS-1) C extract month and year of pdb entry IF (INDEX(CTEMP,'PRELIM') .EQ. 0) THEN JPOS=INDEX(CTEMP,'-') IMONTH= ( (INDEX(CMONTH,CTEMP(JPOS+1:JPOS+4) )) / 3 )+1 CALL READ_INT_FROM_STRING(CTEMP(JPOS+5:JPOS+6),IYEAR) C build up a sort number C latest entry has largest number: 199201= JAN 1992 C with beginning of the year 2080 or so we have to add a line here :-) IF (IYEAR .GT. 0) THEN SORTNUMBER(IPOINTER)=10000*19 + 100*IYEAR + IMONTH ELSE SORTNUMBER(IPOINTER)=10000*20 + 100*IYEAR + IMONTH ENDIF ENDIF C set line pointer to next entry IPOS=NEXTPOS+1 ENDDO DO JPOINTER=1,NPOINTER IPOINTER=-1 IHIGH=-1 DO KPOINTER=1,NPOINTER IF (SORTNUMBER(KPOINTER) .GE. IHIGH) THEN IHIGH=SORTNUMBER(KPOINTER) IPOINTER=KPOINTER ENDIF ENDDO SORTNUMBER(IPOINTER)=-99 CALL UPTOLOW(PDBPOINTER(IPOINTER),LEN(PDBPOINTER(IPOINTER)) ) C LOOK IF THERE IS A "NORMAL" DSSP-FILE IDSSP_FLAG=4 CALL CONCAT_3STRINGS(DSSP_PATH,PDBPOINTER(IPOINTER),'.dssp', + FILENAME) CALL OPEN_FILE(KUNIT,FILENAME,'old,readonly,silent',lerror) IF (LERROR)GOTO 10 C check if there is something in the file CTEMP=' ' DO WHILE(INDEX(CTEMP,'# RES') .EQ. 0) READ(KUNIT,'(A10)',END=10,ERR=10)CTEMP ENDDO IDSSP_FLAG=0 10 CALL CLOSE_FILE(KUNIT,FILENAME) IF (.NOT. LERROR) GOTO 100 C look if there is C-alpha model set CALL CONCAT_3STRINGS(DSSP_PATH,PDBPOINTER(IPOINTER), + '.dssp_ca',filename) CALL OPEN_FILE(KUNIT,FILENAME,'old,readonly,silent',lerror) CALL CLOSE_FILE(KUNIT,FILENAME) IF (.NOT. LERROR) THEN IDSSP_FLAG=1 GOTO 100 ENDIF C look if there is a model-structure CALL CONCAT_3STRINGS(DSSP_PATH,PDBPOINTER(IPOINTER), + '.dssp_mod',filename) CALL OPEN_FILE(KUNIT,FILENAME,'old,readonly,silent',lerror) CALL CLOSE_FILE(KUNIT,FILENAME) IF (.NOT. LERROR) THEN IDSSP_FLAG=2 GOTO 100 ENDIF C look if there is a pre-released structure CALL CONCAT_3STRINGS(DSSP_PATH,PDBPOINTER(IPOINTER), + '.dssp_pre',filename) CALL OPEN_FILE(KUNIT,FILENAME,'old,readonly,silent',lerror) CALL CLOSE_FILE(KUNIT,FILENAME) IF (.NOT. LERROR) THEN IDSSP_FLAG=3 GOTO 100 ENDIF C set pdbpointer-extension according to selection 100 CALL STRPOS(PDBPOINTER(JPOINTER),ISTART,ISTOP) IF ( IDSSP_FLAG .EQ. 0) THEN PDBOUT=PDBPOINTER(IPOINTER)(ISTART:ISTOP) GOTO 200 ELSE IF ( IDSSP_FLAG .EQ. 1) THEN PDBOUT=PDBPOINTER(IPOINTER)(ISTART:ISTOP)//'_C' ELSE IF ( IDSSP_FLAG .EQ. 2) THEN PDBOUT=PDBPOINTER(IPOINTER)(ISTART:ISTOP)//'_M' ELSE IF ( IDSSP_FLAG .EQ. 3) THEN PDBOUT=PDBPOINTER(IPOINTER)(ISTART:ISTOP)//'_P' ELSE IF ( IDSSP_FLAG .EQ. 4) THEN PDBOUT=PDBPOINTER(IPOINTER)(ISTART:ISTOP)//'_?' ENDIF ENDDO 200 CALL LOWTOUP(PDBOUT,LEN(PDBOUT) ) C RETURN END C END SELECT_PDB_POINTER C...................................................................... C...................................................................... C SUB SELECT_UNIQUE_CHAIN SUBROUTINE SELECT_UNIQUE_CHAIN(KFILE,FILENAME,OUTNAME) C selects unique chains from dssp file, builds up a new filename of the C form: $pdb:4hhb.dssp_!_A,B IMPLICIT NONE INTEGER KFILE CHARACTER*(*) FILENAME,OUTNAME C internal INTEGER MAXRES_LOC PARAMETER (MAXRES_LOC= 9999) C PARAMETER (MAXRES_LOC= 10000) C PARAMETER (MAXRES_LOC= 30011) INTEGER NRES,NCHAIN CHARACTER CRESID(MAXRES_LOC) C CHARACTER*6 CRESID(MAXRES_LOC) CHARACTER CSEQ(MAXRES_LOC) C CHARACTER TRANS*26 INTEGER MAXCHAIN PARAMETER (MAXCHAIN=100) INTEGER IBREAK,IBREAKPOS(0:MAXCHAIN),I,J,ICHAIN,JCHAIN INTEGER ISTART,ISTOP CHARACTER CHAINID(0:MAXCHAIN) CHARACTER CTEMP*100 LOGICAL LALL,LSAME(MAXCHAIN,MAXCHAIN),LTAKE(MAXCHAIN) LOGICAL LERROR C CHARACTER LOWER*26 CHARACTER LINE*(1000) C DONT USE INDEX COMMAND (CPU TIME) C INTEGER NASCII C PARAMETER (NASCII=256) C INTEGER TRANSPOS(NASCII) c init LINE=' ' NRES=1 NCHAIN=1 c lower='abcdefghijklmnopqrstuvwxyz' c TRANS='VLIMFWYGAPSTCHRKQENDBZX!-.' IBREAK=0 IBREAKPOS(0)=0 CHAINID(0)='?' OUTNAME=' ' DO I=1,MAXCHAIN IBREAKPOS(I)=0 CHAINID(I)=' ' LTAKE(I)=.TRUE. DO J=1,MAXCHAIN LSAME(I,J)=.TRUE. ENDDO ENDDO DO I=1,MAXRES_LOC CSEQ(I)=' ' ENDDO CALL OPEN_FILE(KFILE,FILENAME,'readonly,old',LERROR) C READ FROM DSSP READ(KFILE,'(A)',END=199)LINE IF (INDEX(LINE,'SECONDARY').EQ.0) THEN WRITE(6,*)'***select_unique... error: dssp file assumed, ' WRITE(6,*)' but word /secondary/ is missing in first line' RETURN ENDIF c repeat until # 105 READ(KFILE,'(A)',END=199)LINE IF (INDEX(LINE(1:5),'#').EQ.0) GOTO 105 C C23456123451x1 C23456789011x1 Ccccccaaaaaaca C 9 9 A S C 21 21 Y C DO WHILE (.TRUE.) IF (NRES .LE. MAXRES_LOC) THEN READ(KFILE,'(11X,A1,1X,A1)',END=900)CRESID(NRES),CSEQ(NRES) c read(kfile,'(6x,a6,1x,a1)',end=900)cresid(nres),cseq(nres) c convert ss-bridges to 'c'.... c if (index(lower,cseq(nres)) .ne. 0) cseq(nres)='C' IF (CSEQ(NRES) .EQ. '!') THEN NCHAIN=NCHAIN+1 ENDIF c illegal residues c call getindex(cseq(nres),transpos,i) c if (i .le. 0) then c WRITE(6,'(a,a)')'*** seq unknown: ',cseq(nres) c ENDIF NRES=NRES+1 c dimension overflow ELSE WRITE(6,'(a)')'*** error: dimension overflow MAXRES_LOC ***' WRITE(6,*)'truncated to ',nres,' residues' GOTO 900 ENDIF c next line ENDDO C--------------DSSP read error ----------------------------------- 199 WRITE(6,*)'*** incomplete dssp file (eof) ' NRES=0 NCHAIN=0 WRITE(6,*) 'file: ',filename(1:40) CLOSE(KFILE) RETURN c finished reading----------------------- 900 CLOSE(KFILE) NRES=NRES-1 IF (NCHAIN .EQ. 1)RETURN DO I=1,NRES IF (CSEQ(I) .EQ. '!') THEN IBREAK=IBREAK+1 IBREAKPOS(IBREAK)=I ENDIF ENDDO IBREAK=IBREAK+1 CHAINID(1)=CRESID(1) c chainid(1)=cresid(1)(6:6) DO I=1,IBREAK CHAINID(I+1)=CRESID(IBREAKPOS(I)+1) c chainid(i+1)=cresid(ibreakpos(i)+1)(6:6) ENDDO IBREAKPOS(IBREAK)=NRES+1 DO ICHAIN=1,IBREAK-1 DO JCHAIN=ICHAIN+1,IBREAK IF (IBREAKPOS(ICHAIN)-IBREAKPOS(ICHAIN-1)-1 .EQ. + IBREAKPOS(JCHAIN)-IBREAKPOS(JCHAIN-1)-1 ) THEN J=IBREAKPOS(JCHAIN-1) DO I=IBREAKPOS(ICHAIN-1)+1,IBREAKPOS(ICHAIN)-1 J=J+1 IF (CSEQ(I) .NE. CSEQ(J)) THEN LSAME(ICHAIN,JCHAIN)=.FALSE. LSAME(JCHAIN,ICHAIN)=.FALSE. GOTO 50 ENDIF ENDDO 50 CONTINUE ELSE LSAME(ICHAIN,JCHAIN)=.FALSE. LSAME(JCHAIN,ICHAIN)=.FALSE. ENDIF ENDDO ENDDO DO I=1,NCHAIN-1 IF ( LTAKE(I) ) THEN DO J=I+1,NCHAIN IF (LSAME(I,J)) THEN LTAKE(J)=.FALSE. ENDIF ENDDO ENDIF ENDDO LALL=.TRUE. DO I=1,NCHAIN IF (LALL) THEN IF ( .NOT. LTAKE(I))LALL=.FALSE. ENDIF ENDDO CTEMP=' ' CALL STRPOS(FILENAME,ISTART,ISTOP) CTEMP=FILENAME(ISTART:ISTOP)//'_!_' DO I=1,NCHAIN IF (LTAKE(I)) THEN CALL STRPOS(CTEMP,ISTART,ISTOP) IF (CHAINID(I-1) .NE. CHAINID(I)) THEN WRITE(CTEMP(ISTOP+1:),'(A,A)')CHAINID(I),',' ENDIF ENDIF ENDDO CALL STRPOS(CTEMP,ISTART,ISTOP) IF (CTEMP(ISTOP:ISTOP) .EQ. ',') THEN CTEMP(ISTOP:ISTOP)=' ' ENDIF OUTNAME=' ' C IN CASE OF "ARTIFICIAL" CHAIN-BREAKS THE END IS EMPTY CALL STRPOS(CTEMP,ISTART,ISTOP) IF (CTEMP(ISTOP-2:ISTOP) .EQ. '_!_') THEN CALL STRPOS(FILENAME,ISTART,ISTOP) OUTNAME(1:)=FILENAME(ISTART:ISTOP) ELSE WRITE(OUTNAME(1:),'(A)')CTEMP(ISTART:ISTOP) ENDIF WRITE(6,*)'select_unique: ',outname(1:60) RETURN END C END SELECT_UNIQUE_CHAIN c$$$ C SUB SELECT_UNIQUE_CHAIN c$$$ subroutine select_unique_chain(kfile,filename,outname) c$$$C selects unique chains from dssp file, and builds up a new filename of the c$$$C form: $pdb:4hhb.dssp_!_A,B c$$$ c$$$ implicit none c$$$ c$$$ integer kfile c$$$cx character*80 filename,outname c$$$ character*(*) filename,outname c$$$C internal c$$$ integer MAXSQ c$$$ parameter (MAXSQ=4500) c$$$ integer nres,lacc(MAXSQ),iop,ntrans,kchain,nchain c$$$ integer ipdbno(MAXSQ) c$$$ character*6 cresid(MAXSQ) c$$$ character cseq(MAXSQ),struc(MAXSQ) c$$$ character*80 compound c$$$ character*12 ACCESSION,pdbref c$$$ character trans*26,cchain c$$$ character chains*26 c$$$ logical ldssp c$$$ c$$$ integer maxchain c$$$ parameter (maxchain=30) c$$$ integer ibreak,ibreakpos(0:maxchain),i,j,ichain,jchain c$$$ integer istart,istop c$$$ character chainid(0:maxchain) c$$$ character ctemp*100 c$$$ logical lall,lsame(maxchain,maxchain),ltake(maxchain) c$$$ logical ltruncated,lerror c$$$C init c$$$ ntrans=26 c$$$ TRANS='VLIMFWYGAPSTCHRKQENDBZX!-.' c$$$ iop=0 c$$$ ibreak=0 c$$$ ibreakpos(0)=0 c$$$ chainid(0)='?' c$$$ do i=1,maxchain c$$$ ibreakpos(i)=0 c$$$ chainid(i)=' ' c$$$ ltake(i)=.true. c$$$ do j=1,maxchain c$$$ lsame(i,j)=.true. c$$$ enddo c$$$ enddo c$$$C all chains wanted from DSSP data set c$$$ kchain=0 c$$$ chains=' ' c$$$ c$$$ c$$$ c$$$c call getseq(kfile,MAXSQ,nres,cresid,cseq,struc, c$$$c + lacc,ldssp,filename,compound,ACCESSION,pdbref,iop,trans, c$$$c + ntrans,kchain,nchain,cchain) c$$$ c$$$ if (nchain .eq. 1)return c$$$ do i=1,nres c$$$ if (cseq(i) .eq. '!') then c$$$ ibreak=ibreak+1 c$$$ ibreakpos(ibreak)=i c$$$ endif c$$$ enddo c$$$ ibreak=ibreak+1 c$$$ chainid(1)=cresid(1)(6:6) c$$$ do i=1,ibreak c$$$ chainid(i+1)=cresid(ibreakpos(i)+1)(6:6) c$$$ enddo c$$$ ibreakpos(ibreak)=nres+1 c$$$ do ichain=1,ibreak-1 c$$$ do jchain=ichain+1,ibreak c$$$ if (ibreakpos(ichain)-ibreakpos(ichain-1)-1 .eq. c$$$ + ibreakpos(jchain)-ibreakpos(jchain-1)-1 ) then c$$$ j=ibreakpos(jchain-1) c$$$ do i=ibreakpos(ichain-1)+1,ibreakpos(ichain)-1 c$$$ j=j+1 c$$$ if (cseq(i) .ne. cseq(j)) then c$$$ lsame(ichain,jchain)=.false. c$$$ lsame(jchain,ichain)=.false. c$$$ GOTO 50 c$$$ endif c$$$ enddo c$$$50 continue c$$$ else c$$$ lsame(ichain,jchain)=.false. c$$$ lsame(jchain,ichain)=.false. c$$$ endif c$$$ enddo c$$$ enddo c$$$ c$$$ do i=1,nchain-1 c$$$ if ( ltake(i) ) then c$$$ do j=i+1,nchain c$$$ if (lsame(i,j)) then c$$$ ltake(j)=.false. c$$$ endif c$$$ ENDDO c$$$ endif c$$$ enddo c$$$ c$$$ lall=.true. c$$$ do i=1,nchain c$$$ if (lall) then c$$$ if ( .not. ltake(i))lall=.false. c$$$ endif c$$$ enddo c$$$ c$$$ ctemp=' ' c$$$ call strpos(filename,istart,istop) c$$$ ctemp=filename(istart:istop)//'_!_' c$$$ do i=1,nchain c$$$ if (ltake(i)) then c$$$ call strpos(ctemp,istart,istop) c$$$ if (chainid(i-1) .ne. chainid(i)) then c$$$ write(ctemp(istop+1:),'(a,a)')chainid(i),',' c$$$ endif c$$$ endif c$$$ enddo c$$$ call strpos(ctemp,istart,istop) c$$$ if (ctemp(istop:istop) .eq. ',') then c$$$ ctemp(istop:istop)=' ' c$$$ endif c$$$ outname=' ' c$$$ write(outname(1:),'(a)')ctemp(istart:istop) c$$$ WRITE(6,*)outname c$$$ c$$$ return c$$$ end c$$$ C END SELECT_UNIQUE_CHAIN C...................................................................... C...................................................................... C SUB SEND_DATA_TO_NODE C send start signal and all data to workers C they have to wait until they received all information SUBROUTINE SEND_DATA_TO_NODE IMPLICIT NONE C import INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C internal C ISIZE is dummy variable; otherwise we have to pass a parameter , which C gets defined as a variable in the subroutines (not clear what happens) INTEGER ILINK cc integer iworker,iset,ilink,isize C C link=-1 means send to everybody C c if (mp_model .eq. 'PARIX') then c msgtype=idtop c if (lsmall_machine) then c do iworker=1,nworker c ilink= link(iworker) c call send_maxhom_data(ilink) c enddo c else c do iset=1,nworkset c ilink=sender_node(iset) c call send_maxhom_data(ilink) c ilink=receiver_node(iset) c isize=len(corepath) c call mp_put_string(msgtype,ilink,corepath,isize) c isize=len(corefile) c call mp_put_string(msgtype,ilink,corefile,isize) c enddo c endif c else if (mp_model .eq. 'DELTA') then c call mp_init_send() ; ilink=-1 c call send_maxhom_data(ilink) c else if (mp_model .eq. 'PVM3') then CALL MP_INIT_SEND() ILINK=-1 CALL SEND_MAXHOM_DATA(ILINK) c else if (mp_model .eq. 'PVM') then c call mp_init_send() ; ilink=-1 c call send_maxhom_data(ilink) c endif WRITE(6,*)' send init data finished' CALL FLUSH_UNIT(6) RETURN END C END SEND_DATA_TO_NODE C...................................................................... C...................................................................... C SUB SEND_MAXHOM_DATA SUBROUTINE SEND_MAXHOM_DATA(ILINK) C import INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' INTEGER ILINK C internal CHARACTER CPARSYTEC_BUG*(MAXSQ) INTEGER ISIZE,I INTEGER ILBACKWARD,ILINSERT_2,ILISTOFSEQ_2,ILSHOW_SAMESEQ, + ILSWISSBASE,ILDSSP_1,ILCONSERV_1,ILCONSERV_2, + ILCONSIMPORT,ILALL,ILFORMULA,ILTHRESHOLD, + ILCOMPSTR,ILPASS2,ILTRACE,ILONG_OUT,ILBATCH, + I3WAY,I3WAYDONE,IWARM_START,IBINARY c integer ilmixed_arch C init logicals C NOTE: LOGICALS are sent in an integer variable C on some machines its not clear what happens if one snets C logicals as integers ILBACKWARD=0 ILINSERT_2=0 ILISTOFSEQ_2=0 ILSHOW_SAMESEQ=0 ILSWISSBASE=0 ILDSSP_1=0 ILCONSERV_1=0 ILCONSERV_2=0 ILCONSIMPORT=0 ILALL=0 ILFORMULA=0 ILTHRESHOLD=0 ILCOMPSTR=0 ILPASS2=0 ILTRACE=0 ILONG_OUT=0 ILBATCH=0 I3WAY=0 I3WAYDONE=0 IWARM_START=0 IBINARY=0 c ilmixed_arch=0 MSGTYPE=1 CALL MP_PUT_INT4(MSGTYPE,ILINK,ID_HOST,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,N1,N_ONE) IF (N1 .GT. 0) THEN ISIZE=N1 CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,LSQ_1,ISIZE) CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,LSTRUC_1,ISIZE) CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,LSTRCLASS_1,ISIZE) CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,LACC_1,ISIZE) ISIZE=MAXBREAK CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,IBREAKPOS_1,ISIZE) CALL MP_PUT_INT4(MSGTYPE,ILINK,NBREAK_1,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,NBEST,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,IPROFBEG,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,IPROFEND,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,PROFILEMODE,N_ONE) ISIZE=NASCII CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,TRANSPOS,ISIZE) ISIZE=MAXCUTOFFSTEPS CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,ISOLEN,ISIZE) CALL MP_PUT_INT4(MSGTYPE,ILINK,NSTEP,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ISAFE,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,NSTRSTATES_1,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,NSTRSTATES_2,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,NIOSTATES_1,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,NIOSTATES_2,N_ONE) ISIZE=N1 CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,PDBNO_1,ISIZE) ISIZE=MAXCUTOFFSTEPS CALL MP_PUT_REAL4_ARRAY(MSGTYPE,ILINK,ISOIDE,ISIZE) ISIZE=N1 CALL MP_PUT_REAL4_ARRAY(MSGTYPE,ILINK,GAPOPEN_1,ISIZE) CALL MP_PUT_REAL4_ARRAY(MSGTYPE,ILINK,GAPELONG_1,ISIZE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,OPEN_1,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,ELONG_1,N_ONE) CALL MP_PUT_REAL4_ARRAY(MSGTYPE,ILINK,CONSWEIGHT_1,ISIZE) ISIZE=MAXSQ*NTRANS CALL MP_PUT_REAL4_ARRAY(MSGTYPE,ILINK,SIMMETRIC_1,ISIZE) IF (PROFILEMODE .EQ. 6) THEN ISIZE= NTRANS * NTRANS * MAXSTRSTATES * MAXIOSTATES * + MAXSTRSTATES*MAXIOSTATES CALL MP_PUT_REAL4_ARRAY(MSGTYPE,ILINK,SIMORG,ISIZE) ENDIF CALL MP_PUT_REAL4(MSGTYPE,ILINK,FILTER_VAL,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,PUNISH,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,CUTVALUE1,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,CUTVALUE2,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,SMIN,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,SMAX,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,MAPLOW,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,MAPHIGH,N_ONE) ISIZE=MAXSTRSTATES*MAXIOSTATES CALL MP_PUT_REAL4_ARRAY(MSGTYPE,ILINK,IORANGE,ISIZE) ISIZE=LEN(MP_MODEL) CALL MP_PUT_STRING(MSGTYPE,ILINK,MP_MODEL,ISIZE) ISIZE=LEN(SPLIT_DB_PATH) CALL MP_PUT_STRING(MSGTYPE,ILINK,SPLIT_DB_PATH,ISIZE) ISIZE=LEN(SPLIT_DB_DATA) CALL MP_PUT_STRING(MSGTYPE,ILINK,SPLIT_DB_DATA,ISIZE) ISIZE=LEN(SWISSPROT_SEQ) CALL MP_PUT_STRING(MSGTYPE,ILINK,SWISSPROT_SEQ,ISIZE) ISIZE=LEN(LISTFILE_2) CALL MP_PUT_STRING(MSGTYPE,ILINK,LISTFILE_2,ISIZE) ISIZE=LEN(CSQ_1) CALL MP_PUT_STRING(MSGTYPE,ILINK,CSQ_1,ISIZE) C Parsytec bug c isize=MAXSQ c call mp_put_string_array(msgtype,ilink,struc_1,isize) c call mp_put_string_array(msgtype,ilink,chainid_1,isize) ISIZE=N1 DO I=1,N1 CPARSYTEC_BUG(I:I)=STRUC_1(I) ENDDO CALL MP_PUT_STRING(MSGTYPE,ILINK,CPARSYTEC_BUG,ISIZE) DO I=1,N1 CPARSYTEC_BUG(I:I)=CHAINID_1(I) ENDDO CALL MP_PUT_STRING(MSGTYPE,ILINK,CPARSYTEC_BUG,ISIZE) ISIZE=LEN(OPENWEIGHT_ANSWER) CALL MP_PUT_STRING(MSGTYPE,ILINK,OPENWEIGHT_ANSWER,ISIZE) ISIZE=LEN(ELONGWEIGHT_ANSWER) CALL MP_PUT_STRING(MSGTYPE,ILINK,ELONGWEIGHT_ANSWER,ISIZE) ISIZE=LEN(SMIN_ANSWER) CALL MP_PUT_STRING(MSGTYPE,ILINK,SMIN_ANSWER,ISIZE) ISIZE=LEN(NAME_1) CALL MP_PUT_STRING(MSGTYPE,ILINK,NAME_1,ISIZE) ISIZE=LEN(HSSPID_2) CALL MP_PUT_STRING(MSGTYPE,ILINK,HSSPID_2,ISIZE) ISIZE=LEN(CSORTMODE) CALL MP_PUT_STRING(MSGTYPE,ILINK,CSORTMODE,ISIZE) ISIZE=LEN(METRICFILE) CALL MP_PUT_STRING(MSGTYPE,ILINK,METRICFILE,ISIZE) ISIZE=LEN(CURRENT_DIR) CALL MP_PUT_STRING(MSGTYPE,ILINK,CURRENT_DIR,ISIZE) ISIZE=LEN(DSSP_PATH) CALL MP_PUT_STRING(MSGTYPE,ILINK,DSSP_PATH,ISIZE) ISIZE=LEN(PDBPATH) CALL MP_PUT_STRING(MSGTYPE,ILINK,PDBPATH,ISIZE) ISIZE=LEN(PLOTFILE) CALL MP_PUT_STRING(MSGTYPE,ILINK,PLOTFILE,ISIZE) ISIZE=LEN(COREPATH) CALL MP_PUT_STRING(MSGTYPE,ILINK,COREPATH,ISIZE) ISIZE=LEN(COREFILE) CALL MP_PUT_STRING(MSGTYPE,ILINK,COREFILE,ISIZE) ISIZE=LEN(TRANS) CALL MP_PUT_STRING(MSGTYPE,ILINK,TRANS,ISIZE) ISIZE=LEN(STRTRANS) CALL MP_PUT_STRING(MSGTYPE,ILINK,STRTRANS,ISIZE) ISIZE=LEN(CSTRSTATES) CALL MP_PUT_STRING(MSGTYPE,ILINK,CSTRSTATES,ISIZE) ISIZE=LEN(CIOSTATES) CALL MP_PUT_STRING(MSGTYPE,ILINK,CIOSTATES,ISIZE) DO I=1,MAXSTRSTATES ISIZE=LEN(STR_CLASSES(I)) CALL MP_PUT_STRING(MSGTYPE,ILINK,STR_CLASSES(I),ISIZE) ENDDO c if ( lmixed_arch ) ilmixed_arch=1 IF ( LBACKWARD ) ILBACKWARD=1 IF ( LINSERT_2 ) ILINSERT_2=1 IF ( LISTOFSEQ_2 ) ILISTOFSEQ_2=1 IF ( LSHOW_SAMESEQ ) ILSHOW_SAMESEQ=1 IF ( LSWISSBASE ) ILSWISSBASE=1 IF ( LDSSP_1 ) ILDSSP_1=1 IF ( LCONSERV_1 ) ILCONSERV_1=1 IF ( LCONSERV_2 ) ILCONSERV_2=1 IF ( LCONSIMPORT ) ILCONSIMPORT=1 IF ( LALL ) ILALL=1 IF ( LFORMULA ) ILFORMULA=1 IF ( LTHRESHOLD ) ILTHRESHOLD=1 IF ( LCOMPSTR ) ILCOMPSTR=1 IF ( LPASS2 ) ILPASS2=1 IF ( LTRACE ) ILTRACE=1 IF ( LONG_OUT ) ILONG_OUT=1 IF ( LBATCH ) ILBATCH=1 IF ( L3WAY ) I3WAY=1 IF ( L3WAYDONE ) I3WAYDONE=1 IF ( LWARM_START ) IWARM_START=1 IF ( LBINARY ) IBINARY=1 C CALL MP_PUT_INT4(MSGTYPE,ILINK,ILMIXED_ARCH,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILBACKWARD,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILINSERT_2,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILISTOFSEQ_2,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILSHOW_SAMESEQ,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILSWISSBASE,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILDSSP_1,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILCONSERV_1,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILCONSERV_2,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILCONSIMPORT,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILALL,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILFORMULA,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILTHRESHOLD,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILCOMPSTR,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILPASS2,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILTRACE,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILONG_OUT,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILBATCH,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,I3WAY,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,I3WAYDONE,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,IWARM_START,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,IBINARY,N_ONE) ENDIF c if (mp_model .ne. 'PARIX') then c do ilink=1,nworker c WRITE(6,*)' send data to: ',link(ilink) c call mp_send_data(msgtype,link(ilink)) c enddo CALL MP_CAST(NWORKER,MSGTYPE,LINK(1)) c endif RETURN END C END SEND_MAXHOM_DATA C...................................................................... C...................................................................... C SUB SEQ_TO_INTEGER SUBROUTINE SEQ_TO_INTEGER(SEQ,LSEQ,NRES,TRANSPOS) C converts string of amino acid characters to amino acid integers. C uses integer table TRANSPOS C DOES NOT: internally converts DSSP SS bridges to 'C' before converting to C integer. Call "lower_to_cys" before calling this routine C input may contain funnies like '!' C output will be according to transpos IMPLICIT NONE C import CHARACTER*(*) SEQ INTEGER NRES INTEGER TRANSPOS(*) C export INTEGER LSEQ(*) C internal INTEGER I LOGICAL NOILLEGAL C NOILLEGAL=.TRUE. DO I=1,NRES LSEQ(I)=TRANSPOS ( ICHAR(SEQ(I:I)) ) IF (LSEQ(I) .LE. 0) THEN IF (NOILLEGAL) THEN NOILLEGAL=.FALSE. WRITE(6,'(A,I3,A,A,A1)')'*** ERROR SEQ_TO_INTEGER: '// + 'unk res/chain separator I=',I,' =',SEQ(I:I),'|' ENDIF ENDIF ENDDO RETURN END C END SEQ_TO_INTEGER C...................................................................... C...................................................................... C SUB SETBACK SUBROUTINE SETBACK(N1BEG,N1END,N2BEG,N2END,N2,LH1,LH2, + BESTVAL_CHECK) C----------------------------------------------------------------------- C reverse SETMATRIX (see comments there also) C here the matrix is processed in the backward direction C the best path value is stored in a temporary array MAX_ALL(), C NO traceback is stored (this is done in SETMATRIX) C the original matrix values are overwritten by the sum of the forward C and backward path value C this allows the computation of all pairs of residues i,j that C CAN BE PART of an optimal and suboptimal alignments. C NOTE: optimal value forward = optimal value backward C LH_F(i-1,j-1) + sim_val(i,j) + LH_B(i+1,j+1) = LH_FB C LH_FB is the score of an optimal alignment of sequence A and B C which is constrained to align residue i with residue j C All matrix values for THE optimal path have the same value after C this routine. The matrix values can be displayed as a 2-D or 3-D C graph showing how reliable the alignment is. C in contrast to Zuker its done in the same memory C see: Zuker M., Suboptimal sequence alignment in molecular biology C Alignment with error analysis C J.Mol.Biol. (1991) 221, 403-429 C C 1,1 C \ C \ C LH_B(i+1,j+1)= best value from backward path up to i,j C C \ LH_FB = LH_F + LH_B + sim_val(i,j) C optimal path value trough i,j C C LH_F(i-1,j-1)= best value from forward path up to i,j C \ C \ C \ C N1,N2 C C===================================================================== C====================================================================== IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C import C DIMENSIONS AND ACTUAL SEQ LENGTH INTEGER N1BEG,N1END,N2BEG,N2END,N2 REAL BESTVAL_CHECK C IMPORT/EXPORT REAL LH1(0:N1+1,0:N2+1) INTEGER*2 LH2(0:N1+1,0:N2+1) C REAL LH(0:N1+1,0:N2+1,2) C INTERNAL INTEGER NSIZE1,NSIZE2 REAL SUM REAL BESTVAL C INTEGER BESTII,BESTJJ INTEGER I,J,II,JJ,IBEG,IEND,IIBEG,JJBEG,K INTEGER NDIAGONAL,LEN_DIAG,IDIAG,ISMALL_DIM,IBIG_DIM LOGICAL LERROR CHARACTER CTEMP*50 c======================================================================= c initialize c======================================================================= c WRITE(6,*)' setback: ',profilemode II=0 NSIZE1=N1END-N1BEG+1 NSIZE2=N2END-N2BEG+1 K=MAX(N1+1,N2+1) DO I=0,K C DO I=0,MAXSQ+1 MAX_H(I)=0.0 MAX_V(I)=0.0 RIGHT_LH(I)=0.0 DOWN_LH(I)=0.0 DIAG_LH(I)=0.0 MAX_ALL(I)=0.0 ENDDO BESTVAL=0.0 C BESTII=0 ; BESTJJ=0 C====================================================================== NDIAGONAL=NSIZE1+NSIZE2-1 C NDIAGONAL=IPROFEND-IPROFBEG+1+N2-1 ISMALL_DIM=MIN(NSIZE1,NSIZE2) IBIG_DIM=MAX(NSIZE1,NSIZE2) IIBEG=N1END-1 JJBEG=N2END LEN_DIAG=0 DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .GT. NSIZE2) THEN IIBEG=IIBEG-1 ELSE JJBEG=JJBEG-1 ENDIF JJ=JJBEG+IIBEG C===================================================================== C PROFILE 1 (NO PROFILES OR PROFILE FOR FIRST SEQUENCE) C-------------------------------------------------------------------- IF (PROFILEMODE .LE. 1) THEN DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 c===================================================================== c store best value for horizontal deletion c===================================================================== IF ( ( (MAX_H(JJ-II) - ELONG_GAP_1(II+1)) .GE. + (RIGHT_LH(JJ-II)-OPEN_GAP_1(II+1)) ) .AND. + ( (MAX_H(JJ-II) - ELONG_GAP_1(II+1)) .GT. 0.0 ) ) THEN MAX_H(JJ-II)= (MAX_H(JJ-II) - ELONG_GAP_1(II+1)) ELSE IF (( (RIGHT_LH(JJ-II)-OPEN_GAP_1(II+1)) .GE. + (MAX_H(JJ-II) - ELONG_GAP_1(II+1)) ) .AND. + ( (RIGHT_LH(JJ-II)-OPEN_GAP_1(II+1)) .GT. 0.0)) THEN MAX_H(JJ-II)= (RIGHT_LH(JJ-II) - OPEN_GAP_1(II+1)) ELSE MAX_H(JJ-II)= 0.0 ENDIF c===================================================================== c store best value for vertical deletion c===================================================================== IF ( ( (MAX_V(II) - ELONG_GAP_1(II+1)) .GE. + (DOWN_LH(II) - OPEN_GAP_1(II+1)) ) .AND. + ( (MAX_V(II) - ELONG_GAP_1(II+1)) .GT. 0.0 ) ) THEN MAX_V(II)=(MAX_V(II) - ELONG_GAP_1(II+1)) ELSE IF ( ( (DOWN_LH(II) - OPEN_GAP_1(II+1)) .GE. + (MAX_V(II) - ELONG_GAP_1(II+1)) ) .AND. + ((DOWN_LH(II) - OPEN_GAP_1(II+1)) .GT. 0.0)) THEN MAX_V(II)= (DOWN_LH(II) - OPEN_GAP_1(II+1)) ELSE MAX_V(II)= 0.0 ENDIF c====================================================================== c which value is the best (diagonal,horizontal or vertical) C====================================================================== MAX_D(II)= DIAG_LH(JJ-II)+METRIC_1(II+1,LSQ_2(JJ-II+1)) MAX_ALL(II)=MAX(MAX_D(II),MAX_V(II),MAX_H(JJ-II),0.0) C set matrix value to forward path + backward path + sim_val LH1(II+1,JJ-II+1)= LH1(II+1,JJ-II+1) + MAX_D(II) c if ( lh1(ii+1,jj-ii+1) .ge. subopt_val) then c lh2(ii+1,jj-ii+1)= -1 * lh2(ii+1,jj-ii+1) c endif IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN LH1(II,II)= 0.0 C LH2(II,II)= 0 MAX_ALL(II)=0.0 ENDIF DIAG_LH(JJ-II)=DOWN_LH(II) RIGHT_LH(JJ-II)=MAX_ALL(II) DOWN_LH(II)=MAX_ALL(II) IF (BESTVAL .LT. MAX_ALL(II) ) THEN BESTVAL=MAX_ALL(II) C BESTII=II ; BESTJJ=JJ-II ENDIF ENDDO C-------------------------------------------------------------------- C profile 2 (profile for sequence 2) C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 2) THEN DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 IF ( (MAX_H(JJ-II) - ELONG_GAP_2(JJ-II+1)) .GT. + (RIGHT_LH(JJ-II)-OPEN_GAP_1(II+1)) .AND. + (MAX_H(JJ-II) - ELONG_GAP_2(JJ-II+1)) .GT. 0.0 ) THEN MAX_H(JJ-II) = (MAX_H(JJ-II) - ELONG_GAP_2(JJ-II+1)) ELSE IF ( (RIGHT_LH(JJ-II)-OPEN_GAP_1(II+1)) .GE. + (MAX_H(JJ-II) - ELONG_GAP_2(JJ-II+1)) .AND. + (RIGHT_LH(JJ-II)-OPEN_GAP_1(II+1)) .GT. 0.0) THEN MAX_H(JJ-II) = (RIGHT_LH(JJ-II) - OPEN_GAP_1(II+1)) ELSE MAX_H(JJ-II) = 0.0 ENDIF IF ( (MAX_V(II) - ELONG_GAP_2(JJ-II+1)) .GT. + (DOWN_LH(II) - OPEN_GAP_1(II+1)) .AND. + (MAX_V(II) - ELONG_GAP_2(JJ-II+1)) .GT. 0.0 ) THEN MAX_V(II) = (MAX_V(II) - ELONG_GAP_2(JJ-II+1)) ELSE IF ( (DOWN_LH(II) - OPEN_GAP_1(II+1)) .GE. + (MAX_V(II) - ELONG_GAP_2(JJ-II+1)) .AND. + (DOWN_LH(II) - OPEN_GAP_1(II+1)) .GT. 0.0) THEN MAX_V(II)= (DOWN_LH(II) - OPEN_GAP_1(II+1)) ELSE MAX_V(II) = 0.0 ENDIF MAX_D(II)= DIAG_LH(JJ-II)+METRIC_2(JJ-II+1,LSQ_1(II+1)) MAX_ALL(II)=MAX(MAX_D(II),MAX_V(II),MAX_H(JJ-II),0.0) C set matrix value to forward path + backward path + sim_val LH1(II+1,JJ-II+1)= LH1(II+1,JJ-II+1) + DIAG_LH(JJ-II) + + METRIC_2(JJ-II+1,LSQ_1(II+1)) IF ( LH1(II+1,JJ-II+1) .GE. SUBOPT_VAL) THEN LH2(II+1,JJ-II+1)= -1 * LH2(II+1,JJ-II+1) ENDIF DIAG_LH(JJ-II) = DOWN_LH(II) RIGHT_LH(JJ-II)= MAX_ALL(II) DOWN_LH(II) = MAX_ALL(II) IF (BESTVAL .LT. MAX_ALL(II) ) THEN BESTVAL=MAX_ALL(II) C BESTII=II ; BESTJJ=JJ-II ENDIF ENDDO c-------------------------------------------------------------------- c full profile alignment C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 3) THEN DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II)= MAX_H(JJ-II)- + (( ELONG_GAP_1(II+1)+ ELONG_GAP_2(JJ-II+1))* 0.5) IF ( (RIGHT_LH(JJ-II)- + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) *0.5 )) + .GE. MAX_H(JJ-II) .AND. + (RIGHT_LH(JJ-II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) *0.5 )) + .GT. 0.0) THEN MAX_H(JJ-II) = (RIGHT_LH(JJ-II) - + ( (OPEN_GAP_1(II+1)+OPEN_GAP_2(JJ-II+1)) *0.5 )) ELSE IF ( MAX_H(JJ-II) .LE. 0.0) THEN MAX_H(JJ-II) = 0.0 ENDIF MAX_V(II)= MAX_V(II)- + ( (ELONG_GAP_1(II+1)+ ELONG_GAP_2(JJ-II+1))* 0.5) IF ( (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) * 0.5 )) + .GE. MAX_V(II) .AND. + (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) * 0.5 )) + .GT. 0.0) THEN MAX_V(II)= (DOWN_LH(II) - + ((OPEN_GAP_1(II+1)+OPEN_GAP_2(JJ-II+1)) * 0.5 )) ELSE IF ( MAX_V(II) .LE. 0.0) THEN MAX_V(II) = 0.0 ENDIF SUM=0.0 DO K=1,NTRANS SUM = SUM + ( METRIC_1(II+1,K) * METRIC_2(JJ-II+1,K) ) ENDDO C MAX_D(II) = DIAG_LH(JJ-II) + (SUM/NTRANS) MAX_D(II) = DIAG_LH(JJ-II) + SUM C SET MATRIX VALUE TO FORWARD PATH + BACKWARD PATH + SIM_VAL LH1(II+1,JJ-II+1)= LH1(II+1,JJ-II+1) + DIAG_LH(JJ-II) + + SUM IF ( LH1(II+1,JJ-II+1) .GE. SUBOPT_VAL) THEN LH2(II+1,JJ-II+1)= -1 * LH2(II+1,JJ-II+1) ENDIF MAX_ALL(II)=MAX(MAX_D(II),MAX_V(II),MAX_H(JJ-II),0.0) DIAG_LH(JJ-II) = DOWN_LH(II) RIGHT_LH(JJ-II)= MAX_ALL(II) DOWN_LH(II) = MAX_ALL(II) IF (BESTVAL .LT. MAX_ALL(II) ) THEN BESTVAL=MAX_ALL(II) C BESTII=II ; BESTJJ=JJ-II ENDIF ENDDO c-------------------------------------------------------------------- c take sequences as representatives of family c-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 4) THEN DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II)= MAX_H(JJ-II)- + ( (ELONG_GAP_1(II+1)+ELONG_GAP_2(JJ-II+1)) *0.5) IF ( (RIGHT_LH(JJ-II)- + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) *0.5 )) + .GE. MAX_H(JJ-II) .AND. + (RIGHT_LH(JJ-II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) *0.5 )) + .GT. 0.0) THEN MAX_H(JJ-II) = (RIGHT_LH(JJ-II) - + ( (OPEN_GAP_1(II+1)+OPEN_GAP_2(JJ-II+1)) *0.5 )) ELSE IF ( MAX_H(JJ-II) .LE. 0.0) THEN MAX_H(JJ-II) = 0.0 ENDIF MAX_V(II)= MAX_V(II)- + ( (ELONG_GAP_1(II+1)+ ELONG_GAP_2(JJ-II+1))* 0.5) IF ( (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) * 0.5 )) + .GE. MAX_V(II) .AND. + (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) * 0.5 )) + .GT. 0.0) THEN MAX_V(II)= (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1)+OPEN_GAP_2(JJ-II+1)) * 0.5 )) ELSE IF ( MAX_V(II) .LE. 0.0) THEN MAX_V(II) = 0.0 ENDIF MAX_D(II)= DIAG_LH(JJ-II) + + (( METRIC_1 (II+1,LSQ_2(JJ-II+1)) + + METRIC_2 (JJ-II+1,LSQ_1(II+1)) ) * 0.5) MAX_ALL(II)=MAX(MAX_D(II),MAX_V(II),MAX_H(JJ-II),0.0) C SET MATRIX VALUE TO FORWARD PATH + BACKWARD PATH + SIM_VAL LH1(II+1,JJ-II+1)= LH1(II+1,JJ-II+1) + DIAG_LH(JJ-II) + + (( METRIC_1 (II+1,LSQ_2(JJ-II+1)) + + METRIC_2 (JJ-II+1,LSQ_1(II+1)) ) * 0.5) IF ( LH1(II+1,JJ-II+1) .GE. SUBOPT_VAL) THEN LH2(II+1,JJ-II+1)= -1.0 * LH2(II+1,JJ-II+1) ENDIF DIAG_LH(JJ-II) = DOWN_LH(II) RIGHT_LH(JJ-II)= MAX_ALL(II) DOWN_LH(II) = MAX_ALL(II) IF (BESTVAL .LT. MAX_ALL(II) ) THEN BESTVAL=MAX_ALL(II) C BESTII=II ; BESTJJ=JJ-II ENDIF ENDDO C-------------------------------------------------------------------- C take maximal value as consensus C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 5) THEN DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II)=MAX_H(JJ-II) - + ((ELONG_GAP_1(II+1)+ELONG_GAP_2(JJ-II+1))*0.5) IF ( (RIGHT_LH(JJ-II)- + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) *0.5 )) + .GE. MAX_H(JJ-II) .AND. + (RIGHT_LH(JJ-II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) *0.5 )) + .GT. 0.0) THEN MAX_H(JJ-II) = (RIGHT_LH(JJ-II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) *0.5 )) ELSE IF ( MAX_H(JJ-II) .LE. 0.0) THEN MAX_H(JJ-II) = 0.0 ENDIF MAX_V(II)= MAX_V(II)- + ( (ELONG_GAP_1(II+1)+ ELONG_GAP_2(JJ-II+1))* 0.5) IF ( (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) * 0.5 )) + .GE. MAX_V(II) .AND. + (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) * 0.5 )) + .GT. 0.0) THEN MAX_V(II)= (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) * 0.5 )) ELSE IF ( MAX_V(II) .LE. 0.0) THEN MAX_V(II) = 0.0 ENDIF MAX_D(II) = DIAG_LH(JJ-II) + + ((MAX_METRIC_1_VAL(II+1) + MAX_METRIC_2_VAL(II+1)) * 0.5) MAX_ALL(II)=MAX(MAX_D(II),MAX_V(II),MAX_H(JJ-II),0.0) C SET MATRIX VALUE TO FORWARD PATH + BACKWARD PATH + SIM_VAL LH1(II+1,JJ-II+1)= LH1(II+1,JJ-II+1) + DIAG_LH(JJ-II) + + ((MAX_METRIC_1_VAL(II+1) + MAX_METRIC_2_VAL(II+1)) * 0.5) IF ( LH1(II+1,JJ-II+1) .GE. SUBOPT_VAL) THEN LH2(II+1,JJ-II+1)= -1 * LH2(II+1,JJ-II+1) ENDIF DIAG_LH(JJ-II) = DOWN_LH(II) RIGHT_LH(JJ-II)= MAX_ALL(II) DOWN_LH(II) = MAX_ALL(II) IF (BESTVAL+0.0001 .LT. MAX_ALL(II) ) THEN BESTVAL=MAX_ALL(II) C BESTII=II ; BESTJJ=JJ-II ENDIF ENDDO C==================================================================== C PROFILE MODE SELECTION END ENDIF C======================================================================= IF (LSAMESEQ) THEN I=II IF (II .LE. 0)I=JJ-II LH1(I,I) = 0.0 RIGHT_LH(I)= 0.0 DOWN_LH(I) = 0.0 ENDIF C==================================================================== C next antidiagonal C==================================================================== ENDDO C==================================================================== c WRITE(6,*)' SETBACK: ',BESTVAL,BESTII,BESTJJ C write data for SciAn, XPrism3... IF (ABS(BESTVAL_CHECK - BESTVAL) .GT. 0.01) THEN WRITE(6,*)'*** FATAL ERROR in SETBACK' WRITE(6,*)' bestval_check .ne. bestval: ', + BESTVAL_CHECK,BESTVAL STOP ENDIF CTEMP=' ' WRITE(CTEMP,*) '(',N2,'(F7.2))' CALL STRPOS(CTEMP,IBEG,IEND) CALL OPEN_FILE(99,'matrix.dat','new,recl=20000',lerror) DO I=1,N1 WRITE(99,CTEMP(IBEG:IEND)) ( LH1(I,J),J=1,N2) ENDDO CLOSE(99) C==================================================================== RETURN END C END SETBACK C...................................................................... C...................................................................... C SUB SETMATRIX SUBROUTINE SETMATRIX(N1BEG,N1END,N2BEG,N2END,N2,LH1,LH2) C -------------------------------------------------------- C subroutine SETMATRIX finds LH matrix for maximum homologous C subsequence between any two sequences C generate the homology and traceback matrix C----------------------------------------------------------------------- C LH(.,.,1) is homology score C LH(.,.,2) is traceback value C encoding LDIREC and LDEL: DIREC + LDEL C LH(I,J,1) corresponds to seq postions II=I-1, JJ=J-1 C LH(1,.,1) and LH(.,1,1) are terminal margins C LDIREC 10000,20000,30000,40000 for termination,diagonal,vertical,horizontal C LDEL length of deletion C temporary values: C MAX_H(),MAX_V() best value for horizontal and verical deletions C LDEL_H,LDEL_V length of horizontal and vertical deletion C====================================================================== C JULY 1991 (RS) C MAXDEL restriction removed C see: O. Gotoh, An Improved Algorithm for Matching Biological C Sequences, JMB (1982) 162, 705-708 C----------------------------------------------------------------------- C JUNE 1991 (RS) C matrix setting in a antidiagonal way to run it in parallel C see: Jones R. et.al., Protein Sequence Comparison on the Connection C Machine CM-2, in: Computers and DNA, SFI Studies in the Sciences C of Complexity, Vol VII, Addison-Wesley, 1990 C====================================================================== C C ANTIDIAGONAL SETTING OF THE MATRIX C ================================== C N1,N2: length of sequence 1 and sequence 2 C ADVANTAGE: loop can run in parallel or vectorized C C C ICOUNT 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 C -----------------------------------------------------> C | sequence 1 ====> N1 C | 2345678901234567890123456789 <==IIBEG C | ------------------------------ C 1 | sequence2 | //// / /| C 2 | | 2 |//// / / | C 3 | | 3 |/// / / /| <== JJBEG C 4 | | 4 |// / / //| C 5 | v 5 |/ / / ///| C 6 | 6 | / / ////| C 7 | 7 | / / /////| C 8 | N2 8 |/ / //////| C 9 | -----------------------------| C 10 | C 11 | C 12 | C 13 | C V C C===================================================================== C at each position take the best value of: C C LH(i,j,1)= MAX( LH(i-1,j-1,1) + SIM(i,j) , MAX_H(j) ,MAX_V(i) ,0) C C LH(i-1,j-1,1) : best value of diagonal (no INDEL) C SIM(i,j) : similarity value for position i,j C MAX_H(j) : best value of horizontal INDELs C MAX_V(i) : best value of vertical INDELs C where: C MAX_H(i)=MAX( LH(I-1,J,1) - gap-open , MAX_H(i-1) - gap-elongation , 0) C MAX_V(j)=MAX( LH(I,J-1,1) - gap-open , MAX_V(j-1) - gap-elongation , 0) C NOTE: one has to store the length of the deletion for MAX_H() and MAX_V() C in LDEL_H(j) and LDEL_V() C C C NOTE: C 1) if no INDEL(s) in secondary structure allowed: C GAPOPEN contains PUNISH C 2) internal deletions are (postion dependent ) weighted as: C GAPOPEN + GAPELONG *LENGTH C 3) conservation weights: C gap penalties are dependent on sequence-position(s), so weight C gap-penalties with conservation-weights otherwise the gap penalties C in regions with low conservation are too big C 4) antidiagonal matrix setting: C position in sequence 2 is JJBEG+IIBEG-II: step back in sequence 1 and C down in sequence 2 C C 5) NOT LONGER VALID C if the MAXDEL option is set, one has to check if the number of C INDEL's exceeds the MAXDEL value. C In addition: when the value for opening a gap is higher than C for the elongation, we have to check if the previous length of C the gap is not greater than 0. C That means that for some special cases it's cheaper to punish C the alignment by some open-penalties in a row than to elongate C or continue the alignment in the diagonal. C open a gap if: C 1.) OPEN .gt. ELONG or C 2.) LDELx()+1 .ge. MAXDEL C 3.) but only if LDELx() .eq. 0 C====================================================================== IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C import C ACTUAL SEQ LENGTH INTEGER N1BEG,N1END,N2BEG,N2END,N2 c export REAL LH1(0:N1+1,0:N2+1) INTEGER*2 LH2(0:N1+1,0:N2+1) c real lh(0:n1+1,0:n2+1,2) c internal INTEGER NSIZE1,NSIZE2 REAL SUM,XMAX1,XMAX2 INTEGER I,J,K,NDAMP,NDIAGONAL,ISMALL_DIM,IBIG_DIM,IIBEG,JJBEG INTEGER LEN_DIAG,IDIAG,II,JJ C======================================================================= C DO SOME STUFF OUTSIDE THE LOOPS: C======================================================================= C initialize C======================================================================= NSIZE1=N1END-N1BEG+1 NSIZE2=N2END-N2BEG+1 DO I=N1BEG-1,N1END+1 LH1(I,N2BEG-1)=0.0 LH1(I,N2BEG)=0.0 ENDDO DO J=N2BEG-1,N2END+1 LH1(N1BEG-1,J)=0.0 LH1(N1BEG,J)=0.0 ENDDO C DO I=0,N1+1 LH(I,0,1)=0.0 ; LH(I,1,1)=0.0 ; ENDDO C DO J=0,N2+1 ; LH(0,J,1)=0.0 ; LH(1,J,1)=0.0 ; ENDDO J=MIN(N1BEG-1,N2BEG-1) K=MAX(N1END+1,N2END+1) DO I=J,K C DO I=0,MAXSQ+1 MAX_H(I)=0.0 MAX_V(I)=0.0 LDEL_H(I)=0 LDEL_V(I)=0 LEFT_LH(I)=0.0 UP_LH(I)=0.0 DIAG_LH(I)=0.0 ENDDO C======================================================================= C update the metric values (weights) C this can be done outside the main parallel loop C with this we save at lot of multiplications in the parallel loop C the update can be done in concurrent/vectorized mode C======================================================================= NDAMP=1 IF (PROFILEMODE .EQ. 6) THEN IF (LCONSERV_1) THEN DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1(I) * CONSWEIGHT_1(I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) * CONSWEIGHT_1(I) ENDDO C DAMP PENALTIES CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ELSE DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1 (I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) ENDDO CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ENDIF IF (LCONSERV_2) THEN DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) * CONSWEIGHT_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) * CONSWEIGHT_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ELSE DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ENDIF C============================= ELSE IF (PROFILEMODE .NE. 2) THEN IF (LCONSERV_1) THEN DO K=1,NTRANS DO I=N1BEG,N1END METRIC_1(I,K) = SIMMETRIC_1(I,K) * CONSWEIGHT_1(I) ENDDO ENDDO DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1(I) * CONSWEIGHT_1(I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) * CONSWEIGHT_1(I) ENDDO c damp penalties CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ELSE DO K=1,NTRANS DO I=N1BEG,N1END METRIC_1(I,K) = SIMMETRIC_1(I,K) ENDDO ENDDO DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1 (I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) ENDDO CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ENDIF ENDIF IF (PROFILEMODE .GE. 2) THEN IF (LCONSERV_2) THEN DO K=1,NTRANS DO I=N2BEG,N2END METRIC_2(I,K) = SIMMETRIC_2(I,K) * CONSWEIGHT_2(I) ENDDO ENDDO DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) * CONSWEIGHT_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) * CONSWEIGHT_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ELSE DO K=1,NTRANS DO I=N2BEG,N2END METRIC_2(I,K) = SIMMETRIC_2(I,K) ENDDO ENDDO DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ENDIF ENDIF IF (PROFILEMODE .EQ. 5) THEN DO I=N1BEG,N1END MAX_METRIC_1_VAL(I)=-10000.0 DO K=1,NTRANS MAX_METRIC_1_VAL(I)= + MAX(METRIC_1(I,K),MAX_METRIC_1_VAL(I)) ENDDO ENDDO DO J=N2BEG,N2END MAX_METRIC_2_VAL(J)=-10000.0 DO K=1,NTRANS MAX_METRIC_2_VAL(J)= + MAX(METRIC_2(J,K),MAX_METRIC_2_VAL(J)) ENDDO ENDDO ENDIF IF ( PROFILEMODE .EQ. 3 ) THEN DO I=N1BEG,N1END SUM=0.0 DO K=1,NTRANS SUM= SUM + ( METRIC_1(I,K) * METRIC_1(I,K) ) ENDDO SUM= SQRT(SUM) DO K=1,NTRANS METRIC_1(I,K)= METRIC_1(I,K) / SUM ENDDO ENDDO DO I=N2BEG,N2END SUM=0.0 DO K=1,NTRANS SUM= SUM + ( METRIC_2(I,K) * METRIC_2(I,K) ) ENDDO SUM= SQRT(SUM) DO K=1,NTRANS METRIC_2(I,K)= METRIC_2(I,K) / SUM ENDDO ENDDO ENDIF c====================================================================== NDIAGONAL=NSIZE1+NSIZE2-1 c ndiagonal=iprofend-iprofbeg+1+n2-1 c WRITE(6,'(A,I6)')' NUMBER OF ANTIDIAGONALS: ',NDIAGONAL ISMALL_DIM=MIN(NSIZE1,NSIZE2) IBIG_DIM=MAX(NSIZE1,NSIZE2) IIBEG=N1BEG JJBEG=N2BEG+1 LEN_DIAG=0 C===================================================================== C profile 1 (no profiles or profile for first sequence) C-------------------------------------------------------------------- IF (PROFILEMODE .LE. 1) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG C==================================================================== C THIS LOOP CAN BE EXECUTED IN VECTOR-MODE C====================================================================== C compiler directives for vector C---------------------------------------------------------------------- DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 C====================================================================== C values for diagonal, horizontal and vertical (open and elongation) C===================================================================== C store best value and length for horizontal deletion C===================================================================== MAX_H(JJ-II) = (MAX_H(JJ-II) - ELONG_GAP_1(II-1)) IF ((MAX_H(JJ-II) .GE.(LEFT_LH(JJ-II)-OPEN_GAP_1(II-1))) + .AND. (MAX_H(JJ-II) .GT.0.0 )) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF (((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)) .GE. + MAX_H(JJ-II)) + .AND. ((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)) .GT. + 0.0)) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II)= (LEFT_LH(JJ-II) - OPEN_GAP_1(II-1)) ELSE MAX_H(JJ-II)= 0.0 LDEL_H(JJ-II)= 0 ENDIF c===================================================================== c store best value and length for vertical deletion c===================================================================== MAX_V(II) = (MAX_V(II) - ELONG_GAP_1(II-1)) IF ((MAX_V(II).GE.(UP_LH(II) - OPEN_GAP_1(II-1))) .AND. + ( MAX_V(II) .GT. 0.0) ) THEN LDEL_V(II)= LDEL_V(II) + 1 ELSE IF (((UP_LH(II) - OPEN_GAP_1(II-1)).GE.MAX_V(II)) + .AND.((UP_LH(II)- OPEN_GAP_1(II-1)).GT. + 0.0)) THEN MAX_V(II)= (UP_LH(II) - OPEN_GAP_1(II-1)) LDEL_V(II)=1 ELSE MAX_V(II)= 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II)= DIAG_LH(JJ-II)+METRIC_1(II-1,LSQ_2(JJ-II-1)) IF ( (MAX_D(II) .GE. MAX_V(II) ) .AND. + (MAX_D(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_D(II) .GT. 0.0 )) THEN LH1(II,JJ-II)= MAX_D(II) ELSE IF ( (MAX_V(II) .GE. MAX_D(II) ) .AND. + (MAX_V(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_V(II) .GT. 0.0 )) THEN LH1(II,JJ-II)= MAX_V(II) LH2(II,JJ-II)= 10000 + LDEL_V(II) ELSE IF ( (MAX_H(JJ-II) .GE. MAX_D(II)) .AND. + (MAX_H(JJ-II) .GE. MAX_V(II)) .AND. + (MAX_H(JJ-II) .GT. 0.0 )) THEN LH1(II,JJ-II)= MAX_H(JJ-II) LH2(II,JJ-II)= 20000 + LDEL_H(JJ-II) ELSE LH1(II,JJ-II)= 0.0 LH2(II,JJ-II)= 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN LH1(II,II)= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= LH1(II,JJ-II) UP_LH(II)= LH1(II,JJ-II) ENDDO c if (lsameseq) then c x= ( float(iibeg)/ 2.0) + (float(jjbeg)/2.0) c i=nint(x) c lh1(i,i) = 0.0 ; lh2(i,i)= 0 c left_lh(i)= 0.0 c up_lh(i) = 0.0 c WRITE(6,*)iibeg,jjbeg,i c endif C==================================================================== C next antidiagonal C==================================================================== ENDDO C-------------------------------------------------------------------- C profile 2 (profile for sequence 2) C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 2) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II) = (MAX_H(JJ-II) - ELONG_GAP_2(JJ-II-1)) IF (MAX_H(JJ-II) .GT. + (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)) .AND. + MAX_H(JJ-II) .GT.0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ( (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)) .GE. + MAX_H(JJ-II) .AND. + (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)).GT. + 0.0) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF c===================================================================== c store best value and length for vertical deletion c===================================================================== MAX_V(II) = (MAX_V(II) - ELONG_GAP_2(JJ-II-1)) IF (MAX_V(II).GT.(UP_LH(II) - OPEN_GAP_2(JJ-II-1)).AND. + MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ((UP_LH(II)-OPEN_GAP_2(JJ-II-1)) .GE. MAX_V(II) + .AND.(UP_LH(II)-OPEN_GAP_2(JJ-II-1)).GT.0.0) THEN MAX_V(II)= (UP_LH(II) - OPEN_GAP_2(JJ-II-1)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF c====================================================================== c which value is the best (diagonal,horizontal or vertical) c====================================================================== MAX_D(II)= DIAG_LH(JJ-II)+METRIC_2(JJ-II-1,LSQ_1(II-1)) IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_V(II) LH2(II,JJ-II) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_H(JJ-II) LH2(II,JJ-II) = 20000 + LDEL_H(JJ-II) ELSE LH1(II,JJ-II) = 0.0 LH2(II,JJ-II) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN LH1(II,II)= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= LH1(II,JJ-II) UP_LH(II) = LH1(II,JJ-II) ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO c-------------------------------------------------------------------- C full profile alignment C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 3) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 SUM=0.0 XMAX1=0.0 XMAX2=0.0 DO K=1,NTRANS SUM = SUM + ( METRIC_1(II-1,K) * METRIC_2(JJ-II-1,K) ) IF ( ( METRIC_1(II-1,K) * METRIC_2(JJ-II-1,K) ) + .GT. XMAX1 ) THEN XMAX1 = ( METRIC_1(II-1,K) * METRIC_2(JJ-II-1,K) ) ENDIF ENDDO OPEN_GAP_1(II-1) = OPEN_GAP_1(II-1) * XMAX1 ELONG_GAP_1(II-1) = ELONG_GAP_1(II-1) * XMAX1 OPEN_GAP_2(JJ-II-1) = OPEN_GAP_2(JJ-II-1) * XMAX1 ELONG_GAP_2(JJ-II-1) = ELONG_GAP_2(JJ-II-1) * XMAX1 MAX_H(JJ-II)= MAX_H(JJ-II)- + (( ELONG_GAP_1(II-1)+ ELONG_GAP_2(JJ-II-1))* 0.5) IF (MAX_H(JJ-II) .GT. (LEFT_LH(JJ-II)- + (( OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .AND. MAX_H(JJ-II) .GT.0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ( (LEFT_LH(JJ-II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GE. MAX_H(JJ-II) .AND. (LEFT_LH(JJ-II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GT.0.0) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF C===================================================================== C store best value and length for vertical deletion C===================================================================== MAX_V(II)= MAX_V(II)- + (( ELONG_GAP_1(II-1)+ ELONG_GAP_2(JJ-II-1))* 0.5) IF ( MAX_V(II) .GT. (UP_LH(II)- + (( OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) .AND. + MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ( (UP_LH(II)- (( OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)) .GE. MAX_V(II) + .AND. (UP_LH(II)- ((OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)).GT.0.0) THEN MAX_V(II)= (UP_LH(II)- + (( OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II) = DIAG_LH(JJ-II) + SUM IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_V(II) LH2(II,JJ-II) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_H(JJ-II) LH2(II,JJ-II) = 20000 + LDEL_H(JJ-II) ELSE LH1(II,JJ-II) = 0.0 LH2(II,JJ-II) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN LH1(II,II)= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= LH1(II,JJ-II) UP_LH(II) = LH1(II,JJ-II) ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C-------------------------------------------------------------------- C take sequences as representatives of family C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 4) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II)= MAX_H(JJ-II)- + ( (ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1)) *0.5) IF (MAX_H(JJ-II) .GT. (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .AND. MAX_H(JJ-II) .GT.0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ( (LEFT_LH(JJ-II) - ((OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)) .GE. MAX_H(JJ-II) + .AND. (LEFT_LH(JJ-II) - ((OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)) .GT. 0.0 ) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF C===================================================================== C store best value and length for vertical deletion C===================================================================== MAX_V(II)= (MAX_V(II)- + ( (ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1)) *0.5)) IF ( MAX_V(II) .GT. (UP_LH(II)- + ((OPEN_GAP_1(II-1) +OPEN_GAP_2(JJ-II-1)) *0.5)) + .AND. MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ( (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)).GE. + MAX_V(II) + .AND. (UP_LH(II)- ((OPEN_GAP_1(II-1) + + OPEN_GAP_2(JJ-II-1)) *0.5)) .GT. 0.0 ) THEN MAX_V(II)=(UP_LH(II)- + ((OPEN_GAP_1(II-1) +OPEN_GAP_2(JJ-II-1)) *0.5)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II)= DIAG_LH(JJ-II) + + (( METRIC_1 (II-1,LSQ_2(JJ-II-1)) + + METRIC_2 (JJ-II-1,LSQ_1(II-1)) ) * 0.5) IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_V(II) LH2(II,JJ-II) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_H(JJ-II) LH2(II,JJ-II) = 20000 + LDEL_H(JJ-II) ELSE LH1(II,JJ-II) = 0.0 LH2(II,JJ-II) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN LH1(II,II)= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= LH1(II,JJ-II) UP_LH(II) = LH1(II,JJ-II) ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C-------------------------------------------------------------------- C take maximal value as consensus C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 5) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II)=MAX_H(JJ-II) - + ((ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1))*0.5) IF (MAX_H(JJ-II) .GT. (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .AND. MAX_H(JJ-II) .GT. 0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ((LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)).GE. + MAX_H(JJ-II) .AND. (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GT. 0.0) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF C===================================================================== C store best value and length for vertical deletion C===================================================================== MAX_V(II)= MAX_V(II) - + ( (ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1))*0.5) IF ( MAX_V(II) .GT. (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) .AND. + MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ((UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)).GE. + MAX_V(II) .AND. (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GT. 0.0) THEN MAX_V(II)= (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II) = DIAG_LH(JJ-II) + + ((MAX_METRIC_1_VAL(II-1)+MAX_METRIC_2_VAL(II-1))*0.5) IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_V(II) LH2(II,JJ-II) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_H(JJ-II) LH2(II,JJ-II) = 20000 + LDEL_H(JJ-II) ELSE LH1(II,JJ-II) = 0.0 LH2(II,JJ-II) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN LH1(II,II)= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= LH1(II,JJ-II) UP_LH(II) = LH2(II,JJ-II) ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C==================================================================== ELSE IF (PROFILEMODE .EQ. 6) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 C===================================================================== MAX_H(JJ-II) = (MAX_H(JJ-II) - ELONG_GAP_1(II-1)) IF ((MAX_H(JJ-II) .GE.(LEFT_LH(JJ-II)-OPEN_GAP_1(II-1))) + .AND. (MAX_H(JJ-II) .GT.0.0 )) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF (((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)) .GE. + MAX_H(JJ-II)) + .AND.((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)).GT. + 0.0)) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II)= (LEFT_LH(JJ-II) - OPEN_GAP_1(II-1)) ELSE MAX_H(JJ-II)= 0.0 LDEL_H(JJ-II)= 0 ENDIF c===================================================================== MAX_V(II) = (MAX_V(II) - ELONG_GAP_1(II-1)) IF ( (MAX_V(II) .GE. (UP_LH(II) - OPEN_GAP_1(II-1))) + .AND.( MAX_V(II) .GT. 0.0) ) THEN LDEL_V(II)= LDEL_V(II) + 1 ELSE IF (((UP_LH(II) - OPEN_GAP_1(II-1)).GE. MAX_V(II)) + .AND. ((UP_LH(II) - OPEN_GAP_1(II-1)) .GT. + 0.0)) THEN MAX_V(II)= (UP_LH(II) - OPEN_GAP_1(II-1)) LDEL_V(II)=1 ELSE MAX_V(II)= 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== MAX_D(II)= DIAG_LH(JJ-II) + + SIMORG(LSQ_1(II-1),LSQ_2(JJ-II-1),LSTRCLASS_1(II-1), + LACC_1(II-1),LSTRCLASS_2(JJ-II-1), + LACC_2(JJ-II-1) ) IF ( (MAX_D(II) .GE. MAX_V(II) ) .AND. + (MAX_D(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_D(II) .GT. 0.0 )) THEN LH1(II,JJ-II)= MAX_D(II) ELSE IF ( (MAX_V(II) .GE. MAX_D(II) ) .AND. + (MAX_V(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_V(II) .GT. 0.0 )) THEN LH1(II,JJ-II)= MAX_V(II) LH2(II,JJ-II)= 10000 + LDEL_V(II) ELSE IF ( (MAX_H(JJ-II) .GE. MAX_D(II)) .AND. + (MAX_H(JJ-II) .GE. MAX_V(II)) .AND. + (MAX_H(JJ-II) .GT. 0.0 )) THEN LH1(II,JJ-II)= MAX_H(JJ-II) LH2(II,JJ-II)= 20000 + LDEL_H(JJ-II) ELSE LH1(II,JJ-II)= 0.0 LH2(II,JJ-II)= 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN LH1(II,II)= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= LH1(II,JJ-II) UP_LH(II)= LH1(II,JJ-II) ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C==================================================================== C PROFILE MODE SELECTION END ENDIF C==================================================================== C debug: output the LH (values and trace-back)matrix c call open_file(99,'matrix.dat','new,recl=2000',lerror) c nii=n1+1 ; njj=n2+1 c write(99,*) 'H-MATRIX Hij' c write(99,*)'Index i for Seq. 1' ; write(99,*)'Index j for Seq. 2' c do i=1,nii c write(99,'(i6)')i ; write(99,'(2x,20(i6))')(lh1(i,j),j=1,njj) c enddo c write(99,*)'TRACE-BACK MATRIX' c do i=1,nii c write(99,'(i6)')i ; write(99,'(2x,20(i6))')(lh2(i,j),j=1,njj) c enddo c close(99) C C write data for XPrism3 c call open_file(99,'xprism3.dat','new',lerror) c do I=0,N1+1 c write(99,*) (lh1(i,j),J=0,N2+1) c enddo c do I=0,N1+1 ; do J=0,N2+1 cc write(99,'(2x,i5,2x,i4,f7.2)')i,j,lh1(i,j) cc write(99,'(2x,i5,2x,i4,f7.2)')i,j,lh1(i,j) c trace back cc write(99,'(2x,i5,2x,i4,f7.2,1x,i6)')i,j,lh1(i,j),lh2(i,j) c ENDDO; enddo c close(99) C======================================================================= RETURN END C END SETMATRIX C...................................................................... C...................................................................... C SUB SETMATRIX_FAST SUBROUTINE SETMATRIX_FAST(N1BEG,N1END,N2BEG,N2END,N2,LH2, + BESTVAL,BESTIIPOS,BESTJJPOS) C -------------------------------------------------------- C subroutine SETMATRIX_fast finds LH matrix for maximum homologous C subsequence between any two sequences C generate the homology and traceback matrix C----------------------------------------------------------------------- C LH(.,.,1) is homology score C LH(.,.,2) is traceback value C encoding LDIREC and LDEL: DIREC + LDEL C LH(I,J,1) corresponds to seq postions II=I-1, JJ=J-1 C LH(1,.,1) and LH(.,1,1) are terminal margins C LDIREC 10000,20000,30000,40000 for termination,diagonal,vertical,horizontal C LDEL length of deletion C temporary values: C MAX_H(),MAX_V() best value for horizontal and verical deletions C LDEL_H,LDEL_V length of horizontal and vertical deletion C====================================================================== C JULY 1991 (RS) C MAXDEL restriction removed C see: O. Gotoh, An Improved Algorithm for Matching Biological C Sequences, JMB (1982) 162, 705-708 C----------------------------------------------------------------------- C JUNE 1991 (RS) C matrix setting in a antidiagonal way to run it in parallel C see: Jones R. et.al., Protein Sequence Comparison on the Connection C Machine CM-2, in: Computers and DNA, SFI Studies in the Sciences C of Complexity, Vol VII, Addison-Wesley, 1990 C====================================================================== C C ANTIDIAGONAL SETTING OF THE MATRIX C ================================== C N1,N2: length of sequence 1 and sequence 2 C ADVANTAGE: loop can run in parallel or vectorized C C C ICOUNT 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 C -----------------------------------------------------> C | sequence 1 ====> N1 C | 2345678901234567890123456789 <==IIBEG C | ------------------------------ C 1 | sequence2 | //// / /| C 2 | | 2 |//// / / | C 3 | | 3 |/// / / /| <== JJBEG C 4 | | 4 |// / / //| C 5 | v 5 |/ / / ///| C 6 | 6 | / / ////| C 7 | 7 | / / /////| C 8 | N2 8 |/ / //////| C 9 | -----------------------------| C 10 | C 11 | C 12 | C 13 | C V C C===================================================================== C at each position take the best value of: C C LH(i,j,1)= MAX( LH(i-1,j-1,1) + SIM(i,j) , MAX_H(j) ,MAX_V(i) ,0) C C LH(i-1,j-1,1) : best value of diagonal (no INDEL) C SIM(i,j) : similarity value for position i,j C MAX_H(j) : best value of horizontal INDELs C MAX_V(i) : best value of vertical INDELs C where: C MAX_H(i)=MAX( LH(I-1,J,1) - gap-open , MAX_H(i-1) - gap-elongation , 0) C MAX_V(j)=MAX( LH(I,J-1,1) - gap-open , MAX_V(j-1) - gap-elongation , 0) C NOTE: one has to store the length of the deletion for MAX_H() and MAX_V() C in LDEL_H(j) and LDEL_V() C C C NOTE: C 1) if no INDEL(s) in secondary structure allowed: C GAPOPEN contains PUNISH C 2) internal deletions are (postion dependent ) weighted as: C GAPOPEN + GAPELONG *LENGTH C 3) conservation weights: C gap penalties are dependent on sequence-position(s), so weight C gap-penalties with conservation-weights otherwise the gap penalties C in regions with low conservation are too big C 4) antidiagonal matrix setting: C position in sequence 2 is JJBEG+IIBEG-II: step back in sequence 1 and C down in sequence 2 C C 5) NOT LONGER VALID C if the MAXDEL option is set, one has to check if the number of C INDEL's exceeds the MAXDEL value. C In addition: when the value for opening a gap is higher than C for the elongation, we have to check if the previous length of C the gap is not greater than 0. C That means that for some special cases it's cheaper to punish C the alignment by some open-penalties in a row than to elongate C or continue the alignment in the diagonal. C open a gap if: C 1.) OPEN .gt. ELONG or C 2.) LDELx()+1 .ge. MAXDEL C 3.) but only if LDELx() .eq. 0 C====================================================================== IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C import C ACTUAL SEQ LENGTH INTEGER N1BEG,N1END,N2BEG,N2END,N2 c export c real lh1(0:n1+1,0:n2+1) INTEGER*2 LH2(0:N1+1,0:N2+1) c real lh(0:n1+1,0:n2+1) REAL BESTVAL INTEGER BESTIIPOS,BESTJJPOS c internal INTEGER NSIZE1,NSIZE2 REAL SUM,XMAX1,XMAX2 REAL BESTNOW INTEGER I,J,K,NDAMP,NDIAGONAL,ISMALL_DIM,IBIG_DIM,IIBEG,JJBEG INTEGER LEN_DIAG,IDIAG,II,JJ C======================================================================= C DO SOME STUFF OUTSIDE THE LOOPS: C======================================================================= C initialize C======================================================================= BESTVAL=-99999.0 BESTNOW=-99999.0 BESTIIPOS=-1 BESTJJPOS=-1 NSIZE1=N1END-N1BEG+1 NSIZE2=N2END-N2BEG+1 J=MIN(N1BEG-1,N2BEG-1) K=MAX(N1END+1,N2END+1) DO I=J,K c do i=0,MAXSQ+1 MAX_H(I)=0.0 MAX_V(I)=0.0 LDEL_H(I)=0 LDEL_V(I)=0 LEFT_LH(I)=0.0 UP_LH(I)=0.0 DIAG_LH(I)=0.0 ENDDO C======================================================================= C update the metric values (weights) C this can be done outside the main parallel loop C with this we save at lot of multiplications in the parallel loop C the update can be done in concurrent/vectorized mode C======================================================================= NDAMP=1 IF (PROFILEMODE .EQ. 6) THEN IF (LCONSERV_1) THEN DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1(I) * CONSWEIGHT_1(I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) * CONSWEIGHT_1(I) ENDDO C DAMP PENALTIES CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ELSE DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1 (I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) ENDDO CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ENDIF IF (LCONSERV_2) THEN DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) * CONSWEIGHT_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) * CONSWEIGHT_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ELSE DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ENDIF C============================= ELSE IF (PROFILEMODE .NE. 2) THEN IF (LCONSERV_1) THEN DO K=1,NTRANS DO I=N1BEG,N1END METRIC_1(I,K) = SIMMETRIC_1(I,K) * CONSWEIGHT_1(I) ENDDO ENDDO DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1(I) * CONSWEIGHT_1(I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) * CONSWEIGHT_1(I) ENDDO C DAMP PENALTIES CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ELSE DO K=1,NTRANS DO I=N1BEG,N1END METRIC_1(I,K) = SIMMETRIC_1(I,K) ENDDO ENDDO DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1 (I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) ENDDO CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ENDIF ENDIF IF (PROFILEMODE .GE. 2) THEN IF (LCONSERV_2) THEN DO K=1,NTRANS DO I=N2BEG,N2END METRIC_2(I,K) = SIMMETRIC_2(I,K) * CONSWEIGHT_2(I) ENDDO ENDDO DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) * CONSWEIGHT_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) * CONSWEIGHT_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ELSE DO K=1,NTRANS DO I=N2BEG,N2END METRIC_2(I,K) = SIMMETRIC_2(I,K) ENDDO ENDDO DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ENDIF ENDIF IF (PROFILEMODE .EQ. 5) THEN DO I=N1BEG,N1END MAX_METRIC_1_VAL(I)=-10000.0 DO K=1,NTRANS MAX_METRIC_1_VAL(I)= + MAX(METRIC_1(I,K),MAX_METRIC_1_VAL(I)) ENDDO ENDDO DO J=N2BEG,N2END MAX_METRIC_2_VAL(J)=-10000.0 DO K=1,NTRANS MAX_METRIC_2_VAL(J)= + MAX(METRIC_2(J,K),MAX_METRIC_2_VAL(J)) ENDDO ENDDO ENDIF C IF ( PROFILEMODE .EQ. 3 ) THEN DO I=N1BEG,N1END SUM=0.0 DO K=1,NTRANS SUM= SUM + ( METRIC_1(I,K) * METRIC_1(I,K) ) ENDDO SUM= SQRT(SUM) DO K=1,NTRANS METRIC_1(I,K)= METRIC_1(I,K) / SUM ENDDO ENDDO DO I=N2BEG,N2END SUM=0.0 DO K=1,NTRANS SUM= SUM + ( METRIC_2(I,K) * METRIC_2(I,K) ) ENDDO SUM= SQRT(SUM) DO K=1,NTRANS METRIC_2(I,K)= METRIC_2(I,K) / SUM ENDDO ENDDO ENDIF c====================================================================== NDIAGONAL=NSIZE1+NSIZE2-1 c ndiagonal=iprofend-iprofbeg+1+n2-1 c WRITE(6,'(A,I6)')' NUMBER OF ANTIDIAGONALS: ',NDIAGONAL ISMALL_DIM=MIN(NSIZE1,NSIZE2) IBIG_DIM=MAX(NSIZE1,NSIZE2) IIBEG=N1BEG JJBEG=N2BEG+1 LEN_DIAG=0 C===================================================================== C profile 1 (no profiles or profile for first sequence) C-------------------------------------------------------------------- IF (PROFILEMODE .LE. 1) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG C==================================================================== C THIS LOOP CAN BE EXECUTED IN VECTOR-MODE C====================================================================== C compiler directives for vector C---------------------------------------------------------------------- DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 C====================================================================== C values for diagonal, horizontal and vertical (open and elongation) C===================================================================== C store best value and length for horizontal deletion C===================================================================== MAX_H(JJ-II) = (MAX_H(JJ-II) - ELONG_GAP_1(II-1)) IF ((MAX_H(JJ-II) .GE.(LEFT_LH(JJ-II)-OPEN_GAP_1(II-1))) + .AND. (MAX_H(JJ-II) .GT.0.0 )) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF (((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)) .GE. + MAX_H(JJ-II)) + .AND. ((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)) .GT. + 0.0)) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II)= (LEFT_LH(JJ-II) - OPEN_GAP_1(II-1)) ELSE MAX_H(JJ-II)= 0.0 LDEL_H(JJ-II)= 0 ENDIF c===================================================================== c store best value and length for vertical deletion c===================================================================== MAX_V(II) = (MAX_V(II) - ELONG_GAP_1(II-1)) IF ( (MAX_V(II).GE.(UP_LH(II) - OPEN_GAP_1(II-1))) .AND. + ( MAX_V(II) .GT. 0.0) ) THEN LDEL_V(II)= LDEL_V(II) + 1 ELSE IF ( ((UP_LH(II)-OPEN_GAP_1(II-1)) .GE. MAX_V(II)) + .AND. ((UP_LH(II) - OPEN_GAP_1(II-1)) .GT. + 0.0)) THEN MAX_V(II)= (UP_LH(II) - OPEN_GAP_1(II-1)) LDEL_V(II)=1 ELSE MAX_V(II)= 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II)= DIAG_LH(JJ-II)+METRIC_1(II-1,LSQ_2(JJ-II-1)) IF ( (MAX_D(II) .GE. MAX_V(II) ) .AND. + (MAX_D(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_D(II) .GT. 0.0 )) THEN BESTNOW= MAX_D(II) ELSE IF ( (MAX_V(II) .GE. MAX_D(II) ) .AND. + (MAX_V(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_V(II) .GT. 0.0 )) THEN BESTNOW= MAX_V(II) LH2(II,JJ-II)= 10000 + LDEL_V(II) ELSE IF ( (MAX_H(JJ-II) .GE. MAX_D(II)) .AND. + (MAX_H(JJ-II) .GE. MAX_V(II)) .AND. + (MAX_H(JJ-II) .GT. 0.0 )) THEN BESTNOW= MAX_H(JJ-II) LH2(II,JJ-II)= 20000 + LDEL_H(JJ-II) ELSE BESTNOW= 0.0 LH2(II,JJ-II)= 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN BESTNOW=0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= BESTNOW UP_LH(II)= BESTNOW IF (BESTNOW .GT. BESTVAL) THEN BESTVAL=BESTNOW BESTIIPOS=II BESTJJPOS=JJ-II ENDIF C END DIAGONAL ENDDO C IF (LSAMESEQ) THEN C X= ( FLOAT(IIBEG)/ 2.0) + (FLOAT(JJBEG)/2.0) C I=NINT(X) C LH1(I,I) = 0.0 ; LH2(I,I)= 0 C LEFT_LH(I)= 0.0 c up_lh(i) = 0.0 c WRITE(6,*)iibeg,jjbeg,i c endif C==================================================================== C next antidiagonal C==================================================================== ENDDO C-------------------------------------------------------------------- C profile 2 (profile for sequence 2) C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 2) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II) = (MAX_H(JJ-II) - ELONG_GAP_2(JJ-II-1)) IF (MAX_H(JJ-II) .GT. + (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)) .AND. + MAX_H(JJ-II) .GT.0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ( (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)) .GE. + MAX_H(JJ-II) .AND. + (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)) .GT. + 0.0) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF c===================================================================== c store best value and length for vertical deletion c===================================================================== MAX_V(II) = (MAX_V(II) - ELONG_GAP_2(JJ-II-1)) IF ( MAX_V(II).GT.(UP_LH(II) - OPEN_GAP_2(JJ-II-1)) .AND. + MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ((UP_LH(II) - OPEN_GAP_2(JJ-II-1)).GE. MAX_V(II) + .AND. (UP_LH(II) - OPEN_GAP_2(JJ-II-1)) .GT. + 0.0) THEN MAX_V(II)= (UP_LH(II) - OPEN_GAP_2(JJ-II-1)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF c====================================================================== c which value is the best (diagonal,horizontal or vertical) c====================================================================== MAX_D(II)= DIAG_LH(JJ-II)+METRIC_2(JJ-II-1,LSQ_1(II-1)) IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN BESTNOW = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN BESTNOW = MAX_V(II) LH2(II,JJ-II) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN BESTNOW = MAX_H(JJ-II) LH2(II,JJ-II) = 20000 + LDEL_H(JJ-II) ELSE BESTNOW = 0.0 LH2(II,JJ-II) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN BESTNOW= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= BESTNOW UP_LH(II) = BESTNOW IF (BESTNOW .GT. BESTVAL) THEN BESTVAL=BESTNOW BESTIIPOS=II BESTJJPOS=JJ-II ENDIF ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO c-------------------------------------------------------------------- C full profile alignment C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 3) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 SUM=0.0 XMAX1=0.0 XMAX2=0.0 DO K=1,NTRANS SUM = SUM + ( METRIC_1(II-1,K) * METRIC_2(JJ-II-1,K) ) C IF ( METRIC_1(II-1,K) .GT. XMAX1 ) THEN C XMAX1 = METRIC_1(II-1,K) C ENDIF C IF ( METRIC_2(JJ-II-1,K) .GT. XMAX2 ) THEN C XMAX2 = METRIC_1(JJ-II-1,K) C ENDIF ENDDO C OPEN_GAP_1(II-1) = GAPOPEN_1(II-1) * XMAX1 C ELONG_GAP_1(II-1) = GAPELONG_1(II-1) * XMAX1 C OPEN_GAP_2(JJ-II-1) = GAPOPEN_2(JJ-II-1) * XMAX2 C ELONG_GAP_2(JJ-II-1) = GAPELONG_2(JJ-II-1) * XMAX2 c WRITE(6,*)ii,jj-ii,sum c MAX_D(II) = DIAG_LH(JJ-II) + (SUM/NTRANS) MAX_H(JJ-II)= MAX_H(JJ-II)- + (( ELONG_GAP_1(II-1)+ ELONG_GAP_2(JJ-II-1))* 0.5) IF (MAX_H(JJ-II) .GT. (LEFT_LH(JJ-II)- + (( OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .AND. MAX_H(JJ-II) .GT.0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ( (LEFT_LH(JJ-II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GE. MAX_H(JJ-II) .AND. (LEFT_LH(JJ-II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GT.0.0) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF C===================================================================== C store best value and length for vertical deletion C===================================================================== MAX_V(II)= MAX_V(II)- + (( ELONG_GAP_1(II-1)+ ELONG_GAP_2(JJ-II-1))* 0.5) IF ( MAX_V(II) .GT. (UP_LH(II)- + (( OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) .AND. + MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ( (UP_LH(II)- (( OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)) .GE. MAX_V(II) + .AND. (UP_LH(II)- ((OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)).GT.0.0) THEN MAX_V(II)= (UP_LH(II)- + (( OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II) = DIAG_LH(JJ-II) + SUM IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN BESTNOW = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN BESTNOW = MAX_V(II) LH2(II,JJ-II) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN BESTNOW = MAX_H(JJ-II) LH2(II,JJ-II) = 20000 + LDEL_H(JJ-II) ELSE BESTNOW = 0.0 LH2(II,JJ-II ) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN BESTNOW= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= BESTNOW UP_LH(II) = BESTNOW IF (BESTNOW .GT. BESTVAL) THEN BESTVAL=BESTNOW BESTIIPOS=II BESTJJPOS=JJ-II ENDIF ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C-------------------------------------------------------------------- C take sequences as representatives of family C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 4) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II)= MAX_H(JJ-II)- + ( (ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1)) *0.5) IF (MAX_H(JJ-II) .GT. (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .AND. MAX_H(JJ-II) .GT.0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ( (LEFT_LH(JJ-II) - ((OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)) .GE. MAX_H(JJ-II) + .AND. (LEFT_LH(JJ-II) - ((OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)) .GT. 0.0 ) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF C===================================================================== C STORE BEST VALUE AND LENGTH FOR VERTICAL DELETION C===================================================================== MAX_V(II)= (MAX_V(II)- + ( (ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1)) *0.5)) IF ( MAX_V(II) .GT. (UP_LH(II)- + ((OPEN_GAP_1(II-1) +OPEN_GAP_2(JJ-II-1)) *0.5)) + .AND. MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ( (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GE. MAX_V(II) + .AND. (UP_LH(II)- ((OPEN_GAP_1(II-1) + + OPEN_GAP_2(JJ-II-1)) *0.5)) .GT. 0.0 ) THEN MAX_V(II)=(UP_LH(II)- + ((OPEN_GAP_1(II-1) +OPEN_GAP_2(JJ-II-1)) *0.5)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II)= DIAG_LH(JJ-II) + + (( METRIC_1 (II-1,LSQ_2(JJ-II-1)) + + METRIC_2 (JJ-II-1,LSQ_1(II-1)) ) * 0.5) IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN BESTNOW = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN BESTNOW = MAX_V(II) LH2(II,JJ-II ) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN BESTNOW = MAX_H(JJ-II) LH2(II,JJ-II ) = 20000 + LDEL_H(JJ-II) ELSE BESTNOW = 0.0 LH2(II,JJ-II ) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN BESTNOW= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= BESTNOW UP_LH(II) = BESTNOW IF (BESTNOW .GT. BESTVAL) THEN BESTVAL=BESTNOW BESTIIPOS=II BESTJJPOS=JJ-II ENDIF ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C-------------------------------------------------------------------- C take maximal value as consensus C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 5) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II)=MAX_H(JJ-II) - + ((ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1))*0.5) IF (MAX_H(JJ-II) .GT. (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .AND. MAX_H(JJ-II) .GT. 0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ((LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GE.MAX_H(JJ-II) .AND. (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GT. 0.0) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF C===================================================================== C store best value and length for vertical deletion C===================================================================== MAX_V(II)= MAX_V(II) - + ( (ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1))*0.5) IF ( MAX_V(II) .GT. (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) .AND. + MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ((UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GE.MAX_V(II) .AND. (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GT. 0.0) THEN MAX_V(II)= (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II) = DIAG_LH(JJ-II) + + ((MAX_METRIC_1_VAL(II-1) + + MAX_METRIC_2_VAL(II-1)) * 0.5) IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN BESTNOW = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN BESTNOW = MAX_V(II) LH2(II,JJ-II ) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN BESTNOW = MAX_H(JJ-II) LH2(II,JJ-II ) = 20000 + LDEL_H(JJ-II) ELSE BESTNOW = 0.0 LH2(II,JJ-II ) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN BESTNOW= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= BESTNOW UP_LH(II) = BESTNOW IF (BESTNOW .GT. BESTVAL) THEN BESTVAL=BESTNOW BESTIIPOS=II BESTJJPOS=JJ-II ENDIF ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C==================================================================== ELSE IF (PROFILEMODE .EQ. 6) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 C===================================================================== MAX_H(JJ-II) = (MAX_H(JJ-II) - ELONG_GAP_1(II-1)) IF ((MAX_H(JJ-II).GE.(LEFT_LH(JJ-II)-OPEN_GAP_1(II-1))) + .AND. (MAX_H(JJ-II) .GT.0.0 )) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF (((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)) .GE. + MAX_H(JJ-II)) + .AND. ((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)) + .GT. 0.0)) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II)= (LEFT_LH(JJ-II) - OPEN_GAP_1(II-1)) ELSE MAX_H(JJ-II)= 0.0 LDEL_H(JJ-II)= 0 ENDIF c===================================================================== MAX_V(II) = (MAX_V(II) - ELONG_GAP_1(II-1)) IF ( (MAX_V(II).GE.(UP_LH(II)-OPEN_GAP_1(II-1))) .AND. + ( MAX_V(II) .GT. 0.0) ) THEN LDEL_V(II)= LDEL_V(II) + 1 ELSE IF ( ((UP_LH(II)-OPEN_GAP_1(II-1)).GE.MAX_V(II)) + .AND. ((UP_LH(II) - OPEN_GAP_1(II-1)) + .GT. 0.0)) THEN MAX_V(II)= (UP_LH(II) - OPEN_GAP_1(II-1)) LDEL_V(II)=1 ELSE MAX_V(II)= 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== MAX_D(II)= DIAG_LH(JJ-II) + + SIMORG(LSQ_1(II-1),LSQ_2(JJ-II-1),LSTRCLASS_1(II-1), + LACC_1(II-1),LSTRCLASS_2(JJ-II-1), + LACC_2(JJ-II-1) ) IF ( (MAX_D(II) .GE. MAX_V(II) ) .AND. + (MAX_D(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_D(II) .GT. 0.0 )) THEN BESTNOW= MAX_D(II) ELSE IF ( (MAX_V(II) .GE. MAX_D(II) ) .AND. + (MAX_V(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_V(II) .GT. 0.0 )) THEN BESTNOW= MAX_V(II) LH2(II,JJ-II)= 10000 + LDEL_V(II) ELSE IF ( (MAX_H(JJ-II) .GE. MAX_D(II)) .AND. + (MAX_H(JJ-II) .GE. MAX_V(II)) .AND. + (MAX_H(JJ-II) .GT. 0.0 )) THEN BESTNOW= MAX_H(JJ-II) LH2(II,JJ-II)= 20000 + LDEL_H(JJ-II) ELSE BESTNOW= 0.0 LH2(II,JJ-II)= 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN BESTNOW= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= BESTNOW UP_LH(II)= BESTNOW IF (BESTNOW .GT. BESTVAL) THEN BESTVAL=BESTNOW BESTIIPOS=II BESTJJPOS=JJ-II ENDIF ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C==================================================================== C PROFILE MODE SELECTION END ENDIF C==================================================================== C debug: output the LH (values and trace-back)matrix c call open_file(99,'matrix.dat','new,recl=2000',lerror) c nii=n1+1 ; njj=n2+1 c write(99,*)'TRACE-BACK MATRIX' c do i=1,nii c write(99,'(i6)')i ; write(99,'(2x,20(i6))')(lh2(i,j),j=1,njj) c enddo c close(99) C======================================================================= RETURN END C END SETMATRIX_FAST C...................................................................... C...................................................................... C SUB SETPIECES SUBROUTINE SETPIECES(MAXALSQ,ALI_1,ALI_2,LENALI,IFIR, + ILAS,JFIR,JLAS,IPIECE,MAXPIECES,NPIECES) C RS 89 C cut a sequence alignment in pieces if there are insertions/deletions C or chain-breaks (used in ALITOSTRUCRMS) C CAUTION : dont use an alignment like in MAXHOM (HSSP) C (no insertion in SEQ 1) C alignment must be : C ALI_1 : ACVEFG....FGHKLIPYDFGAS!KLHKLH C ALI_2 : ACAEFGAAAAFGH...PYDFGAS!KLHKLH C | | | | | | | | C PIECE : | 1 | |2| | 3 | |4 | C insertions must be marked by "." C chain-breaks by "!" C INPUT: C ALI_1,ALI_2 : sequence string for seq 1 and seq 2 (CHARACTER*(*)) C LENALI : length of the total alignment (include insertions) C IFIR,ILAS : first and last position of seq 1 (absolut position) C JFIR,JLAS : first and last position of seq 2 (absolut position) C OUTPUT: C IPIECE(2,2,MAXPIECES) : 1. index= begin and end of piece C 2. index= sequence 1 or sequence 2 C 3. index= number of piece C NPIECES : total number of pieces C INTERNAL: C ICOUNT : count alignend positions to get absolute position IMPLICIT NONE INTEGER MAXPIECES,MAXALSQ CHARACTER*1 ALI_1(MAXALSQ),ALI_2(MAXALSQ) INTEGER IPIECE(2,2,MAXPIECES),NPIECES, + LENALI,IFIR,ILAS,JFIR,JLAS C INTERNAL INTEGER IBEG,IEND,JBEG,JEND,ICOUNT,ISTART,ISTOP,K,I c init IBEG=IFIR IEND=ILAS JBEG=JFIR JEND=JLAS NPIECES=1 ICOUNT=0 C D WRITE(6,*)(ALI_1(I),I=1,LENALI) C D WRITE(6,*)(ALI_2(I),I=1,LENALI) C AUTION: FIRST AND LAST CHARACTER IN THE ALIGNMENT IS '<' IF (ALI_1(1).EQ.'<' .AND. ALI_2(1).EQ.'<') THEN ISTART=2 ELSE ISTART=1 ENDIF IF (ALI_1(LENALI).EQ.'<' .AND. ALI_2(LENALI).EQ.'<') THEN ISTOP=LENALI-1 ELSE ISTOP=LENALI ENDIF IF (ALI_1(ISTART).EQ. '!' .AND. ALI_2(ISTART) .EQ. '!') THEN ISTART=ISTART+1 IBEG=IBEG+1 JBEG=JBEG+1 ENDIF IF (ALI_1(ISTOP).EQ. '!' .AND. ALI_2(ISTOP) .EQ. '!') THEN ISTOP=ISTOP-1 IEND=ILAS-1 JEND=JEND-1 ILAS=IEND JLAS=JEND ENDIF C SET DEFAULT TO BEGIN AND END OF ALIGNMENT IPIECE(1,1,NPIECES)=IBEG IPIECE(2,1,NPIECES)=IEND IPIECE(1,2,NPIECES)=JBEG IPIECE(2,2,NPIECES)=JEND DO K=ISTART,ISTOP CALL CHECKRANGE(NPIECES,1,MAXPIECES,'MAXPIECES ','SETPIECES ') C search for an insertion in SEQuence 1 IF (ALI_1(K).EQ.'.' .AND. ALI_2(K).NE. '!') THEN C if: set end of previous piece, store piece in IPIECE and set begin C of next piece IF (ALI_1(K-1).NE.'.' .AND. ALI_1(K-1).NE.'!') THEN IEND=IBEG+ICOUNT-1 JEND=JBEG+ICOUNT-1 IPIECE(1,1,NPIECES)=IBEG IPIECE(2,1,NPIECES)=IEND IPIECE(1,2,NPIECES)=JBEG IPIECE(2,2,NPIECES)=JEND C D WRITE(6,*)' 1 SET PIECE : ',IBEG,IEND,JBEG,JEND C D WRITE(6,*)(ALI_1(I),I=1,K) WRITE(6,*)(ALI_2(I),I=1,K) IBEG=IEND+1 JBEG=JEND+2 NPIECES=NPIECES+1 ELSE JBEG=JBEG+1 ENDIF ICOUNT=0 C search for an insertion in SEQuence 2 ELSE IF (ALI_2(K).EQ.'.' .AND. ALI_1(K).NE.'!') THEN C if: set end of previous piece, store piece in IPIECE and set begin C of next piece IF (ALI_2(K-1).NE.'.' .AND. ALI_2(K-1).NE.'!') THEN IEND=IBEG+ICOUNT-1 JEND=JBEG+ICOUNT-1 IPIECE(1,1,NPIECES)=IBEG IPIECE(2,1,NPIECES)=IEND IPIECE(1,2,NPIECES)=JBEG IPIECE(2,2,NPIECES)=JEND C D WRITE(6,*)' 2 SET PIECE : ',IBEG,IEND,JBEG,JEND C D WRITE(6,*)(ALI_1(I),I=1,K) ; WRITE(6,*)(ALI_2(I),I=1,K) IBEG=IEND+2 JBEG=JEND+1 NPIECES=NPIECES+1 ELSE IBEG=IBEG+1 ENDIF ICOUNT=0 C search for a chain-break in SEQuence 1 or SEQuence 2 and set piece ELSE IF (ALI_1(K).EQ.'!' .OR. ALI_2(K).EQ.'!') THEN IF (ALI_1(K-1).NE.'.' .AND. ALI_2(K-1).NE.'.') THEN IEND=IBEG+ICOUNT-1 JEND=JBEG+ICOUNT-1 IPIECE(1,1,NPIECES)=IBEG IPIECE(2,1,NPIECES)=IEND IPIECE(1,2,NPIECES)=JBEG IPIECE(2,2,NPIECES)=JEND C D WRITE(6,*)' 3 SET PIECE : ',IBEG,IEND,JBEG,JEND C D WRITE(6,*)(ALI_1(I),I=1,K) ; WRITE(6,*)(ALI_2(I),I=1,K) NPIECES=NPIECES+1 IBEG=IEND+1 JBEG=JEND+1 ENDIF IF (ALI_1(K).EQ.'!' .AND. ALI_2(K).EQ.'!') THEN IBEG=IBEG+1 JBEG=JBEG+1 ELSE IF (ALI_1(K).EQ.'!') THEN IBEG=IBEG+1 ELSE IF (ALI_2(K).EQ.'!') THEN JBEG=JBEG+1 ENDIF ICOUNT=0 ELSE ICOUNT=ICOUNT+1 ENDIF ENDDO IPIECE(1,1,NPIECES)=IBEG IPIECE(2,1,NPIECES)=ILAS IPIECE(1,2,NPIECES)=JBEG IPIECE(2,2,NPIECES)=JLAS cd WRITE(6,*)' 4 set piece : ',ibeg,ilas,jbeg,jlas RETURN END C END SETPIECES C...................................................................... C...................................................................... C SUB SKIP_BRKCHAIN SUBROUTINE SKIP_BRKCHAIN(KIN,RLEN,FIRSTLINE,ERROR) C 15.5.93 IMPLICIT NONE C Import INTEGER KIN,RLEN CHARACTER*(*) FIRSTLINE C EXPORT LOGICAL ERROR C .. AND NEW LOCATION OF READ POINTER FOR UNIT KIN C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. LINE = FIRSTLINE DO WHILE ( LINE(1:3) .NE. 'TER' ) READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') ' ** error reading BRK file **' 2 CONTINUE RETURN END C END SKIP_BRKCHAIN C...................................................................... C...................................................................... C SUB SKIP_DSSPCHAIN SUBROUTINE SKIP_DSSPCHAIN(KIN,RLEN,FIRSTLINE,ERROR) C 15.5.93 IMPLICIT NONE C Import INTEGER KIN,RLEN CHARACTER*(*) FIRSTLINE C EXPORT LOGICAL ERROR C .. AND NEW LOCATION OF READ POINTER FOR UNIT KIN C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. LINE = FIRSTLINE DO WHILE ( LINE(14:14) .NE. '!' ) READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(A)') ' ** error reading DSSP file **' 2 CONTINUE RETURN END C END SKIP_DSSPCHAIN C...................................................................... C...................................................................... C SUB SKIP_HSSPCHAIN SUBROUTINE SKIP_HSSPCHAIN(KIN,RLEN,FIRSTLINE,ERROR) C 18.5.93 IMPLICIT NONE C Import INTEGER KIN,RLEN CHARACTER*(*) FIRSTLINE C EXPORT LOGICAL ERROR C .. AND NEW LOCATION OF READ POINTER FOR UNIT KIN C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF LINE = FIRSTLINE DO WHILE ( LINE(15:15) .NE. '!' ) READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') '*** ERROR SKIP_HSSPCHAIN reading HSSP file' 2 CONTINUE RETURN END C END SKIP_HSSPCHAIN C...................................................................... C...................................................................... C SUB STR_TO_INT SUBROUTINE STR_TO_INT(NRES,STRUC,LSTRUC,STRSTATES) IMPLICIT NONE INTEGER NRES,LSTRUC(*) CHARACTER*(*) STRUC(*),STRSTATES c internal INTEGER I c======================================================================= DO I=1,NRES LSTRUC(I)=INDEX(STRSTATES,STRUC(I)) IF (LSTRUC(I) .EQ. 0) THEN WRITE(6,*)' unknown structure in str_to_int:',struc(i),':' WRITE(6,*)STRSTATES CALL FLUSH_UNIT(6) STOP ENDIF ENDDO RETURN END C END STR_TO_INT C...................................................................... C...................................................................... C SUB STR_TO_CLASS SUBROUTINE STR_TO_CLASS(MAXSTATES,STR_STATES,NRES, + STRUC,CLASS,ICLASS) C convert DSSP-secondary structure symbol to secondary structure C classes (L,H,E..) ; first symbol in str_states(x) C str_states(1)='L TCStclss' C str_states(2)='EBAPMebapm' C str_states(3)='HGIhgiiiii' C given STRUC, what is the class number ICLASS and class representative CLASS ? C undefined states is set CLASS='U', ICLASS=0 C IMPLICIT NONE C input INTEGER MAXSTATES,NRES CHARACTER*(*) STRUC(*),STR_STATES(*) c output CHARACTER*(*) CLASS INTEGER ICLASS(*) C internal INTEGER I,J C====================================================================== DO I=1,NRES DO J=1,MAXSTATES IF ( INDEX(STR_STATES(J),STRUC(I)) .NE. 0) THEN GOTO 100 ENDIF ENDDO c iclass(i)=0 c class(i:i)='U' c WRITE(6,*)' symbol not known in STR_TO_CLASS: ',struc(i) 100 ICLASS(I)=J CLASS(I:I)=STR_STATES(J)(1:1) c WRITE(6,*)i,j,iclass(i),str_states(j)(1:1) ENDDO RETURN END C END STR_TO_CLASS C...................................................................... C...................................................................... C SUB StringLen SUBROUTINE STRINGLEN(CSTRING,ILEN) C searches for the last non-blank character in a string CHARACTER*(*) CSTRING INTEGER ILEN ILEN=LEN(CSTRING) DO WHILE(ILEN.GT.0 .AND. CSTRING(ILEN:ILEN).EQ. ' ') ILEN=ILEN-1 ENDDO RETURN END C END STRINGLEN C...................................................................... C...................................................................... C SUB STRTRIM SUBROUTINE STRTRIM(SOURCE,DEST,LENGTH) C StrTrim(Source,Dest,Length): Dest=Source(-1:-1)//filled with blanks C Length=length of Source(-1:-1) C ------------------------------------------------------------------- CHARACTER*(*) SOURCE,DEST CHARACTER*500 TEMPSTRING TEMPSTRING=' ' LENGTH=0 ISTART=1 ISTOP= LEN(SOURCE) IF (ISTOP .GT. LEN(TEMPSTRING) ) THEN WRITE(6,*)' STRTRIM: tempstring too short' STOP ENDIF I=1 DO WHILE (SOURCE(I:I) .EQ. ' ') I=I+1 IF (I .GT. LEN(SOURCE)) RETURN ENDDO ISTART=I I=LEN(SOURCE) DO WHILE (SOURCE(I:I) .EQ. ' ') I=I-1 IF (I .LE. 1)GOTO 10 ENDDO 10 ISTOP=I LENGTH=ISTOP-ISTART+1 TEMPSTRING(1:)= SOURCE(ISTART:ISTOP) DEST(1:LENGTH)=TEMPSTRING(1:LENGTH) DO I=LENGTH+1,LEN(DEST) DEST(I:I)=' ' ENDDO RETURN END C END STRTRIM C...................................................................... C...................................................................... C SUB STRPOS SUBROUTINE STRPOS(SOURCE,ISTART,ISTOP) C StrPos(Source,IStart,IStop): Finds the positions of the first and C last non-blank/non-TAB in Source. IStart=IStop=0 for empty Source CHARACTER*(*) SOURCE INTEGER ISTART,ISTOP ISTART=0 ISTOP=0 DO J=1,LEN(SOURCE) IF (SOURCE(J:J).NE.' ') THEN ISTART=J GOTO 20 ENDIF ENDDO RETURN 20 DO J=LEN(SOURCE),1,-1 IF (SOURCE(J:J).NE.' ') THEN ISTOP=J RETURN ENDIF ENDDO ISTART=0 ISTOP=0 RETURN END C END STRPOS C...................................................................... C...................................................................... C SUB STRUC_CLASS SUBROUTINE STRUC_CLASS(MAXSTRSTATES,STR_CLASSES, + STRUC,CLASS,ICLASS) C given struc, what is the class number ICLASS and class C representative CLASS ? C undefined states is set CLASS='U', ICLASS=0 INTEGER MAXSTRSTATES,ICLASS C---- br 99.03: watch hard_coded here, see maxhom.param CHARACTER*10 STR_CLASSES(MAXSTRSTATES) C---- --> REASON: the following produces warnings on SGI C CHARACTER*(*) STR_CLASSES(MAXSTRSTATES) CHARACTER STRUC,CLASS C INTERNAL INTEGER I CLASS='U' ICLASS=0 DO I=1,MAXSTRSTATES IF (INDEX(STR_CLASSES(I),STRUC) .NE. 0 ) THEN ICLASS=I CLASS=STR_CLASSES(I)(1:1) RETURN ENDIF ENDDO C WRITE(6,*)' SYMBOL NOT KNOWN IN STRUC_CLASS: ',STRUC RETURN END C END STRUC_CLASS C...................................................................... C...................................................................... C SUB STRTOINT SUBROUTINE STRTOINT(NRES,CSTR,LSTR,LDSSP) IMPLICIT NONE INTEGER NRES,LSTR(*) CHARACTER CSTR(*) LOGICAL LDSSP C internal INTEGER I CHARACTER*25 STRSTATES STRSTATES=' LTCSltcsEBAPMebapmHGIhgi' C======================================================================= IF (LDSSP) THEN DO I=1,NRES LSTR(I)=INDEX(STRSTATES,CSTR(I)) IF (LSTR(I).EQ.0) THEN WRITE(6,*)' UNKNOWN STRUCTURE IN STRTOINT: ',CSTR(I) STOP ENDIF ENDDO ELSE DO I=1,NRES LSTR(I)=0 CSTR(I)='U' ENDDO ENDIF RETURN END C END STRTOINT C...................................................................... C...................................................................... C SUB SWISSPROTRELEASE SUBROUTINE SWISSPROTRELEASE(KIN,INFILE,RELEASE,NENTRIES, + NRESIDUE) C IMPORT CHARACTER*(*) INFILE INTEGER KIN C EXPORT REAL RELEASE INTEGER NENTRIES,NRESIDUE c internal CHARACTER*200 LINE LOGICAL LERROR C...................................................................... C reads the latest version number and number of sequences of SWISSPROT C on VAX at EMBL logical filename is SWISS_PROT$RELEASE:RELNOTES.DOC C this file contains somwhere the following lines: C Release Date Number of entries Nb of amino acids C C 3.0 11/86 4160 969 641 C 4.0 04/87 4387 1 036 010 C 12.0 10/89 12305 3 797 482 C CSome 1466 new sequences have been added since the ............ C...................................................................... CALL OPEN_FILE(KIN,INFILE,'OLD,READONLY',LERROR) IF (LERROR .EQV. .TRUE.) THEN RELEASE= 0.0 NENTRIES=0 NRESIDUE=0 WRITE(6,*)'Error: No SwissProt release info found ' RETURN ENDIF LINE=' ' DO WHILE( INDEX (LINE,'Release Date Number of').EQ.0 .AND. + INDEX (LINE,'Release Date ').EQ.0) READ(KIN,'(A)')LINE ENDDO DO I=1,3 READ(KIN,'(A)')LINE ENDDO DO WHILE (LINE .NE. ' ') READ(LINE,'(3X,F4.1,25X,I6,10X,I12)')RELEASE,NENTRIES,NRESIDUE READ(KIN,'(A)')LINE ENDDO CLOSE(KIN) RETURN END C END SWISSPROTRELEASE C...................................................................... C...................................................................... C SUB TRACE SUBROUTINE TRACE(ISET,ND1,ND2,LH2,IPOSBEG,JPOSBEG,VALUE, + II,JJ,NTEST,SDEV,IALIGN,NRECORD) C NOTE: TRACE will aplpy threshold, and return LCONSIDER=.FALSE. C if below threshold! C C coming in with protein IALIGN+1, i.e. IALIGN gives alignments C so far! C=================================================================== C LDIREC and LDEL are the traceback indices unpacked C from the LH matrix. C LDIREC=1 indicates an unmatched terminal sequence, C LDIREC=2 indicates a diagonal optimal path, C LDIREC=3 indicates a vertical path in the matrix, C LDIREC=4 indicates a horizonal traceback path. C LDEL is the length of the deletion/insertion for LDIREC=3 or 4 C CVAL accumulated similarity values C CMAXVAL accumulated self matches ==> similarity C------------------------------------------------------------------- C PROFILEMODE C 0: no profiles, just a simple sequence alignment C 1: profile for sequence 1 (and not for sequence 2) C 2: profile for sequence 2 (and not for sequence 1) C 3: full alignment of two profiles, without taking into account the C sequence (structure,I/O...) information C 4: take the sequences as a representative of the family C 5: take the maximal value at each position as a "consensus" sequence C------------------------------------------------------------------- C weighted gap: C here opening and elongation are weighted C LDIREC=4 : horizontal deletion C LDIREC=3 : vertical deletion C C =======================================> II (matrix position) C | \ C | \ \ C | \ \ C | \ ^ open: II C | \ | elongation: JJ-LDEL+1 ==> JJ-1 C V \ LDEL | C JJ \<----------------------- II,JJ C \ C open: II-LDEL \ C elongation: II-LDEL+1 ====> II-1 \ C \ C C horizontal gap-open: C GAPOPEN(II-LDEL) * cons-weight (II-LDEL) C horizontal gap-elong: C sum (GAPELONG(II-LDEL+1) * cons-weight (II-LDEL+1) ===> (II-1) ) C------------------------------------------------------------------- IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C import INTEGER ISET,ND1,ND2 C REAL LH1(0:*) INTEGER*2 LH2(0:ND1,0:ND2) C INTEGER*2 LH2(0:*) C REAL LH(0:*) REAL VALUE,SDEV INTEGER II,JJ,NTEST,IPOSBEG,JPOSBEG C export C alignment attributes of the MAXALIGNS best alignments INTEGER IALIGN,NRECORD C======================================================================= C internal INTEGER MAXTRACE_LOC C PARAMETER (MAXTRACE_LOC= 150000) C PARAMETER (MAXTRACE_LOC= 9151515) PARAMETER (MAXTRACE_LOC= 30303030) CHARACTER*1 SK_1,SK_2 INTEGER IDELETION INTEGER INSPOINTER_LOCAL, + ITEMP_NO1(MAXTRACE_LOC),ITEMP_NO2(MAXTRACE_LOC), + ENDMARK,INDELMARK, + I,J,K,M,ILAS,JLAS,LDEL_DIREC, + LDEL,LEM,IFIR,JFIR,IDAL, + IDSAL,NDEL,ICLASS,LEN1,LENOCC,LINELEN,IBLOCKLEN, + ISTART,IPOS,JPOS,NLINE,IBEG,IEND, + LINETHICK,LEN_INSSEQ CHARACTER CTEMP_CHAIN1(MAXTRACE_LOC), + CTEMP_CHAIN2(MAXTRACE_LOC), + CTEMP*20000, + LINE(4)*(MAXALSQ) CHARACTER*4 PID_1,PID_2 REAL MAXDEVIATION, + MAX1,MAX2,SUM,SIM,CVAL,CMAXVAL,SELFSIM, + OPENWEIGHT,ELONGWEIGHT,W,PER,HOM, + DISTANCE,RMS LOGICAL LERROR,LKINK,LCALPHA,LDBG_LOCAL C CHARACTER*200 ERRORFILE C==================================================================== C init c line(2)=' ' ; line(3)=' ' c do i=1,maxalsq ; al_agree(i)=' ' ; sal_agree(i)=' ' ; enddo C BR 99.09: to write out some dbg msg LDBG_LOCAL= .FALSE. C LDBG_LOCAL= .TRUE. OPENWEIGHT= 0.0 ELONGWEIGHT= 0.0 C ENDMARK : '<' C INDELMARK: '.' ENDMARK= 999 INDELMARK= 99 M= 0 NTEST= NTEST+1 ILAS= II-1 JLAS= JJ-1 SIM= 0.0 CVAL= 0.0 CMAXVAL= 0.0 DISTANCE= 0.0 IINS= 0 INSPOINTER_LOCAL=1 INSSEQ= ' ' C LEN_NAME= LEN(NAME_2) C LEN_COMPND= LEN(COMPND_2) C LEN_ACCESSION= LEN(ACCESSION_2) C LEN_PDBREF= LEN(PDBREF_2) C LEN_LINE= LEN( LINE(1) ) LEN_INSSEQ= LEN(INSSEQ) c check subscripts IF (II .LE. 0 .OR. JJ .LE. 0) THEN WRITE(6,*)' FATAL ERROR IN TRACE' WRITE(6,*)' SUBSCRIPT OF II or JJ OUT OF RANGE',II,JJ STOP ENDIF C===================== TRACE BACK =================================== C alignment via loop back to 100 100 IF ((II .GT. IPOSBEG) .AND. (JJ .GT. JPOSBEG)) THEN LDEL_DIREC =ABS( LH2(II,JJ) ) c if (.not. lbackward) then c call get_ldirec_fast(nd1,nd2,lh2,ii,jj,ldel_direc) c else c call get_ldirec(nd1,nd2,lh2,ii,jj,ldel_direc) c endif c======================================================================= C diagonal: LIKE H(II,JJ) AND SEQ(II-1,1) SEQ(JJ-1,1) C======================================================================= IF (LH2(II,JJ) .EQ. -1) THEN RETURN ELSE IF (LDEL_DIREC .EQ. 1 ) THEN M=M+1 II=II-1 JJ=JJ-1 CALL CHECKRANGE(II,1,ND1-1,'subscr II','TRACE') CALL CHECKRANGE(JJ,1,ND2-1,'subscr JJ','TRACE') SELFSIM = 0.0 c-------------------------------------------------------------------- c no profile c selfsim is match with master sequence c-------------------------------------------------------------------- IF (PROFILEMODE .LT. 1) THEN SIM = METRIC_1(II,LSQ_2(JJ)) SELFSIM = METRIC_1(II,LSQ_1(II)) c-------------------------------------------------------------------- c profile 1 c selfsim is match with best possible match at position ii c-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 1) THEN SIM = METRIC_1(II,LSQ_2(JJ)) SELFSIM=-9999 DO K=1,NTRANS IF ( METRIC_1(II,K) .GT. SELFSIM ) THEN SELFSIM = METRIC_1(II,K) ENDIF ENDDO c-------------------------------------------------------------------- c profile 2 c-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 2) THEN SIM = METRIC_2(JJ,LSQ_1(II)) SELFSIM = METRIC_2(JJ,LSQ_2(JJ)) C-------------------------------------------------------------------- C full profile alignment C selfsim: sum ( (metric_1(i,k) * metric_1(i,k))+ C (metric_2(j,k) * metric_2(j,k) ) /2 ) C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 3) THEN SUM=0.0 SELFSIM=0.0 DO K=1,NTRANS SUM = SUM + ( METRIC_1(II,K) * METRIC_2(JJ,K) ) SELFSIM= SELFSIM + (METRIC_1(II,K) * METRIC_1(II,K)) WRITE(6,*)K,SUM,METRIC_1(II,K),METRIC_2(JJ,K), + METRIC_1(II,K) * METRIC_2(JJ,K) ENDDO c sim = (sum/ntrans) SIM = SUM c WRITE(6,*)sim,selfsim,metric_1(ii,1),metric_2(jj,1) C-------------------------------------------------------------------- C take sequences as representatives of family C selfsim: factor * (metric_1(i,lsq_1(i)) + metric_2(j,lsq_2(i))/2) C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 4) THEN SIM=(METRIC_1(II,LSQ_2(JJ))+METRIC_2( JJ,LSQ_1(II)) )*0.5 SELFSIM =METRIC_1( II,LSQ_1(II)) c-------------------------------------------------------------------- c take maximal value as consensus c selfsim ??? c-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 5) THEN MAX1=-10000.0 DO K=1,NTRANS IF (METRIC_1(II,K) .GT. MAX1)MAX1=METRIC_1(II,K) ENDDO MAX2=-10000.0 DO K=1,NTRANS IF (METRIC_2(JJ,K) .GT. MAX2)MAX2=METRIC_2(JJ,K) ENDDO SIM = ( (MAX1 + MAX2) * 0.5 ) SELFSIM=MAX(MAX1,MAX2) ELSE IF (PROFILEMODE .EQ. 6) THEN SIM = SIMORG(LSQ_1(II),LSQ_2(JJ),LSTRCLASS_1(II), + LACC_1(II),LSTRCLASS_2(JJ),LACC_2(JJ) ) ENDIF CVAL = CVAL + SIM CMAXVAL = CMAXVAL + SELFSIM LAL_1(M)=LSQ_1(II) LSAL_1(M)=LSTRCLASS_1(II) LAL_2(M)=LSQ_2(JJ) LSAL_2(M)=LSTRCLASS_2(JJ) ITEMP_NO1(M)=PDBNO_1(II) ITEMP_NO2(M)=PDBNO_2(JJ) CTEMP_CHAIN1(M)=CHAINID_1(II) CTEMP_CHAIN2(M)=CHAINID_2(JJ) ITRACE(M)=II JTRACE(M)=JJ GOTO 100 c======================================================================= C horizontal deletion C======================================================================= ELSE IF (LDEL_DIREC .GT. 20000) THEN LDEL=LDEL_DIREC - 20000 IF (PROFILEMODE .LE. 1) THEN OPENWEIGHT = OPEN_GAP_1(II-LDEL) ELSE IF (PROFILEMODE .EQ. 2) THEN OPENWEIGHT = OPEN_GAP_2(JJ-1) ELSE IF (PROFILEMODE .EQ. 6) THEN OPENWEIGHT = OPEN_GAP_1(II-LDEL) ELSE IF (PROFILEMODE .GE. 3) THEN OPENWEIGHT=(OPEN_GAP_1(II-LDEL) + OPEN_GAP_2(JJ-1)) * 0.5 ENDIF W = OPENWEIGHT DO I=II-LDEL+1,II-1 IF (PROFILEMODE .LE. 1) THEN ELONGWEIGHT = ELONG_GAP_1(I) ELSE IF (PROFILEMODE .EQ. 2) THEN ELONGWEIGHT = ELONG_GAP_2(JJ-1) ELSE IF (PROFILEMODE .EQ. 6) THEN ELONGWEIGHT = ELONG_GAP_1(I) ELSE IF (PROFILEMODE .GE. 3) THEN ELONGWEIGHT=(ELONG_GAP_1(I) + ELONG_GAP_2(JJ-1)) * 0.5 ENDIF W = W + ELONGWEIGHT ENDDO CVAL = CVAL - W DO K=1,LDEL M=M+1 ITRACE(M)=II-K JTRACE(M)=JJ-1 LAL_1(M) = LSQ_1(II-K) LAL_2(M) =INDELMARK LSAL_1(M)= LSTRCLASS_1(II-K) LSAL_2(M)=INDELMARK ITEMP_NO1(M)=PDBNO_1(II-K) ITEMP_NO2(M)=0 CTEMP_CHAIN1(M)=CHAINID_1(II-K) CTEMP_CHAIN2(M)=' ' ENDDO II=II-LDEL GOTO 100 C======================================================================= C VERTICAL DELETION C======================================================================= ELSE IF (LDEL_DIREC .GT. 10000) THEN LDEL=LDEL_DIREC - 10000 IF (PROFILEMODE .LE. 1) THEN OPENWEIGHT = OPEN_GAP_1(II-1) ELSE IF (PROFILEMODE .EQ. 2) THEN OPENWEIGHT = OPEN_GAP_2(JJ-LDEL) ELSE IF (PROFILEMODE .EQ. 6) THEN OPENWEIGHT = OPEN_GAP_1(II-1) ELSE IF (PROFILEMODE .GE. 3) THEN OPENWEIGHT=(OPEN_GAP_1(II-1) + OPEN_GAP_2(JJ-LDEL)) * 0.5 ENDIF W = OPENWEIGHT DO J=JJ-LDEL+1,JJ-1 IF (PROFILEMODE .LE. 1) THEN ELONGWEIGHT = ELONG_GAP_1(II-1) ELSE IF (PROFILEMODE .EQ. 2) THEN ELONGWEIGHT = ELONG_GAP_2(J) ELSE IF (PROFILEMODE .EQ. 6) THEN ELONGWEIGHT = ELONG_GAP_1(II-1) ELSE IF (PROFILEMODE .GE. 3) THEN ELONGWEIGHT=(ELONG_GAP_1(II-1) +ELONG_GAP_2(J)) * 0.5 ENDIF W = W + ELONGWEIGHT ENDDO CVAL = CVAL - W C store insertions of seq2: C iins: insertion counter C inslen: length of insertion C insbeg_1: DSSP position of insertion (last matched residue) C insbeg_1: position of the insertion in the alignend sequence C inspointer_local: pointer in the one-dim array for insertions C *aVGHYTREe: * is divider between different insertions C lower case characters are the residues before and C after the insertions IF (IINS+1 .LT. MAXINS) THEN IINS=IINS+1 INSLEN_LOCAL(IINS)=LDEL INSBEG_1_LOCAL(IINS)=II-1 INSBEG_2_LOCAL(IINS)=JJ-LDEL K=INSPOINTER_LOCAL IF (K+LDEL+3 .GT. LEN_INSSEQ) THEN WRITE(6,*)' ERROR: MAXINSBUFFER_LOCAL OVERFLOW: ' WRITE(6,*)' increase: ',len_insseq STOP ENDIF INSSEQ(K:K)='*' INSSEQ(K+1:K+1)=CSQ_2(JJ-LDEL-1:JJ-LDEL-1) INSSEQ(K+2:K+LDEL+2)=CSQ_2(JJ-LDEL:JJ-1) INSSEQ(K+LDEL+2:K+LDEL+2)=CSQ_2(JJ:JJ) CALL UPTOLOW(INSSEQ(K+1:K+1),1) CALL UPTOLOW(INSSEQ(K+LDEL+2:K+LDEL+2),1) INSPOINTER_LOCAL=INSPOINTER_LOCAL+LDEL+3 ELSE WRITE(6,*)' WARNING: maxins overflow: ',maxins WRITE(6,*)' insertion ingnored in HSSP-output' ENDIF DO K=1,LDEL M=M+1 JTRACE(M)= JJ-K ITRACE(M)=II-1 LAL_1(M) = INDELMARK LAL_2(M) = LSQ_2 (JJ-K) LSAL_1(M)= INDELMARK LSAL_2(M)= LSTRCLASS_2(JJ-K) ITEMP_NO1(M)=0 ITEMP_NO2(M)=PDBNO_2(JJ-K) CTEMP_CHAIN1(M)=' ' CTEMP_CHAIN2(M)=CHAINID_2(JJ-K) ENDDO JJ=JJ-LDEL GOTO 100 c======================================================================= C unmatched terminal sequence C======================================================================= Caution if you change this: decrease/increase ISTART/ISTOP in SETPIECES CP unnecessary complication CP do not add < CP do not have length+1 CP do not replot last point C ENDMARK is '<') C======================================================================= ELSE IF (LDEL_DIREC .EQ. 0 ) THEN M=M+1 LAL_1(M) =ENDMARK LAL_2(M) =ENDMARK LSAL_1(M)=ENDMARK LSAL_2(M)=ENDMARK ITEMP_NO1(M)=0 ITEMP_NO2(M)=0 c replot last point ITRACE(M)=ITRACE(M-1) JTRACE(M)=JTRACE(M-1) ELSE WRITE(6,*)' FATAL ERROR IN TRACE' WRITE(6,*)' LDEL_DIREC NOT KNOWN',LDEL_DIREC,ii,jj STOP ENDIF ENDIF c======================================================================= c end of trace back c======================================================================= CVAL = CVAL CMAXVAL = CMAXVAL C======================================================================= C aligned optimum subsequences of length M are in integer array C LAL_1(I) and LAL_2(J) C convert back to characters CALL INT_TO_SEQ(LAL_1,AL_1_ARRAY,M,TRANS,INDELMARK,ENDMARK) CALL INT_TO_SEQ(LAL_2,AL_2_ARRAY,M,TRANS,INDELMARK,ENDMARK) CALL INT_TO_STRCLASS(MAXSTRSTATES,MAXALSQ,M,LSAL_1, + STR_CLASSES,INDELMARK,ENDMARK,SAL_1_ARRAY) CALL INT_TO_STRCLASS(MAXSTRSTATES,MAXALSQ,M,LSAL_2, + STR_CLASSES,INDELMARK,ENDMARK,SAL_2_ARRAY) C process alignments C for terminal '<' M=LEM+1 IF (LAL_1(M) .EQ. ENDMARK) THEN LEM=M-1 ELSE LEM=M ENDIF IFIR=ITRACE(LEM) JFIR=JTRACE(LEM) CALL CHECKRANGE(LEM,1,MAXALSQ,'alilen LEM','TRACE') C======================================================================= C evaluate the alignments. C======================================================================= IDAL=0 IDSAL=0 IDELETION=0 NDEL=0 c count number of deletions/insertions c only if there is a '.' and the next character is no '.' DO K=1,M IF (K .LT. M) THEN IF (LAL_1(K) .EQ. INDELMARK) THEN IDELETION=IDELETION+1 IF (LAL_1(K+1) .NE. INDELMARK ) THEN NDEL=NDEL+1 ENDIF ENDIF IF (LAL_2(K) .EQ. INDELMARK) THEN IDELETION=IDELETION+1 IF (LAL_2(K+1) .NE. INDELMARK) THEN NDEL=NDEL+1 ENDIF ENDIF ENDIF IF (LAL_1(K) .EQ. LAL_2(K) .AND. LAL_1(K) .NE. ENDMARK) THEN IDAL=IDAL+1 AL_AGREE(K)= '*' ELSE AL_AGREE(K)= ' ' ENDIF C translate to three states H,E,L IF (LDSSP_1 .AND. LDSSP_2 ) THEN CALL STRUC_CLASS(MAXSTRSTATES,STR_CLASSES, + SAL_1_ARRAY(K),SK_1,ICLASS) CALL STRUC_CLASS(MAXSTRSTATES,STR_CLASSES, + SAL_2_ARRAY(K),SK_2,ICLASS) IF (SK_1 .EQ. SK_2 ) THEN IDSAL=IDSAL+1 SAL_AGREE(K)='+' ELSE SAL_AGREE(K)=' ' ENDIF ENDIF ENDDO C======================================================================= C LEN1 : is ILAS-IFIR+1 C LENOCC: is occupied postions (no INSDEL, used in HSSP) C LEM : length in SEQuence 1 including gaps C HOM : is identical postion / LENOCC C======================================================================= LEN1=ILAS-IFIR+1 LENOCC=0 DO I=1,ILAS-IFIR+1 IF (LAL_2(I) .NE. INDELMARK )LENOCC=LENOCC+1 ENDDO PER=VALUE/LENOCC cx per=value/lem HOM=FLOAT(IDAL)/FLOAT(LENOCC) IF (CMAXVAL .GT. -0.00001 .AND. CMAXVAL .LT. 0.00001) THEN SIM=0.0 ELSE SIM=(CVAL/CMAXVAL) ENDIF c WRITE(6,*)'trace ',cval,cmaxval C======================================================================= C test if threshold criterion is fulfilled (if specified) C======================================================================= LCONSIDER=.TRUE. IF (LTHRESHOLD .OR. LALL) THEN IF (LNEWCURVE) THEN CALL CHECKHSSPCUT99(LENOCC,HOM*100.0,ISOLEN,ISOIDE,NSTEP, + LFORMULA,LALL,ISAFE,LCONSIDER,DISTANCE) ELSE CALL CHECKHSSPCUT(LENOCC,HOM*100.0,ISOLEN,ISOIDE,NSTEP, + LFORMULA,LALL,ISAFE,LCONSIDER,DISTANCE) ENDIF ENDIF IF (CUTVALUE1 .GT. 0.0) THEN IF (VALUE .LT. (CMAXVAL/CUTVALUE1) ) LCONSIDER=.FALSE. ENDIF IF (CUTVALUE2 .GT. 0.0) THEN IF (VALUE .LT. CUTVALUE2 ) LCONSIDER=.FALSE. ENDIF C BR 99.09: write out debug IF (LDBG_LOCAL) THEN IF (LCONSIDER) THEN WRITE(6,'(A,I5,A)')' trace: nprot=',IALIGN+1,' take!' ELSE WRITE(6,'(A,I5,A)')' trace: nprot=',IALIGN+1,' reject!' ENDIF END IF C======================================================================= C compare 3D-structures of alignend fragments C======================================================================= LCALPHA=.TRUE. RMS=-1.0 IF (LCOMPSTR .AND. LCONSIDER) THEN IF (LDSSP_1 .AND. LDSSP_2 ) THEN CALL GETPIDCODE(NAME_1,PID_1) CALL FINDBRKFILE(BRKFILE_1,PDBPATH,PID_1,KBRK,KLOG,LERROR) IF (.NOT.LERROR) THEN CALL GETPIDCODE(NAME_2,PID_2) CALL FINDBRKFILE(BRKFILE_2,PDBPATH,PID_2,KBRK,KLOG, + LERROR) IF (.NOT.LERROR) THEN I=1 DO K=M,1,-1 ALI_1(I)=AL_1_ARRAY(K) ALI_2(I)=AL_2_ARRAY(K) I=I+1 ENDDO CALL ALITOSTRUCRMS(MAXALSQ,MAXSQ,BRKFILE_1,BRKFILE_2, + KBRK,PDBNO_1,CHAINID_1,PDBNO_2,CHAINID_2, + ALI_1,ALI_2,M,IFIR,ILAS,JFIR, + JLAS,LCALPHA,RMS) ENDIF ENDIF ENDIF ENDIF C=================================================================== C PRINT ALIGNED SEQS AND HOMOLGY VALUES.. C=================================================================== c if (ntest.eq.1) then c WRITE(6,*)'No IFIR ILAS JFIR JLAS NPOS NDEL '// c + 'VAL VPER NIDE IDE SIM RMS DIST' c endif c WRITE(6,1016)ntest,ifir,ilas,jfir,jlas,lenocc,ideletion, c + value,per,idal,hom,sim,rms,distance c1016 format(I4,2(1X,I4,'-',I4),2(I5),F7.2,F6.2,I5,1X,3(F6.1),F6.2) C======================================================================= C check value from setmatrix and recalculated value from trace back C======================================================================= LERROR=.FALSE. MAXDEVIATION=0.3 IF (ABS(CVAL-VALUE) .GT. MAXDEVIATION) LERROR=.TRUE. IF (LERROR) THEN WRITE(6,*)' *** FATAL ERROR IN TRACE ****' WRITE(6,*)' CVAL .NE. VALUE : ',CVAL,VALUE WRITE(6,*)' WRITE MATRIX AND TRACE BACK IN ??_MATRIX.ERROR' c$$$ call getpidcode(name_1,pid_1) c$$$ call concat_strings(pid_1,'_MATRIX.ERROR',errorfile) c$$$ call open_file(99,errorfile,'NEW,RECL=2000',lerror) c$$$ write(99,'(a,f12.5)')' CVAL : ',CVAL c$$$ write(99,'(a,f12.5)')' VALUE : ',VALUE c$$$C debug: output the LH (values and trace-back)matrix c$$$ write(99,*) 'H-MATRIX Hij' c$$$ write(99,*)'Index i runs for Sequence 1' c$$$ write(99,*)'Index j runs for Sequence 2' c$$$ do i=2,nd1 c$$$ write(99,'(i6)')i-1 c$$$ write(99,'(2x,20(f7.1))')(lh1(i,j),j=2,nd2) c$$$ enddo c$$$ write(99,*) ; write(99,*)'TRACE-BACK MATRIX' c$$$ do i=2,nd1 c$$$ write(99,'(i6)')i-1 c$$$ write(99,'(2x,20(f7.1))')(lh2(i,j),j=2,nd2) c$$$ enddo c$$$ close(99) STOP ENDIF IF (IALIGN+1 .GT. MAXALIGNS) THEN WRITE(6,*)'*** OVERFLOW, ALIGNMENTS TERMINATED ***' LALIOVERFLOW=.TRUE. RETURN ENDIF IALIGN=IALIGN+1 c alignments will be sorted according to this value IF (CSORTMODE .EQ. 'DISTANCE' ) THEN ALISORTKEY(IALIGN)=DISTANCE ELSE IF (CSORTMODE.EQ.'VALUE' .OR. CSORTMODE .EQ. 'ZSCORE') THEN ALISORTKEY(IALIGN)=VALUE ELSE IF (CSORTMODE .EQ. 'WSIM' ) THEN ALISORTKEY(IALIGN)=SIM ELSE IF (CSORTMODE .EQ. 'SIM' ) THEN ALISORTKEY(IALIGN)=SIM ELSE IF (CSORTMODE .EQ. 'SIGMA' ) THEN ALISORTKEY(IALIGN)=VALUE/SDEV ELSE IF (CSORTMODE .EQ. 'IDENTITY' ) THEN ALISORTKEY(IALIGN)=HOM ELSE IF (CSORTMODE .EQ. 'VALPER' ) THEN ALISORTKEY(IALIGN)=PER ELSE IF (CSORTMODE .EQ. 'VALFORM' ) THEN ALISORTKEY(IALIGN)=VALUE*(LENOCC**(-0.56158)) ELSE IF (CSORTMODE .EQ. 'NO' ) THEN ALISORTKEY(IALIGN)=FLOAT(MAXALIGNS - IALIGN) ENDIF C====================================================================== C STORE ALIGNMENTS IN FILE C====================================================================== IFILEPOI(IALIGN)=-999 IRECPOI(IALIGN)=-999 IF (LCONSIDER) THEN IFILEPOI(IALIGN)=ISET NRECORD=NRECORD+1 IRECPOI(IALIGN)=NRECORD IALIGN_GOOD=IALIGN_GOOD+1 WRITE(KCORE,REC=NRECORD)'*',LCONSIDER,VALUE C====================================================================== C WRITE AL_2_ARRAY(.) AND SAL_2_ARRAY(.) C mark insertions in SEQuence 1 by lower case letters in AL_2_ARRAY(*) DO K=M-1,2,-1 IF (LAL_1(K) .EQ. INDELMARK) THEN IF (LAL_1(K-1) .NE. INDELMARK) THEN CALL UPTOLOW(AL_2_ARRAY(K-1),1) ENDIF IF (LAL_1(K+1) .NE. INDELMARK) THEN CALL UPTOLOW(AL_2_ARRAY(K+1),1) ENDIF ENDIF ENDDO IPOS=1 DO I=M,1,-1 IF ( (LAL_1(I) .NE. INDELMARK) .AND. + (LAL_1(I) .NE. ENDMARK) ) THEN LINE(2)(IPOS:IPOS)=AL_2_ARRAY(I) IF (LDSSP_2) THEN LINE(3)(IPOS:IPOS)=SAL_2_ARRAY(I) WRITE(LINE(4)(IPOS:IPOS),'(I1)')LACC_2(I) ENDIF IPOS=IPOS+1 ENDIF ENDDO C====================================================================== NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)NAME_2 NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)COMPND_2 NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)ACCESSION_2,PDBREF_2,LDSSP_2 NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)IFIR,LEN1,LENOCC,JFIR,JLAS, + N2IN,IDELETION,NDEL,NSHIFTED,RMS,HOM, + SIM,SDEV,DISTANCE,IINS C store alignment IF (MOD(FLOAT(LEN1),FLOAT(MAXRECORDLEN)).EQ. 0.0) THEN NLINE= LEN1/MAXRECORDLEN ELSE NLINE=(LEN1/MAXRECORDLEN ) +1 ENDIF IBEG=1 IEND=MAXRECORDLEN DO I=1,NLINE NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)LINE(2)(IBEG:IEND) IF (LDSSP_2) THEN NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)LINE(3)(IBEG:IEND) NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)LINE(4)(IBEG:IEND) ENDIF IBEG=IEND+1 IEND=IEND+MAXRECORDLEN ENDDO C store insertions IF (IINS .GT. 0) THEN DO I=1,IINS NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)INSLEN_LOCAL(I), + INSBEG_1_LOCAL(I),INSBEG_2_LOCAL(I) ENDDO IF (MOD(FLOAT(INSPOINTER_LOCAL),FLOAT(MAXRECORDLEN)) .EQ. + 0.0) THEN NLINE= INSPOINTER_LOCAL/MAXRECORDLEN ELSE NLINE=(INSPOINTER_LOCAL/MAXRECORDLEN ) +1 ENDIF IBEG=1 IEND=MAXRECORDLEN DO I=1,NLINE NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)INSSEQ(IBEG:IEND) IBEG=IEND+1 IEND=IEND+MAXRECORDLEN ENDDO ENDIF C===================================================================== C unmark insertions in SEQuence 1 by lower case letters in AL_2_ARRAY(*) c===================================================================== DO I=1,M CALL LOWTOUP(AL_2_ARRAY(I),1) ENDDO C===================================================================== C write long output file C===================================================================== IF (LONG_OUT) THEN WRITE(KLONG,*)' No IFIR ILAS JFIR JLAS NPOS NDEL '// + 'VAL VPER NIDE IDE SIM RMS '// + 'DIST ACCESSION NAME' WRITE(KLONG,1017)NTEST,IFIR,ILAS,JFIR,JLAS,LENOCC, + IDELETION,VALUE,PER,IDAL,HOM,SIM,RMS,DISTANCE, + ACCESSION_2,NAME_2(1:50) 1017 FORMAT(I4,2(1X,I4,'-',I4),2(I5),2(F7.2),I5,1X,4(F6.2),A,A) JPOS=M ISTART=1 CTEMP=' ' J=ISTART DO K=JPOS,1,-1 WRITE(CTEMP(J:J),'(A)')AL_AGREE(K) J=J+1 ENDDO WRITE(KLONG,'(A)')CTEMP(1:J) CTEMP=' ' J=ISTART DO K=JPOS,1,-1 WRITE(CTEMP(J:J),'(A)')AL_1_ARRAY(K) J=J+1 ENDDO WRITE(KLONG,'(A)')CTEMP(1:J) IF (SAL_1_ARRAY(1) .NE. 'U') THEN CTEMP=' ' J=ISTART DO K=JPOS,1,-1 WRITE(CTEMP(J:J),'(A)')SAL_1_ARRAY(K) J=J+1 ENDDO WRITE(KLONG,'(A)')CTEMP(1:J) ENDIF CTEMP=' ' J=ISTART DO K=JPOS,1,-1 WRITE(CTEMP(J:J),'(A)')AL_2_ARRAY(K) J=J+1 ENDDO WRITE(KLONG,'(A)')CTEMP(1:J) IF (SAL_2_ARRAY(1).NE.'U') THEN CTEMP=' ' J=ISTART DO K=JPOS,1,-1 WRITE(CTEMP(J:J),'(A)')SAL_2_ARRAY(K) J=J+1 ENDDO WRITE(KLONG,'(A)')CTEMP(1:J) CTEMP=' ' J=ISTART DO K=JPOS,1,-1 WRITE(CTEMP(J:J),'(A)')SAL_AGREE(K) J=J+1 ENDDO WRITE(KLONG,'(A)')CTEMP(1:J) ENDIF WRITE(KLONG,*)' ' J=ISTART DO K=JPOS,1,-1 WRITE(KLONG,'(I6,A,2X,I6,A)') + ITEMP_NO1(K),CTEMP_CHAIN1(K), + ITEMP_NO2(K),CTEMP_CHAIN2(K) C J=J+1 ENDDO c jpos=m ; ipos=m-100+1 ; linelen=100 ; iblocklen=11 c istart=1 c do while( jpos .ge. 1) c ipos=max(ipos,1) ; ctemp=' ' ; j=istart c do k=jpos,ipos,-1 ; if (mod(j,iblocklen) .eq. 0)j=j+1 c write(ctemp(j:j),'(a)')al_agree(k) ; j=j+1 c ENDDO; write(klong,'(a)')ctemp(1:j) c ctemp=' ' ; j=istart c do k=jpos,ipos,-1 ; if (mod(j,iblocklen) .eq. 0)j=j+1 c write(ctemp(j:j),'(a)')al_1_array(k) ; j=j+1 c ENDDO; write(klong,'(a)')ctemp(1:j) c if (sal_1_array(1) .ne. 'U') then ; ctemp=' ';j=istart c do k=jpos,ipos,-1 ; if (mod(j,iblocklen) .eq. 0)j=j+1 c write(ctemp(j:j),'(a)')sal_1_array(k) ; j=j+1 c ENDDO; write(klong,'(a)')ctemp(1:j) c endif c ctemp=' ' ; j=istart c do k=jpos,ipos,-1 ; if (mod(j,iblocklen) .eq. 0)j=j+1 c write(ctemp(j:j),'(a)')al_2_array(k) ; j=j+1 c ENDDO; write(klong,'(a)')ctemp(1:j) c if (sal_2_array(1).ne.'U') then ; ctemp=' ' ; j=istart c do k=jpos,ipos,-1 ; if (mod(j,iblocklen).eq.0)j=j+1 c write(ctemp(j:j),'(a)')sal_2_array(k) ; j=j+1 c ENDDO; write(klong,'(a)')ctemp(1:j) c ctemp=' ' ; j=istart c do k=jpos,ipos,-1 ; if (mod(j,iblocklen) .eq. 0)j=j+1 c write(ctemp(j:j),'(a)')sal_agree(k) ; j=j+1 c ENDDO; write(klong,'(a)')ctemp(1:j) c endif c write(klong,*)' ' c jpos=jpos-linelen ; ipos=ipos-linelen c enddo ENDIF C===================================================================== C output to PLOT file TRACE.X C===================================================================== IF (LTRACEOUT) THEN CALL OPEN_FILE(KPLOT,PLOTFILE,'UNKNOWN,APPEND',LERROR) CALL PUTHEADER(KPLOT,CSQ_1,CSQ_2,STRUC_1,STRUC_2,ND1-1, + ND2-1,NAME_1,NAME_2) WRITE(KPLOT,'(1X,I3,A)')NTEST,' TRACE' C if lall linethickness is value/residue; else its the distance from the C chosen threshold IF (LALL) THEN LINETHICK=NINT(PER) ELSE LINETHICK=NINT(DISTANCE) ENDIF WRITE(KPLOT,'(3(I4))')ITRACE(1),JTRACE(1),LINETHICK C output only straight line end points C so, plot beginning, end, and kink points C kink if there is a discontinuity in either I or J increments DO K=2,M-1 LKINK=ABS(ITRACE(K)-ITRACE(K+1)) .NE. + ABS(ITRACE(K)-ITRACE(K-1)) .OR. + ABS(JTRACE(K)-JTRACE(K+1)) .NE. + ABS(JTRACE(K)-JTRACE(K-1)) IF (LKINK) THEN WRITE(KPLOT,'(3(I4))')ITRACE(K),JTRACE(K),LINETHICK ENDIF ENDDO WRITE(KPLOT,'(3(I4))')ITRACE(M),JTRACE(M),LINETHICK C DEFINES END OF TRACE IN TRACE-HOMOLOGY ITRACE(M+1)=0 JTRACE(M+1)=0 WRITE(KPLOT,'(3(I4))')ITRACE(M+1),JTRACE(M+1),LINETHICK CLOSE(KPLOT) ENDIF ENDIF C end if lconsider C====================================================================== RETURN END C END TRACE C...................................................................... C...................................................................... C SUB WRITE_ALB SUBROUTINE WRITE_ALB(KOUT,OUTFILE,SEQ,NBLOCKS,HEADERLINE, 1 NAMELABEL,SEQSTART,SEQSTOP,CHBPOS,NBREAKS,ERROR) IMPLICIT NONE C IMPORT INTEGER KOUT INTEGER NBLOCKS INTEGER SEQSTART,SEQSTOP INTEGER NBREAKS,CHBPOS(*) CHARACTER*(*) SEQ CHARACTER*(*) HEADERLINE,NAMELABEL,OUTFILE C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE PARAMETER (BLOCKSIZE= 10) INTEGER ISTART, ISTOP, FIRSTPOS, LASTPOS, ISEQPOS INTEGER BEGIN, END, LENGTH, ICHAIN CHARACTER*(250) OUTLINE, ALBHEADLINE LOGICAL NOCHAINBREAKS ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KOUT,OUTFILE,'new,recl=250',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN NOCHAINBREAKS = .TRUE. LENGTH = SEQSTOP-SEQSTART+1 C make up standard alb headerline C 1GD1 .... C 1336 C ^ length in col 200 WRITE(ALBHEADLINE,'(I4)') LENGTH - NBREAKS CALL RIGHTADJUST(ALBHEADLINE(1:200),1,200) CALL STRPOS(NAMELABEL,ISTART,ISTOP) ALBHEADLINE(1:ISTOP-ISTART+2) = ' ' // 1 NAMELABEL(MAX(ISTART,1):MAX(1,ISTOP)) C WRITE SEQUENCE ISEQPOS = 0 DO ICHAIN = 1,NBREAKS+1 IF ( ICHAIN .EQ. 1 ) THEN FIRSTPOS = SEQSTART ELSE FIRSTPOS = CHBPOS(ICHAIN-1) + 1 WRITE(KOUT,'(A)') '=' ENDIF IF ( ICHAIN .EQ. NBREAKS+1 ) THEN LASTPOS = SEQSTOP ELSE LASTPOS = CHBPOS(ICHAIN) - 1 ENDIF CALL STRPOS(ALBHEADLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') ALBHEADLINE(1:MAX(1,ISTOP)) WRITE(OUTLINE,'(2I4)') LASTPOS-FIRSTPOS+1, ISEQPOS CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) C BEGIN = FIRSTPOS BEGIN = SEQSTART C "REPEAT UNTIL" 1 CONTINUE C WRITESEQLINE RETURNS END CALL WRITESEQLINE(SEQ,BEGIN,BLOCKSIZE,NBLOCKS,SEQSTOP, 1 NOCHAINBREAKS,OUTLINE,END,ERROR) C CALL WRITESEQLINE(SEQ,BEGIN,BLOCKSIZE,NBLOCKS,LASTPOS, C 1 NOCHAINBREAKS,OUTLINE,END,ERROR) IF ( ERROR ) STOP CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) BEGIN = END + 1 C IF ( BEGIN .LE. LASTPOS ) GOTO 1 IF ( BEGIN .LE. SEQSTOP ) GOTO 1 C END "REPEAT UNTIL" CALL STRPOS(HEADERLINE,ISTART,ISTOP) WRITE(KOUT,'(A,A)') ' ',HEADERLINE(1:MAX(1,ISTOP)) WRITE(KOUT,'(A)') '=' ISEQPOS = ISEQPOS + LASTPOS-FIRSTPOS+1 ENDDO CLOSE(KOUT) RETURN END C END WRITE_ALB C...................................................................... C...................................................................... C SUB WRITE_EMBL SUBROUTINE WRITE_EMBL(KOUT,SEQ,NBLOCKS,INFILE,OUTFILE, 1 HEADERLINE,SEQSTART,SEQSTOP,ERROR) IMPLICIT NONE C IMPORT INTEGER KOUT INTEGER NBLOCKS INTEGER SEQSTART,SEQSTOP CHARACTER*(*) SEQ CHARACTER*(*) HEADERLINE,INFILE,OUTFILE C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE PARAMETER (BLOCKSIZE= 10) INTEGER ISTART, ISTOP INTEGER BEGIN, END, LENGTH CHARACTER*(250) OUTLINE LOGICAL NOCHAINBREAKS *----------------------------------------------------------------------* ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KOUT,OUTFILE,'new',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN NOCHAINBREAKS = .FALSE. LENGTH = SEQSTOP-SEQSTART+1 OUTLINE = 'ID X' CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(MAX(ISTART,1):MAX(1,ISTOP)) C BEGIN AND END CALL STRPOS(INFILE,ISTART,ISTOP) WRITE(OUTLINE,'(A,A,1X,A,I4,1X,A,I4)') 1 'DE ',infile(max(istart,1):max(1,istop)),'from: ', 2 seqstart,'to: ',seqstop CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(MAX(ISTART,1):MAX(1,ISTOP)) C copy passed headerline ( mark it with "DE" - not necessary (?)) CALL STRPOS(HEADERLINE,ISTART,ISTOP) OUTLINE = 'DE ' // HEADERLINE(1:MAX(1,ISTOP)) CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(MAX(ISTART,1):MAX(1,ISTOP)) C make up standard embl headerline C SQ SEQUENCE 344 AA; write(outline,'(a,i6,a)') 'SQ SEQUENCE',length,' AA;' CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) C write sequence BEGIN = SEQSTART C "repeat until" 1 CONTINUE C writeseqline returns end CALL WRITESEQLINE(SEQ,BEGIN,BLOCKSIZE,NBLOCKS,SEQSTOP, 1 NOCHAINBREAKS,OUTLINE,END,ERROR) IF ( ERROR ) STOP CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) BEGIN = END + 1 IF ( BEGIN .LE. SEQSTOP ) GOTO 1 C end "repeat until" C standard end marker WRITE(KOUT,'(A)') '//' CLOSE(KOUT) RETURN END C END WRITE_EMBL C...................................................................... C...................................................................... C SUB WRITE_GCG SUBROUTINE WRITE_GCG(KOUT,SEQ,NBLOCKS,NBREAKS,INFILE,OUTFILE, 1 HEADERLINE,SEQSTART,SEQSTOP,ERROR) IMPLICIT NONE C IMPORT INTEGER KOUT INTEGER NBLOCKS, NBREAKS INTEGER SEQSTART,SEQSTOP CHARACTER*(*) SEQ CHARACTER*(*) HEADERLINE,INFILE, OUTFILE C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE PARAMETER (BLOCKSIZE= 10) INTEGER ISTART, ISTOP INTEGER BEGIN, END INTEGER CHECK, LENGTH CHARACTER*8 CTMP CHARACTER*9 DATESTRING CHARACTER*(250) OUTLINE, SEQLINE LOGICAL NOCHAINBREAKS ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KOUT,OUTFILE,'new',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN NOCHAINBREAKS = .TRUE. LENGTH = SEQSTOP-SEQSTART+1 C BEGIN AND END CALL STRPOS(INFILE,ISTART,ISTOP) WRITE(OUTLINE,'(1X,A,1X,A,I4,1X,A,I4)') 1 infile(max(istart,1):max(1,istop)),'from: ', 2 seqstart,'to: ',seqstop CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(MAX(ISTART,1):MAX(1,ISTOP)) WRITE(KOUT,'(A)') ' ' C COPY PASSED HEADERLINE CALL STRPOS(HEADERLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') HEADERLINE(1:MAX(ISTOP,1)) C MAKE UP STANDARD GCG HEADERLINE CALL STRPOS(OUTFILE,ISTART,ISTOP) CALL GETDATE(DATESTRING) CALL CHECKSEQ(SEQ,SEQSTART,SEQSTOP,CHECK) WRITE(OUTLINE,'(1X,A,10X,A,I5,3X,A,2X,A,I5,1X,A)') 1 outfile(max(istart,1):max(1,istop)),'Length:', 2 length-nbreaks,datestring,'Check:',check,'..' CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(ISTOP,1)) WRITE(KOUT,'(A)') ' ' C write sequence C 1 RPDFCLEPPY TGPCKARIIR YFYNAKAGLC QTFVYGGCRA KRNNFKSAED BEGIN = SEQSTART C "repeat until" 1 CONTINUE WRITE(CTMP,'(I8)') BEGIN-SEQSTART+1 C writeseqline returns end CALL WRITESEQLINE(SEQ,BEGIN,BLOCKSIZE,NBLOCKS,SEQSTOP, 1 NOCHAINBREAKS,SEQLINE,END,ERROR) IF ( ERROR ) STOP CALL STRPOS(SEQLINE,ISTART,ISTOP) C gcg sequence lines are preceeded by a number (first pos. of line ) OUTLINE = CTMP // ' ' // 1 SEQLINE(MAX(ISTART,1):MAX(ISTOP,1)) CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(ISTOP,1)) WRITE(KOUT,'(A)') ' ' BEGIN = END + 1 IF ( BEGIN .LE. SEQSTOP ) GOTO 1 C END "REPEAT UNTIL" CLOSE(KOUT) RETURN END C END WRITE_GCG C...................................................................... C...................................................................... C SUB WRITE_HSSP SUBROUTINE WRITE_HSSP(KOUT,MAXRES,NALIGN,NRES,EMBLID,STRID, + ACCESSION,IDE,SIM,IFIR,ILAS,JFIR,JLAS, + LALI,NGAP,LGAP,LENSEQ,PROTNAME,ALIPOINTER, + ALISEQ,PDBNO,CHAINID,PDBSEQ,SECSTR,COLS, + BP1,BP2,SHEETLABEL,ACC,NOCC,VAR,SEQPROF, + NDEL,NINS,ENTROPY,RELENT,CONSWEIGHT, + INSNUMBER,INSALI,INSPOINTER,INSLEN, + INSBEG_1,INSBEG_2,INSBUFFER,ISOLEN, + ISOIDE,NSTEP,LFORMULA,LALL,ISAFE, + EXCLUDEFLAG,LCONSERV,LHSSP_LONG_ID) IMPLICIT NONE C---- import INTEGER KOUT,MAXRES,NALIGN,NRES, + IFIR(*),ILAS(*),JFIR(*),JLAS(*),LALI(*), + NGAP(*),LGAP(*),LENSEQ(*),ALIPOINTER(*),PDBNO(*), + BP1(*),BP2(*), + ACC(*),NOCC(*),VAR(*),SEQPROF(MAXRES,*),NDEL(*), + NINS(*),RELENT(*), + ISOLEN(*),NSTEP,ISAFE, + INSNUMBER,INSALI(*),INSPOINTER(*), + INSLEN(*),INSBEG_1(*),INSBEG_2(*) CHARACTER*(*) EMBLID(*),STRID(*),ACCESSION(*),PROTNAME(*), + ALISEQ(*), + CHAINID(*),PDBSEQ(*),SECSTR(*), + EXCLUDEFLAG(*) CHARACTER*7 COLS(*) CHARACTER*1 SHEETLABEL(*),INSBUFFER(*) REAL IDE(*),SIM(*),ENTROPY(*),CONSWEIGHT(*),ISOIDE(*) LOGICAL LCONSERV,LFORMULA,LALL,LHSSP_LONG_ID C---- internal parameter INTEGER NBLOCKSIZE,NBLOCKINS, + MAXAA,MAXALIGNS_LOC PARAMETER (NBLOCKSIZE= 70) PARAMETER (NBLOCKINS= 100) C maximal number of symbols PARAMETER (MAXAA= 20) C maximal number of alignments PARAMETER (MAXALIGNS_LOC= 8765) C PARAMETER (MAXALIGNS_LOC= 19999) C---- internal veriable INTEGER NALIGN_FILTER, + I,J,ILEN,LENLINE,K, + NBLOCK,IALIGN,JUMP,ISTART,ISTOP,IRUL,IBLOCK, + LPOS,JPOS,IPOS, + IAL,IBEG,IEND,IINS, + INS_NEW,INS_ORDER(MAXALIGNS_LOC),NRES2 LOGICAL LINSERTION,LCONSIDER CHARACTER PROFILESEQ*(MAXAA), + CRULER*(NBLOCKSIZE), + CTEMP*(NBLOCKSIZE),CTEMPINS*(NBLOCKINS), + LINE*512 REAL DISTANCE C---- ------------------------------------------------------------------ C---- C---- ------------------------------------------------------------------ C order of amino acid symbols in the HSSP sequence profile block PROFILESEQ='VLIMFWYGAPSTCHRKQEND' C---- C---- check local array dimension C---- IF (NALIGN .GT. MAXALIGNS_LOC) THEN WRITE(6,*)'*** ERROR WRITE_HSSP: MAXALIGNS_LOC overflow' WRITE(6,*)'*-> increase dimension !' STOP ENDIF C---- C---- 99.01 br changed C---- CC C get number of alignments after filtering CC NALIGN_FILTER=0 CC DO I=1,NALIGN CC INS_ORDER(I)=0 CC CALL CHECKHSSPCUT(LALI(I),IDE(I)*100,ISOLEN, CC + ISOIDE,NSTEP,LFORMULA,LALL,ISAFE,LCONSIDER,DISTANCE) CC IF ( LCONSIDER ) THEN CC IF ( EXCLUDEFLAG(I) .EQ. ' ') THEN CC NALIGN_FILTER=NALIGN_FILTER+1 CC INS_ORDER(I)=NALIGN_FILTER CC ENDIF CC ELSE CC EXCLUDEFLAG(I)='*' CC ENDIF CC ENDDO C---- C---- 99.01 br: new version C---- C get number of alignments after filtering NALIGN_FILTER=0 DO I=1,NALIGN INS_ORDER(I)=0 IF ( EXCLUDEFLAG(I) .NE. '*') THEN NALIGN_FILTER=NALIGN_FILTER+1 INS_ORDER(I)=NALIGN_FILTER ENDIF ENDDO C---- no alignment -> write last line ('//') IF (NALIGN_FILTER .EQ. 0) THEN WRITE(6,*)'-*- WARNING WRITE_HSSP file empty (no ali found)!' WRITE(KOUT,'(A)')'//' CLOSE(KOUT) RETURN ENDIF C======================================================================= C write the PROTEINS-block C======================================================================= C## PROTEINS : EMBL/SWISSPROT identifier and alignment statistics C NR. ID STRID %IDE %WSIM IFIR ILAS JFIR JLAS LALI NGAP LGAP LSEQ2 P C 1 : IATR$BOVIN 0.43 12345 1 56 1 56 56 0 0 123 A C1234AAA123456789012AAAAAAX1234512345X1234X1234X1234X1234X1234X1234X1234X1234XX1 C C NR. ID STRID %IDE %WSIM IFIR ILAS JFIR JLAS LALI NGAP LGAP LSEQ2 C 1 : IATR$BOVIN.............................. 0.43 12345 1 56 1 56 56 0 0 123 A C1234AAA1234567890123456789012345678901234567890AAAAAAX1234512345X1234X1234X1234X1234X1234X1234X1234X1234XX WRITE(KOUT,'(A)')'## PROTEINS : EMBL/SWISSPROT identifier '// + 'and alignment statistics' IF (LCONSERV) THEN IF ( LHSSP_LONG_ID ) THEN WRITE(KOUT,'(A)') + ' NR. ID '// + ' STRID %IDE %WSIM IFIR ILAS'// + ' JFIR JLAS LALI NGAP LGAP LSEQ2 ACCESSION'// + ' PROTEIN' ELSE WRITE(KOUT,'(A)') + ' NR. ID STRID %IDE %WSIM'// + ' IFIR ILAS JFIR JLAS LALI NGAP LGAP LSEQ2'// + ' ACCESSION PROTEIN' ENDIF ELSE IF ( LHSSP_LONG_ID ) THEN WRITE(KOUT,'(A)') + ' NR. ID '// + ' STRID %IDE %SIM IFIR ILAS'// + ' JFIR JLAS LALI NGAP LGAP LSEQ2 ACCESSION'// + ' PROTEIN' ELSE WRITE(KOUT,'(A)') + ' NR. ID STRID %IDE %SIM'// + ' IFIR ILAS JFIR JLAS LALI NGAP LGAP LSEQ2'// + ' ACCESSION PROTEIN' ENDIF ENDIF J=0 DO I=1,NALIGN IF ( EXCLUDEFLAG(I).EQ.' ') THEN C -------------------------------------------------- C terrible hack br 99-11: shorten too long proteins NRES2=LENSEQ(I) IF (NRES2.GT.9999) NRES2=9999 C end of terrible hack C -------------------------------------------------- J=J+1 IF (LHSSP_LONG_ID ) THEN WRITE(LINE,50)J,' : ',EMBLID(I),STRID(I),IDE(I),SIM(I), + IFIR(I),ILAS(I),JFIR(I),JLAS(I),LALI(I),NGAP(I), + LGAP(I),NRES2,ACCESSION(I),PROTNAME(I)(1:41) CALL STRPOS(LINE,ILEN,LENLINE) WRITE(KOUT,'(A)')LINE(1:LENLINE) ELSE WRITE(LINE,100)J,' : ',EMBLID(I),STRID(I),IDE(I),SIM(I), + IFIR(I),ILAS(I),JFIR(I),JLAS(I),LALI(I),NGAP(I), + LGAP(I),NRES2,ACCESSION(I),PROTNAME(I)(1:41) CALL STRPOS(LINE,ILEN,LENLINE) WRITE(KOUT,'(A)')LINE(1:LENLINE) ENDIF ENDIF ENDDO 50 FORMAT (1X,I4,A,A40,A6,1X,F5.2,1X,F5.2,8(1X,I4),2X,A10,1X,A) 100 FORMAT (1X,I4,A,A12,A6,1X,F5.2,1X,F5.2,8(1X,I4),2X,A10,1X,A) C number of ALIGNMENTS-blocks IF (MOD(FLOAT(NALIGN_FILTER),FLOAT(NBLOCKSIZE)).EQ. 0.0) THEN NBLOCK=NALIGN_FILTER/NBLOCKSIZE ELSE NBLOCK=NALIGN_FILTER/NBLOCKSIZE+1 ENDIF IALIGN=0 JUMP=0 ISTOP=IALIGN+NBLOCKSIZE IF (ISTOP.GT.NALIGN_FILTER) ISTOP=NALIGN_FILTER IRUL=1 C======================================================================= C loop over ALIGNMENTS-blocks C======================================================================= DO IBLOCK=1,NBLOCK C make ruler LPOS=1 DO K=1,(NBLOCKSIZE/10) IF (IRUL.EQ.10) IRUL=0 WRITE(CRULER(LPOS:LPOS+9),'(A9,I1)')'....:....',IRUL LPOS=LPOS+10 IRUL=IRUL+1 ENDDO WRITE(KOUT,'(2(A,I4))')'## ALIGNMENTS ', + IALIGN+1-JUMP,' - ',ISTOP WRITE(KOUT,'(A)')' SeqNo PDBNo AA STRUCTURE '// + 'BP1 BP2 ACC NOCC VAR '//cruler C======================================================================= C rearange alignment in vertical order C======================================================================= DO I=1,NRES CTEMP=' ' JPOS=1 IPOS=1 JUMP=0 CCCC parsytec bug c stupid parsytec has problems here c do while(ipos .le. nblocksize .and. C + (ialign+jpos) .le. nalign) DO WHILE(IPOS .LE. NBLOCKSIZE ) IF ( (IALIGN+JPOS) .GT. NALIGN) THEN GOTO 10 ENDIF IAL=IALIGN+JPOS JPOS=JPOS+1 IF ( EXCLUDEFLAG(IAL) .EQ. ' ' ) THEN IF (I .GE. IFIR(IAL) .AND. I .LE. ILAS(IAL)) THEN J=ALIPOINTER(IAL)+(I-IFIR(IAL)) CTEMP(IPOS:IPOS)=ALISEQ(J) IPOS=IPOS+1 ELSE CTEMP(IPOS:IPOS)=' ' IPOS=IPOS+1 ENDIF ELSE JUMP=JUMP+1 ENDIF ENDDO 10 LINE=' ' C======================================================================= C write ALIGNMENTS-block C======================================================================= WRITE(LINE,200)I,PDBNO(I),CHAINID(I),PDBSEQ(I),SECSTR(I), + COLS(I),BP1(I),BP2(I),SHEETLABEL(I),ACC(I), + NOCC(I),VAR(I),CTEMP IF (PDBNO(I).EQ.0) LINE(7:11)=' ' CALL STRPOS(LINE,IBEG,IEND) WRITE(KOUT,'(A)')LINE(1:IEND) ENDDO IALIGN=IALIGN+NBLOCKSIZE+JUMP ISTOP=IALIGN+NBLOCKSIZE IF (ISTOP.GT.NALIGN_FILTER) ISTOP=NALIGN_FILTER IF (IBLOCK.EQ.NBLOCK) THEN WRITE(KOUT,'(A)')'## SEQUENCE PROFILE AND ENTROPY' WRITE(KOUT,'(1X,A,20(3X,A1),A,A,A,A,A,A)')'SeqNo PDBNo', + (profileseq(I:I),I=1,maxaa),' NOCC',' NDEL', + ' NINS',' ENTROPY',' RELENT',' WEIGHT' ENDIF ENDDO 200 FORMAT(2X,2(I4,1X),A1,1X,A1,2X,A1,1X,A7,2(I4),A1,2(I4,1X),I4,2X,A) C======================================================================= C write SEQUENCE PROFILE-block C======================================================================= DO I=1,NRES LINE=' ' WRITE(LINE,300)I,PDBNO(I),CHAINID(I), + (SEQPROF(I,K),K=1,MAXAA),NOCC(I),NDEL(I),NINS(I), + ENTROPY(I),RELENT(I),CONSWEIGHT(I) IF (PDBNO(I).EQ.0) LINE(7:11)=' ' CALL STRPOS(LINE,IBEG,IEND) WRITE(KOUT,'(A)')LINE(1:IEND) ENDDO 300 FORMAT (2(1X,I4),1X,A1,20(I4),1X,3(1X,I4),1X,F7.3,3X,I4,2X,F4.2) C======================================================================= C write insertion block C======================================================================= LINSERTION=.FALSE. IINS=1 DO WHILE (.NOT. LINSERTION .AND. IINS .LE. INSNUMBER) IF ( EXCLUDEFLAG (INSALI(IINS)) .EQ.' ')LINSERTION=.TRUE. IINS=IINS+1 ENDDO IF ( LINSERTION ) THEN WRITE(KOUT,'(A)')'## INSERTION LIST' WRITE(KOUT,'(A)')' AliNo IPOS JPOS Len Sequence' CTEMPINS=' ' DO IINS=1,INSNUMBER IF ( EXCLUDEFLAG (INSALI(IINS)) .EQ.' ') THEN JPOS=INSPOINTER(IINS) INS_NEW = INS_ORDER( INSALI(IINS) ) IF (INSLEN(IINS)+2 .LE. NBLOCKINS) THEN DO IPOS=1,INSLEN(IINS)+2 CTEMPINS(IPOS:IPOS)=INSBUFFER(JPOS) JPOS=JPOS+1 ENDDO WRITE(KOUT,'(4(I6),1X,A)')INS_NEW,INSBEG_1(IINS), + INSBEG_2(IINS),INSLEN(IINS),CTEMPINS(1:INSLEN(IINS)+2) ELSE DO IPOS=1,NBLOCKINS CTEMPINS(IPOS:IPOS)=INSBUFFER(JPOS) JPOS=JPOS+1 ENDDO WRITE(KOUT,'(4(I6),1X,A)')INS_NEW,INSBEG_1(IINS), + INSBEG_2(IINS),INSLEN(IINS),CTEMPINS(1:NBLOCKINS) IBEG=NBLOCKINS+1 DO WHILE (IBEG .LE. INSLEN(IINS)+2 ) IEND=MIN(IBEG+NBLOCKINS-1,INSLEN(IINS)+2 ) IPOS=0 DO J=IBEG,IEND IPOS=IPOS+1 CTEMPINS(IPOS:IPOS)=INSBUFFER(JPOS) JPOS=JPOS+1 ENDDO WRITE(KOUT,'(A,19X,A)')' +',CTEMPINS(1:IPOS) IBEG=IBEG+NBLOCKINS ENDDO ENDIF ENDIF ENDDO ENDIF C write last line ('//') WRITE(KOUT,'(A)')'//' CLOSE(KOUT) RETURN END C END WRITE_HSSP C...................................................................... C...................................................................... C SUB WRITE_KLEIN SUBROUTINE WRITE_KLEIN(KOUT,SEQ,NBLOCKS,NAME,INFILE,OUTFILE, 1 HEADERLINE,SEQSTART,SEQSTOP,ERROR) IMPLICIT NONE C IMPORT INTEGER KOUT INTEGER NBLOCKS INTEGER SEQSTART,SEQSTOP CHARACTER*(*) SEQ CHARACTER*(*) HEADERLINE, NAME, INFILE,OUTFILE C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE PARAMETER (BLOCKSIZE= 10) INTEGER ISTART, ISTOP INTEGER BEGIN, END, LENGTH CHARACTER*(250) OUTLINE LOGICAL NOCHAINBREAKS ERROR = .FALSE. C TRY TO OPEN OUTFILE; RETURN IF UNSUCCESSFUL CALL OPEN_FILE(KOUT,OUTFILE,'new',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN NOCHAINBREAKS = .FALSE. LENGTH = SEQSTOP-SEQSTART+1 C BEGIN AND END CALL STRPOS(INFILE,ISTART,ISTOP) WRITE(OUTLINE,'(A,A,1X,A,I4,1X,A,I4)') 1 '; ',INFILE(MAX(ISTART,1):MAX(1,ISTOP)), 2 'from: ',seqstart,'to: ',seqstop CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(MAX(ISTART,1):MAX(1,ISTOP)) C headerline is a comment line: marked by ';' CALL STRPOS(HEADERLINE,ISTART,ISTOP) OUTLINE = '; ' // HEADERLINE(MAX(ISTART,1):MAX(1,ISTOP)) WRITE(KOUT,'(A)') OUTLINE(1:ISTOP+2) OUTLINE = ' ' C make up standard klein headerline C 1GD1 1339 CALL STRPOS(NAME,ISTART,ISTOP) WRITE(OUTLINE,'(1X,A,10X,I4)') 1 NAME(MAX(ISTART,1):MIN(MAX(ISTOP,1),6)),LENGTH CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) C write sequence BEGIN = SEQSTART C "repeat until" 1 CONTINUE C writeseqline returns end CALL WRITESEQLINE(SEQ,BEGIN,BLOCKSIZE,NBLOCKS,SEQSTOP, 1 NOCHAINBREAKS,OUTLINE,END,ERROR) IF ( ERROR ) STOP CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) BEGIN = END + 1 IF ( BEGIN .LE. SEQSTOP ) GOTO 1 C END "REPEAT UNTIL" CLOSE(KOUT) RETURN END C END WRITE_KLEIN C...................................................................... C...................................................................... C SUB WRITE_MSF SUBROUTINE WRITE_MSF(KUNIT,INFILE,OUTFILE,MAXALIGNS,MAXRES, 1 MAXCORE,MAXINS,MAXINSBUF,BEGIN,END,NBLOCKS,ALISEQ, 2 ALIPOINTER,IFIR,ILAS,TYPE,SEQNAMES,WEIGHT,SEQCHECK, 3 MSFCHECK,ALILEN,NSEQ,INSNUMBER,INSALI,INSPOINTER, 4 INSLEN,INSBEG_1,INSBUFFER,LDOEXP,ERROR) IMPLICIT NONE C 3.6.93 insertion lists C 4.11.93 C Import INTEGER MAXALIGNS, MAXRES, MAXCORE, MAXINS, MAXINSBUF INTEGER KUNIT, BEGIN, END, NBLOCKS, NSEQ INTEGER ALIPOINTER(MAXALIGNS) INTEGER ALILEN INTEGER IFIR(MAXALIGNS), ILAS(MAXALIGNS) INTEGER INSNUMBER,INSALI(MAXINS),INSPOINTER(MAXINS) INTEGER INSLEN(MAXINS),INSBEG_1(MAXINS) CHARACTER*(*) INFILE, OUTFILE C 'P' = PROTEIN SEQUENCES, 'N' = NUCLEOTIDE SEQ CHARACTER*1 TYPE CHARACTER*(*) SEQNAMES(MAXALIGNS) CHARACTER ALISEQ(MAXCORE) CHARACTER INSBUFFER(MAXINSBUF) REAL WEIGHT(MAXALIGNS) LOGICAL LDOEXP C EXPORT INTEGER MSFCHECK INTEGER SEQCHECK(MAXALIGNS) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE INTEGER CODELEN INTEGER MAXALIGNS_LOC,MAXRES_LOC INTEGER LINELEN PARAMETER (BLOCKSIZE= 10) PARAMETER (MAXALIGNS_LOC= 8765) C PARAMETER (MAXALIGNS_LOC= 19999) PARAMETER (MAXRES_LOC= 9999) C PARAMETER (MAXRES_LOC= 10000) C PARAMETER (MAXRES_LOC= 30011) PARAMETER (LINELEN= 200) INTEGER*2 MAXLEN(MAXRES_LOC) INTEGER*2 INSLIST_POINTER(MAXRES_LOC) INTEGER*2 TOTALINSLEN(MAXRES_LOC) INTEGER POS1, POS2 INTEGER I, J, K, KK, IPOS, JPOS, ISEQ, IINS INTEGER LASTPOS, IAP INTEGER ISTART, ISTOP, I1START, I1STOP, I2START, I2STOP INTEGER ILEN, THISWIDTH INTEGER EFFECTIVE_BEGIN,EFFECTIVE_END INTEGER LENGTH(MAXALIGNS_LOC), IOUTPOS, NSPACES INTEGER LAST_INSERTION(MAXALIGNS_LOC) INTEGER NTRANS_INS(MAXALIGNS_LOC) C INTEGER INSPOS(MAXALIGNS_LOC) INTEGER LASTLEN(MAXALIGNS_LOC) INTEGER*2 IAPS(MAXRES_LOC) CHARACTER*1 C CHARACTER*1 CGAPCHAR CHARACTER*8 TIMESTRING CHARACTER*9 DATESTRING CHARACTER*64 DATE_TIME CHARACTER*(LINELEN) LINE CHARACTER*(MAXRES_LOC) STRAND LOGICAL NOCHAINBREAKS, NO_INS_HERE LOGICAL PARTIAL_INSERTION(MAXALIGNS_LOC) C REFORMAT OF: *.FRAG C C Nfi.Msf MSF: 594 Type: P February 17, 1992 14:37 Check: 1709 .. C C Name: Cnfi02 Len: 594 Check: 7754 Weight: 1.00 C Name: Cnfi03 Len: 594 Check: 4932 Weight: 1.00 C C// C C 1 50 CCnfi02 MMYSPICLTQ DEFHPFIEAL LPHVRAIAYT WFNLQARKRK YFKKHEKRMS CCnfi03 MMYSPICLTQ DEFHPFIEAL LPHVRAIAYT WFNLQARKRK YFKKHEKRMS ERROR = .FALSE. IINS=0 C try to open outfile; return if unsuccessful CALL OPEN_FILE(KUNIT,OUTFILE,'new,recl=200',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN IF ( NSEQ .GT. MAXALIGNS .OR. 1 NSEQ .GT. MAXALIGNS_LOC ) THEN WRITE(6,'(1X,A)') + 'ERROR: MAXALIGNS overflow in write_msf !' ERROR = .TRUE. RETURN ENDIF IF ( ALILEN .GT. MAXRES .OR. 1 ALILEN .GT. MAXRES_LOC ) THEN WRITE(6,'(1X,A)') + 'ERROR: MAXRES overflow in write_msf !' ERROR = .TRUE. RETURN ENDIF CGAPCHAR = '.' NOCHAINBREAKS = .FALSE. CODELEN = 1 DO I=1,NSEQ CALL STRPOS(SEQNAMES(I),ISTART,ISTOP) IF (ISTOP .GT. CODELEN)CODELEN=ISTOP+2 ENDDO IF (CODELEN .GT. LEN(SEQNAMES(1)) )CODELEN=LEN(SEQNAMES(1)) IF ( LDOEXP ) THEN CALL PREPARE_INSERTIONS(MAXRES,MAXALIGNS, 1 ALILEN,NSEQ, 2 IFIR,ILAS,INSNUMBER,INSALI,INSLEN, 3 INSBEG_1,MAXLEN,INSLIST_POINTER, 4 TOTALINSLEN,ERROR) ELSE CALL INIT_INT2_ARRAY(1,ALILEN,MAXLEN,0) CALL INIT_INT2_ARRAY(1,ALILEN,TOTALINSLEN,0) CALL INIT_INT2_ARRAY(1,NSEQ,INSLIST_POINTER,0) ENDIF EFFECTIVE_BEGIN = BEGIN + TOTALINSLEN(BEGIN) EFFECTIVE_END = END + TOTALINSLEN(END)-TOTALINSLEN(BEGIN) CALL STRPOS(INFILE,ISTART,ISTOP) WRITE(LINE,'(1X,A,A,1X,A,I4,1X,A,I4)') 'MSF of: ', 1 INFILE(ISTART:ISTOP),'from: ',begin, 2 'to: ',effective_end CALL STRPOS(LINE,ISTART,ISTOP) WRITE(KUNIT,'(A)') LINE(ISTART:ISTOP) C calculate single sequence checksums DO ISEQ = 1, NSEQ C j counts positions in "strand"; i counts alignment positions J = 0 DO I = 1,IFIR(ISEQ)-1 C .. no need to check whether this insertion belongs to the current seq C .. - this is impossible inside the region of the n-terminal gap DO K = 1,MAXLEN(I) J = J + 1 STRAND(J:J) = CGAPCHAR ENDDO J = J + 1 STRAND(J:J) = CGAPCHAR ENDDO IPOS = ALIPOINTER(ISEQ) DO I = IFIR(ISEQ),ILAS(ISEQ) J = J + 1 C = ALISEQ(IPOS) IF ( LDOEXP ) CALL LOWTOUP(C,1) STRAND(J:J) = C IF ( ( MAXLEN(I) .GT. 0 ) .AND. 1 ( INSLIST_POINTER(ISEQ) .NE. 0 ) ) THEN IINS = INSLIST_POINTER(ISEQ) DO WHILE ( INSBEG_1(IINS) .NE. I .AND. 1 INSALI(IINS) .EQ. ISEQ ) IINS = IINS + 1 ENDDO IF ( INSALI(IINS) .NE. ISEQ ) THEN NO_INS_HERE = .TRUE. ELSE NO_INS_HERE = .FALSE. ENDIF ELSE NO_INS_HERE = .TRUE. ENDIF IF ( .NOT. NO_INS_HERE ) THEN C WRITE(6,'(1x,3(i4,1x))') iseq, insali(iins), iins C .. this insertion belongs to current sequence - copy missing symbols C .. from INSBUFFER KK = INSPOINTER(IINS) C .. insertions are stored as lowercaseINSERTIONlowercase; with C .. INSPOINTER pointing to the leading "lowercase". this symbol C .. ( also in lowercase ) preceeds the insertion in aliseq; C .. this can be used as a check. C = INSBUFFER(KK) IF ( C .NE. ALISEQ(IPOS) ) THEN IF ( ALISEQ(IPOS) .NE. CGAPCHAR ) THEN ERROR = .TRUE. STOP 'MIST' ENDIF ENDIF KK = KK + 1 DO K = 1,INSLEN(IINS) J = J + 1 STRAND(J:J) = INSBUFFER(KK) KK = KK + 1 ENDDO DO K = INSLEN(IINS)+1,MAXLEN(I) J = J + 1 STRAND(J:J) = CGAPCHAR ENDDO ELSE C .. this insertion does not belong to current sequence - fill with C gap symbols DO K = 1,MAXLEN(I) J = J + 1 STRAND(J:J) = CGAPCHAR ENDDO ENDIF IPOS = IPOS + 1 ENDDO DO I = ILAS(ISEQ)+1,ALILEN C .. no need to check whether this insertion belongs to the current seq C .. - this is impossible inside the region of the n-terminal gap DO K = 1,MAXLEN(I) J = J + 1 STRAND(J:J) = CGAPCHAR ENDDO J = J + 1 STRAND(J:J) = CGAPCHAR ENDDO CALL CHECKSEQ(STRAND,1,J, 1 SEQCHECK(ISEQ)) ENDDO C calculate total checksum CALL MSFCHECKSEQ(SEQCHECK,NSEQ,MSFCHECK) C Write MSF identification line C get current date CALL GETDATE(DATESTRING) C get current time CALL GETTIME(TIMESTRING) C date + time DATE_TIME = DATESTRING // ' ' // TIMESTRING CALL STRPOS(OUTFILE,I1START,I1STOP) CALL STRPOS(DATE_TIME,I2START,I2STOP) WRITE(KUNIT,'(1X,A,2X,A,1X,I4,2X,A,A,1X,A,2X,A,I5,2X,A)') 1 OUTFILE(I1START:I1STOP),'MSF:', 2 EFFECTIVE_END-EFFECTIVE_BEGIN+1, 3 'Type: ',type,date_time(i2start:i2stop), 4 'Check:',msfcheck,'..' WRITE(KUNIT,'(A)') ' ' WRITE(KUNIT,'(A)') ' ' C Write sequence identification section DO ISEQ = 1,NSEQ WRITE(KUNIT,'(A,A,2X,A,I5,2X,A,I4,2X,A,F5.2)') 1 ' Name: ',seqnames(iseq)(1:codelen),'Len: ', 2 effective_end-effective_begin+1,'Check: ', 3 seqcheck(iseq), 'Weight: ', weight(iseq) C divider ENDDO WRITE(KUNIT,'(A)') ' ' WRITE(KUNIT,'(A)') '//' WRITE(KUNIT,'(A)') ' ' WRITE(KUNIT,'(A)') ' ' C WRITE ALIGNMENT DO ISEQ = 1,NSEQ LENGTH(ISEQ) = EFFECTIVE_BEGIN IAPS(ISEQ) = BEGIN-1 NTRANS_INS(ISEQ) = 0 LASTLEN(ISEQ) = 0 LAST_INSERTION(ISEQ) = 0 PARTIAL_INSERTION(ISEQ) = .FALSE. ENDDO ILEN = 0 DO WHILE ( ILEN .LT. EFFECTIVE_END-EFFECTIVE_BEGIN+1 ) C new block LASTPOS = 1 MIN(ILEN+NBLOCKS*BLOCKSIZE,EFFECTIVE_END-BEGIN+1) THISWIDTH = MIN(NBLOCKS*BLOCKSIZE,LASTPOS-ILEN) C write scale line IF ( MOD(THISWIDTH,BLOCKSIZE) .EQ. 0 ) THEN NSPACES = THISWIDTH / BLOCKSIZE - 1 ELSE NSPACES = THISWIDTH / BLOCKSIZE ENDIF CALL WRITESCALELINE(CODELEN+1,CODELEN+THISWIDTH+NSPACES, 1 ILEN+1,LASTPOS,LINE) WRITE(KUNIT,'(A)') LINE(1:CODELEN+THISWIDTH+NBLOCKS+1) C provide as many symbols in "strand" as writescalline will need to C transfer to next output line C .. steps : C ... - find alignment position x which is greater or equal the C ..... desired end point, INCLUDING INSERTIONS C ... - in case of "greater", there is an insertion crossing the C ..... boundary of the line to be output. C ..... SPLIT this insertion DO ISEQ = 1,NSEQ IOUTPOS = 1 IAP = IAPS(ISEQ) IPOS = LENGTH(ISEQ) IF ( PARTIAL_INSERTION(ISEQ) ) THEN IF ( LAST_INSERTION(ISEQ) .NE. 0 ) THEN JPOS = 1 INSPOINTER(LAST_INSERTION(ISEQ))+NTRANS_INS(ISEQ)+1 DO WHILE ( 1 IPOS .LE. EFFECTIVE_END 2 .AND. 3 IPOS .LE. LENGTH(ISEQ)+NBLOCKS*BLOCKSIZE-1 4 .AND. 5 JPOS .LE. INSPOINTER(LAST_INSERTION(ISEQ)) + 6 INSLEN(LAST_INSERTION(ISEQ)) 7 ) STRAND(IOUTPOS:IOUTPOS) = INSBUFFER(JPOS) IOUTPOS = IOUTPOS + 1 IPOS = IPOS + 1 NTRANS_INS(ISEQ) = NTRANS_INS(ISEQ) + 1 JPOS = JPOS + 1 ENDDO ENDIF DO WHILE ( 1 IPOS .LE. EFFECTIVE_END 2 .AND. 3 IPOS .LE. LENGTH(ISEQ)+NBLOCKS*BLOCKSIZE-1 4 .AND. 5 NTRANS_INS(ISEQ) .LT. LASTLEN(ISEQ) 6 ) STRAND(IOUTPOS:IOUTPOS) = CGAPCHAR IOUTPOS = IOUTPOS + 1 IPOS = IPOS + 1 NTRANS_INS(ISEQ) = NTRANS_INS(ISEQ) + 1 ENDDO IF ( NTRANS_INS(ISEQ) .EQ. LASTLEN(ISEQ) ) THEN PARTIAL_INSERTION(ISEQ) = .FALSE. NTRANS_INS(ISEQ) = 0 ENDIF ENDIF DO WHILE ( 1 IPOS .LE. EFFECTIVE_END .AND. 2 IPOS .LE. LENGTH(ISEQ)+NBLOCKS*BLOCKSIZE-1 3 ) IAP = IAP + 1 IF ( IAP .LT. IFIR(ISEQ) .OR. 1 IAP .GT. ILAS(ISEQ) ) THEN STRAND(IOUTPOS:IOUTPOS) = CGAPCHAR ELSE C = ALISEQ( 1 ALIPOINTER(ISEQ)+IAP- 2 IFIR(ISEQ) 3 ) IF ( LDOEXP ) CALL LOWTOUP(C,1) STRAND(IOUTPOS:IOUTPOS) = C ENDIF IOUTPOS = IOUTPOS + 1 IPOS = IPOS + 1 IF ( ( MAXLEN(IAP) .GT. 0 ) .AND. 1 ( INSLIST_POINTER(ISEQ) .NE. 0 ) ) THEN IINS = INSLIST_POINTER(ISEQ) DO WHILE ( INSBEG_1(IINS) .NE. IAP .AND. 1 INSALI(IINS) .EQ. ISEQ ) IINS = IINS + 1 ENDDO IF ( INSALI(IINS) .NE. ISEQ ) THEN NO_INS_HERE = .TRUE. ELSE NO_INS_HERE = .FALSE. ENDIF ELSE NO_INS_HERE = .TRUE. ENDIF IF ( .NOT. NO_INS_HERE ) THEN JPOS = INSPOINTER(IINS)+1 DO WHILE ( 1 IPOS .LE. EFFECTIVE_END 2 .AND. 3 IPOS .LE. (LENGTH(ISEQ)+NBLOCKS*BLOCKSIZE -1) 4 .AND. 5 JPOS .LE. (INSPOINTER(IINS)+INSLEN(IINS)) 6 ) STRAND(IOUTPOS:IOUTPOS) = INSBUFFER(JPOS) IOUTPOS = IOUTPOS + 1 NTRANS_INS(ISEQ) = NTRANS_INS(ISEQ) + 1 IPOS = IPOS + 1 JPOS = JPOS + 1 ENDDO DO WHILE ( 1 IPOS .LE. EFFECTIVE_END 2 .AND. 3 IPOS .LE. (LENGTH(ISEQ)+NBLOCKS*BLOCKSIZE-1) 4 .AND. 5 NTRANS_INS(ISEQ) .LT. MAXLEN(IAP) 6 ) STRAND(IOUTPOS:IOUTPOS) = CGAPCHAR IPOS = IPOS + 1 IOUTPOS = IOUTPOS + 1 NTRANS_INS(ISEQ) = NTRANS_INS(ISEQ) + 1 ENDDO IF ( NTRANS_INS(ISEQ) .EQ. MAXLEN(IAP) ) THEN PARTIAL_INSERTION(ISEQ) = .FALSE. NTRANS_INS(ISEQ) = 0 ELSE PARTIAL_INSERTION(ISEQ) = .TRUE. LAST_INSERTION(ISEQ) = IINS LASTLEN(ISEQ) = MAXLEN(IAP) ENDIF ELSE DO WHILE ( 1 IPOS .LE. EFFECTIVE_END 2 .AND. 3 IPOS .LE. LENGTH(ISEQ)+NBLOCKS*BLOCKSIZE-1 4 .AND. 5 NTRANS_INS(ISEQ) .LT. MAXLEN(IAP) 6 ) STRAND(IOUTPOS:IOUTPOS) = CGAPCHAR IOUTPOS = IOUTPOS + 1 NTRANS_INS(ISEQ) = NTRANS_INS(ISEQ) + 1 IPOS = IPOS + 1 ENDDO IF ( NTRANS_INS(ISEQ) .EQ. MAXLEN(IAP) ) THEN PARTIAL_INSERTION(ISEQ) = .FALSE. NTRANS_INS(ISEQ) = 0 ELSE PARTIAL_INSERTION(ISEQ) = .TRUE. LAST_INSERTION(ISEQ) = 0 LASTLEN(ISEQ) = MAXLEN(IAP) ENDIF ENDIF ENDDO IOUTPOS = IOUTPOS - 1 POS1 = 1 C writeseqline returns pos2 ( position of last transferred symbol ) CALL WRITESEQLINE(STRAND,POS1,BLOCKSIZE,NBLOCKS,IOUTPOS, 1 NOCHAINBREAKS,LINE,POS2,ERROR) IF ( ERROR ) STOP CALL STRPOS(LINE,ISTART,ISTOP) LENGTH(ISEQ) = LENGTH(ISEQ) + POS2 IAPS(ISEQ) = IAP LINE = SEQNAMES(ISEQ)(1:CODELEN) // LINE(ISTART:ISTOP) CALL STRPOS(LINE,ISTART,ISTOP) WRITE(KUNIT,'(A)') LINE(ISTART:ISTOP) ENDDO WRITE(KUNIT,'(A)') ' ' ILEN = ILEN + NBLOCKS*BLOCKSIZE ENDDO CLOSE(KUNIT) RETURN END C END WRITE_MSF C...................................................................... C...................................................................... C SUB WRITE_PEARSON SUBROUTINE WRITE_PEARSON(KOUT,OUTFILE,SEQ,NBLOCKS,IDENTIFIER, 1 HEADERLINE,SEQSTART,SEQSTOP,ERROR) IMPLICIT NONE C IMPORT INTEGER KOUT INTEGER NBLOCKS INTEGER SEQSTART,SEQSTOP CHARACTER*(*) SEQ CHARACTER*(*) OUTFILE,HEADERLINE,IDENTIFIER C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE PARAMETER (BLOCKSIZE= 10) INTEGER ISTART, ISTOP, JSTART, JSTOP INTEGER BEGIN, END C INTEGER LENGTH CHARACTER*(250) OUTLINE LOGICAL NOCHAINBREAKS ERROR = .FALSE. C TRY TO OPEN OUTFILE; RETURN IF UNSUCCESSFUL CALL OPEN_FILE(KOUT,OUTFILE,'unknown,append',error) C ERROR MESSAGES ARE ALREDY ISSUED BY OPEN_FILE IF ( ERROR ) RETURN NOCHAINBREAKS = .FALSE. C LENGTH = SEQSTOP-SEQSTART+1 C headerline is a comment line: marked by '>' CALL STRPOS(IDENTIFIER,ISTART,ISTOP) CALL STRPOS(HEADERLINE,JSTART,JSTOP) OUTLINE = '>' // IDENTIFIER(MAX(ISTART,1):MAX(1,ISTOP)) // 1 ' , ' // 2 HEADERLINE(MAX(JSTART,1):MAX(1,JSTOP)) CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:ISTOP) C WRITE SEQUENCE BEGIN = SEQSTART C "repeat until" 1 CONTINUE C writeseqline returns end CALL WRITESEQLINE(SEQ,BEGIN,BLOCKSIZE,NBLOCKS,SEQSTOP, 1 NOCHAINBREAKS,OUTLINE,END,ERROR) IF ( ERROR ) STOP IF ( END .EQ. SEQSTOP ) THEN CALL STRPOS(OUTLINE,ISTART,ISTOP) C OUTLINE = OUTLINE(1:MAX(1,ISTOP)) // '*' OUTLINE = OUTLINE(1:MAX(1,ISTOP)) ENDIF CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) BEGIN = END + 1 IF ( BEGIN .LE. SEQSTOP ) GOTO 1 C END "REPEAT UNTIL" CLOSE(KOUT) RETURN END C END WRITE_PEARSON C...................................................................... C...................................................................... C SUB WRITE_PHYLIP SUBROUTINE WRITE_PHYLIP(KOUT,MAXALIGNS,MAXCORE,BEGIN, 1 END,NBLOCKS,ALISEQ,ALIPOINTER, 2 IFIR,ILAS,SEQNAMES,NSEQ,ERROR) IMPLICIT NONE C This routine writes an "sequential" phylip format, i.e. one sequence C from begin to end, then next one C IMPORT INTEGER KOUT INTEGER MAXALIGNS, MAXCORE INTEGER BEGIN, END, NBLOCKS INTEGER NSEQ INTEGER ALIPOINTER(MAXALIGNS) INTEGER IFIR(MAXALIGNS), ILAS(MAXALIGNS) CHARACTER ALISEQ(MAXCORE) CHARACTER*(*) SEQNAMES(MAXALIGNS) C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE INTEGER MAXRES_LOC INTEGER CODELEN_LOC INTEGER LINELEN PARAMETER (BLOCKSIZE= 10) PARAMETER (LINELEN= 250) PARAMETER (MAXRES_LOC= 9999) C PARAMETER (MAXRES_LOC= 10000) C PARAMETER (MAXRES_LOC= 30011) PARAMETER (CODELEN_LOC= 9) INTEGER IPOS,POS1,POS2,ISEQ,IOUTPOS INTEGER ISTART,ISTOP,ACTNBLOCKS C INTEGER I1START,I1STOP CHARACTER CGAPCHAR CHARACTER*(MAXRES_LOC) STRAND CHARACTER*(LINELEN) OUTLINE, TMPSTRING LOGICAL NOCHAINBREAKS CGAPCHAR = '-' NOCHAINBREAKS = .FALSE. IF ( LEN(SEQNAMES(1)) .LT. CODELEN_LOC ) THEN ERROR = .TRUE. WRITE(6,'(A)') ' CODELEN TOO SHORT IN WRITE_PHYLIP !' RETURN ENDIF C PHYLIP headerline: " nseq end-begin+1" WRITE(OUTLINE,'(I4,1X,I4)') NSEQ, END-BEGIN+1 CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:ISTOP) C Write alignment C provide one whole sequence in "strand" for w_scaleline to C transfer it DO ISEQ = 1,NSEQ IOUTPOS = 1 DO IPOS = BEGIN,END IF ( IPOS .LT. IFIR(ISEQ) .OR. 1 IPOS .GT. ILAS(ISEQ) ) THEN STRAND(IOUTPOS:IOUTPOS) = CGAPCHAR ELSE STRAND(IOUTPOS:IOUTPOS) = ALISEQ( 1 ALIPOINTER(ISEQ)+IPOS-IFIR(ISEQ) 2 ) ENDIF IOUTPOS = IOUTPOS + 1 ENDDO DO IPOS=1,IOUTPOS-1 IF ( STRAND(IPOS:IPOS) .EQ. '.') THEN STRAND(IPOS:IPOS)= CGAPCHAR ENDIF ENDDO C CALL CHARARRAYREPL(STRAND,IOUTPOS-1,'.',CGAPCHAR) POS1 = BEGIN C "REPEAT UNTIL" 1 CONTINUE IF ( POS1 .EQ. BEGIN ) THEN ACTNBLOCKS = NBLOCKS - 1 ELSE ACTNBLOCKS = NBLOCKS ENDIF C writeseqline returns pos2 CALL WRITESEQLINE(STRAND,POS1,BLOCKSIZE,ACTNBLOCKS,END, 1 NOCHAINBREAKS,TMPSTRING,POS2,ERROR) IF ( ERROR ) STOP CALL STRPOS(TMPSTRING,ISTART,ISTOP) IF ( POS1 .EQ. BEGIN ) THEN C SEQUENCE NAME APPEARS ONLY ONCE ( IN FIRST LINE ) OUTLINE = SEQNAMES(ISEQ)(1:CODELEN_LOC) // ' ' // 1 TMPSTRING(ISTART:ISTOP) ELSE OUTLINE = TMPSTRING(ISTART:ISTOP) ENDIF CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:ISTOP) POS1 = POS2 + 1 IF ( POS1 .LT. END ) GOTO 1 C END "REPEAT UNTIL" ENDDO RETURN END C END WRITE_PHYLIP C...................................................................... C...................................................................... C SUB WRITE_PIR SUBROUTINE WRITE_PIR(KOUT,SEQ,INFILE,OUTFILE,ACCESSION, 1 IDENTIFIER,NSYMBOLS,SEQSTART,SEQSTOP,ERROR) IMPLICIT NONE C IMPORT INTEGER KOUT INTEGER NSYMBOLS INTEGER SEQSTART, SEQSTOP CHARACTER*(*) SEQ CHARACTER*(*) INFILE, OUTFILE,ACCESSION, IDENTIFIER C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER ISTART, ISTOP INTEGER I1START, I1STOP, I2START, I2STOP, I3START, I3STOP INTEGER BEGIN, END C INTEGER LENGTH INTEGER MAX_LINE_LEN PARAMETER (MAX_LINE_LEN= 1000) CHARACTER*(MAX_LINE_LEN) OUTLINE LOGICAL NOCHAINBREAKS ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KOUT,OUTFILE,'unknown,append',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN NOCHAINBREAKS = .FALSE. c length = seqstop-seqstart+1 CALL STRPOS(ACCESSION,ISTART,ISTOP) IF (ISTART.LT.1 .OR. ISTOP.LT.1 .OR. (ISTOP-ISTART).GT.10) THEN ACCESSION=' ' END IF WRITE(OUTLINE,'(A,A)') 1 '>P1; ',ACCESSION(MAX(1,ISTART):MAX(1,ISTOP)) C WRITE(OUTLINE,'(A,A)') C 1 '>',ACCESSION(MAX(1,ISTART):MAX(1,ISTOP)) CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) c call strpos(infile,i1start,i1stop) c call strpos(outfile,i2start,i2stop) c write(outline,'(a,1x,a,1x,a,1x,a,i4,1x,a,i4,1x,a)') c 1 outfile(i2start:i2stop),'(', infile(i1start:i1stop), c 2 'from: ',seqstart,'to: ', seqstop,')' CALL STRPOS(IDENTIFIER,ISTART,ISTOP) ISTOP=MIN(ISTOP,MAX_LINE_LEN) WRITE(OUTLINE,'(A)') 1 IDENTIFIER(MAX(1,ISTART):MAX(1,ISTOP)) CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) BEGIN = SEQSTART C "REPEAT UNTIL" 1 CONTINUE C WRITESEQLINE RETURNS END CALL WRITESEQLINE(SEQ,BEGIN,1,NSYMBOLS,SEQSTOP, 1 NOCHAINBREAKS,OUTLINE,END,ERROR) IF ( ERROR ) STOP IF ( END .EQ. SEQSTOP ) THEN CALL STRPOS(OUTLINE,ISTART,ISTOP) OUTLINE = OUTLINE(1:MAX(1,ISTOP)) // ' *' ENDIF CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) BEGIN = END + 1 IF ( BEGIN .LE. SEQSTOP ) GOTO 1 C END "REPEAT UNTIL" CLOSE(KOUT) RETURN END C END WRITE_PIR C...................................................................... C...................................................................... C SUB WRITE_STAR SUBROUTINE WRITE_STAR(KOUT,SEQ,NBLOCKS,INFILE,OUTFILE, 1 HEADERLINE,SEQSTART,SEQSTOP,ERROR) IMPLICIT NONE C IMPORT INTEGER KOUT INTEGER NBLOCKS INTEGER SEQSTART,SEQSTOP CHARACTER*(*) SEQ CHARACTER*(*) INFILE,OUTFILE,HEADERLINE C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE PARAMETER (BLOCKSIZE= 10) INTEGER ISTART, ISTOP INTEGER BEGIN, END C INTEGER LENGTH CHARACTER*(250) OUTLINE LOGICAL NOCHAINBREAKS ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KOUT,OUTFILE,'NEW',ERROR) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN NOCHAINBREAKS = .FALSE. c length = seqstop-seqstart+1 C begin and end CALL STRPOS(INFILE,ISTART,ISTOP) WRITE(OUTLINE,'(A,A,1X,A,I4,1X,A,I4)') 1 '* ',infile(istart:istop),'from: ', 2 seqstart,'to: ',seqstop CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(MAX(ISTART,1):MAX(1,ISTOP)) C headerline is a comment line: marked by '*' CALL STRPOS(HEADERLINE,ISTART,ISTOP) OUTLINE = '* ' // HEADERLINE(MAX(ISTART,1):MAX(1,ISTOP)) WRITE(KOUT,'(A)') OUTLINE(1:ISTOP+2) C write sequence BEGIN = SEQSTART C "repeat until" 1 CONTINUE C writeseqline returns end CALL WRITESEQLINE(SEQ,BEGIN,BLOCKSIZE,NBLOCKS,SEQSTOP, 1 NOCHAINBREAKS,OUTLINE,END,ERROR) IF ( ERROR ) STOP CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) BEGIN = END + 1 IF ( BEGIN .LE. SEQSTOP ) GOTO 1 C END "REPEAT UNTIL" CLOSE(KOUT) RETURN END C END WRITE_STAR C...................................................................... C...................................................................... C SUB WRITELINES SUBROUTINE WRITELINES(CSTRING) C if 'cstring' contains '/n' (new line) this routine writes cstring C line by line on screen; called by GETINT,GETREAL..... CHARACTER*(*) CSTRING INTEGER ICUTBEGIN(30),ICUTEND(30) CALL STRPOS(CSTRING,ISTART,ISTOP) ILINE=1 ICUTBEGIN(ILINE)=1 ICUTEND(ILINE)=ISTOP DO I=1,ISTOP-1 IF (CSTRING(I:I+1).EQ.'/n') THEN ILINE=ILINE+1 ICUTBEGIN(ILINE)=I+2 ICUTEND(ILINE-1)=I-1 ICUTEND(ILINE)=ISTOP ENDIF ENDDO DO I=1,ILINE WRITE(6,*)CSTRING(ICUTBEGIN(I):ICUTEND(I)) ENDDO RETURN END C END WRITELINES C...................................................................... C...................................................................... C SUB WRITEPROFILE SUBROUTINE WRITEPROFILE(KPROF,PROFILENAME,MAXRES, + NRES,NCHAIN,HSSPID,HEADER,COMPOUND,SOURCE,AUTHOR, + SMIN,SMAX,MAPLOW,MAPHIGH,METRICFILE, + PDBNO,CHAINID,SEQ,STRUC,ACC,COLS,SHEETLABEL, + BP1,BP2,NOCC,GAPOPEN,GAPELONG,CONSWEIGHT, + PROFILEMETRIC,MAXBOX,NBOX,PROFILEBOX,LDSSP) IMPLICIT NONE INTEGER nacid PARAMETER (nacid=20) INTEGER kprof,maxres,nres,acc(*),bp1(*),bp2(*),nocc(*) INTEGER NCHAIN,pdbno(*) INTEGER MAXBOX,NBOX,PROFILEBOX(MAXBOX,2) REAL profilemetric(maxres,*),gapopen(*),gapelong(*) REAL consweight(*) REAL smin,smax,maplow,maphigh CHARACTER*(*) hsspid,header,compound,source,author,metricfile CHARACTER*(*) profilename,seq(*),struc(*) CHARACTER*(*) chainid(*) character*7 cols(*) character sheetlabel(*) LOGICAL LDSSP C internal CHARACTER*500 line INTEGER ilen,i,j,ibox,istart,istop LOGICAL lerror C====================================================================== CALL OPEN_FILE(KPROF,PROFILENAME,'NEW,RECL=350',LERROR) IF (LDSSP) THEN WRITE(KPROF,'(A)') + '****** MAXHOM-PROFILE WITH SECONDARY-STRUCTURE V1.0 ******' ELSE WRITE(KPROF,'(A)')'****** MAXHOM-PROFILE V1.0 ******' ENDIF WRITE(KPROF,'(A)')'# ' CALL STRPOS(HSSPID,I,J) WRITE(KPROF,'(A,A)') 'ID : ',HSSPID(I:J) CALL STRPOS(HEADER,I,J) IF (I .GT. 0 .AND. J .GT. 0) THEN WRITE(KPROF,'(A,A)') 'HEADER : ',HEADER(I:J) ELSE WRITE(KPROF,'(A)') 'HEADER : ' ENDIF CALL STRPOS(COMPOUND,I,J) IF (I .GT. 0 .AND. J .GT. 0) THEN WRITE(KPROF,'(A,A)') 'COMPOUND : ',COMPOUND(I:J) ELSE WRITE(KPROF,'(A)') 'COMPOUND : ' ENDIF CALL STRPOS(SOURCE,I,J) IF (I .GT. 0 .AND. J .GT. 0) THEN WRITE(KPROF,'(A,A)') 'SOURCE : ',SOURCE(I:J) ELSE WRITE(KPROF,'(A)') 'SOURCE : ' ENDIF CALL STRPOS(AUTHOR,I,J) IF (I .GT. 0 .AND. J .GT. 0) THEN WRITE(KPROF,'(A,A)') 'AUTHOR : ',AUTHOR(I:J) ELSE WRITE(KPROF,'(A)') 'AUTHOR : ' ENDIF WRITE(KPROF,'(A,I4)') 'NRES : ',NRES WRITE(KPROF,'(A,I4)') 'NCHAIN : ',NCHAIN WRITE(KPROF,'(A,F7.2)')'SMIN : ',SMIN WRITE(KPROF,'(A,F7.2)')'SMAX : ',SMAX WRITE(KPROF,'(A,F7.2)')'MAPLOW : ',MAPLOW WRITE(KPROF,'(A,F7.2)')'MAPHIGH : ',MAPHIGH CALL STRPOS(METRICFILE,I,J) IF (I .GT. 0 .AND. J .GT. 0) THEN WRITE(KPROF,'(A,A)')'METRIC : ',METRICFILE(I:J) ELSE WRITE(KPROF,'(A)')'METRIC : ' ENDIF IF (NBOX.GT.1) THEN WRITE(KPROF,'(A,I6)')'NBOX : ',NBOX DO IBOX=1,NBOX WRITE(KPROF,'(A,I4,A,I4,A,I4)')'BOX',IBOX,' : ', + PROFILEBOX(IBOX,1),'-',PROFILEBOX(IBOX,2) ENDDO ENDIF write(kprof,'(a)')'#========================================='// + '======================================================='// + '======================================================='// + '===================================================' CSeqNo PDBNo AA STRUCTURE BP1 BP2 ACC NOCC open elong V L ... C line=' SeqNo PDBNo AA STRUCTURE BP1 BP2 ACC NOCC '// + 'OPEN ELONG WEIGHT '// + 'V L I M F W Y '// + 'G A P S T C H '// + 'R K Q E N D' CALL STRPOS(LINE,ISTART,ISTOP) WRITE(KPROF,'(A)')LINE(:ISTOP) DO I=1,NRES IF (I.GT.MAXRES) THEN WRITE(6,*)' *** ERROR IN WRITEPROFILE: NRES.GT.MAXRES' STOP ENDIF IF (STRUC(I).EQ.'U')STRUC(I)=' ' WRITE(LINE,100)I,PDBNO(I),CHAINID(I),SEQ(I),STRUC(I), + COLS(I),BP1(I),BP2(I),SHEETLABEL(I),ACC(I),NOCC(I) IF (PDBNO(I).EQ.0)LINE(7:11)=' ' CALL STRPOS(LINE,ISTART,ISTOP) WRITE(LINE(ISTOP+1:),'(2(F6.2),F7.2,20(F8.3))') + GAPOPEN(I),GAPELONG(I),CONSWEIGHT(I), + (PROFILEMETRIC(I,J),J=1,NACID) CALL STRPOS(LINE,ISTART,ISTOP) WRITE(KPROF,'(A)')LINE(:ISTOP) 100 FORMAT(2X,2(I4,1X),A1,1X,A1,2X,A1,1X,A7,2(I4),A1,2(I4,1X)) ENDDO WRITE(KPROF,'(A)')'//' CLOSE(KPROF) RETURN END C END WRITEPROFILE C...................................................................... C...................................................................... C SUB WRITESCALELINE SUBROUTINE WRITESCALELINE(ISTART,ISTOP,LABEL1,LABEL2,OUTLINE) IMPLICIT NONE C 4.11.93 C ISTART: POSITION AFTER WHICH TO PLACE LABEL1 C ISTOP : POSITION AT WHICH LABEL2 SHOULD END C IMPORT INTEGER ISTART,ISTOP,LABEL1,LABEL2 C EXPORT CHARACTER*(*) OUTLINE C INTERNAL INTEGER LABELLEN PARAMETER (LABELLEN= 4) CHARACTER*16 FORM CHARACTER*(LABELLEN) CTMP *----------------------------------------------------------------------* C PREPARE LABEL OUTPUT FORMAT CTMP=' ' WRITE(CTMP,'(I2)') LABELLEN CALL LEFTADJUST(CTMP,1,LABELLEN) FORM = '( I4' // ')' C BUILD UP OUTLINE OUTLINE = ' ' WRITE(CTMP,FORM) LABEL1 CALL LEFTADJUST(CTMP,1,LABELLEN) OUTLINE = OUTLINE(1:ISTART-1) // CTMP WRITE(CTMP,FORM) LABEL2 CALL RIGHTADJUST(CTMP,1,LABELLEN) OUTLINE = OUTLINE(1:ISTOP-LABELLEN) // CTMP RETURN END C END WRITESCALELINE C...................................................................... C...................................................................... C SUB WRITESEQLINE SUBROUTINE WRITESEQLINE(SEQ,ISTART,BLOCKSIZE,NBLOCKS,NRES, 1 NOCHAINBREAKS,OUTLINE,ISTOP,ERROR) IMPLICIT NONE C 4.11.93 C CCCCCCCCCC CCCCCCCCCC CCCCCCCCCC CCCCCCCCCC C ^ ^ ^ C istart: blocksize: istop: C first 10 here last C seq.pos. seq.pos. C to be transferred C transferred C C C nblocks: 4 here C line starts with 1 blank C istart is given, istop is returned ( if ( nochainbreaks ) maybe C some symbols are not transferred )) C C IMPORT INTEGER ISTART, ISTOP INTEGER BLOCKSIZE INTEGER NBLOCKS, NRES CHARACTER*(*) SEQ LOGICAL NOCHAINBREAKS C EXPORT CHARACTER*(*) OUTLINE LOGICAL ERROR C INTERNAL INTEGER ISEQPOS, ILINEPOS,IBLOCKPOS,IBLOCK ERROR = .FALSE. OUTLINE = ' ' ILINEPOS = 1 IBLOCKPOS = 0 IBLOCK = 1 ISEQPOS = ISTART - 1 DO WHILE ( ILINEPOS .LT. NBLOCKS*BLOCKSIZE+NBLOCKS .AND. 1 ISEQPOS .LT. NRES ) ISEQPOS = ISEQPOS + 1 IF ( IBLOCK .LT. NBLOCKS .AND. 1 IBLOCKPOS .EQ. BLOCKSIZE ) THEN IBLOCKPOS = 0 IBLOCK = IBLOCK + 1 ILINEPOS = ILINEPOS + 1 OUTLINE(ILINEPOS:ILINEPOS) = ' ' ENDIF IF ( .NOT. NOCHAINBREAKS .OR. 1 ( NOCHAINBREAKS .AND. SEQ(ISEQPOS:ISEQPOS) .NE. '!' ) 2 ) THEN ILINEPOS = ILINEPOS + 1 IBLOCKPOS = IBLOCKPOS + 1 OUTLINE(ILINEPOS:ILINEPOS) = SEQ(ISEQPOS:ISEQPOS) ENDIF ENDDO ISTOP = ISEQPOS RETURN END C END WRITESEQLINE C...................................................................... C...................................................................... C SUB U3B SUBROUTINE U3B(W,X,Y,N,MODE,RMS,U,T,IER) C this version copied July 1986. DO NOT REDISTRIBUTE. C If you want this routine, ask Wolfgang Kabsch C**** CALCULATES A BEST ROTATION & TRANSLATION BETWEEN TWO VECTOR SETS C**** SUCH THAT U*X+T IS THE CLOSEST APPROXIMATION TO Y. C**** THE CALCULATED BEST SUPERPOSITION MAY NOT BE UNIQUE AS INDICATED C**** BY A RESULT VALUE IER=-1. HOWEVER IT IS GARANTIED THAT WITHIN C**** NUMERICAL TOLERANCES NO OTHER SUPERPOSITION EXISTS GIVING A C**** SMALLER VALUE FOR RMS. C**** THIS VERSION OF THE ALGORITHM IS OPTIMIZED FOR THREE-DIMENSIONAL C**** REAL VECTOR SPACE. C**** USE OF THIS ROUTINE IS RESTRICTED TO NON-PROFIT ACADEMIC C**** APPLICATIONS. C**** PLEASE REPORT ERRORS TO C**** PROGRAMMER: W.KABSCH MAX-PLANCK-INSTITUTE FOR MEDICAL RESEARCH C JAHNSTRASSE 29, 6900 HEIDELBERG, FRG. C**** REFERENCES: W.KABSCH ACTA CRYST.(1978).A34,827-828 C W.KABSCH ACTA CRYST.(1976).A32,922-923 C C W - W(M) IS WEIGHT FOR ATOM PAIR # M (GIVEN) C X - X(I,M) ARE COORDINATES OF ATOM # M IN SET X (GIVEN) C Y - Y(I,M) ARE COORDINATES OF ATOM # M IN SET Y (GIVEN) C N - N IS NUMBER OF ATOM PAIRS (GIVEN) C MODE - 0:CALCULATE RMS ONLY (GIVEN) C 1:CALCULATE RMS,U,T (TAKES LONGER) C RMS - SUM OF W*(UX+T-Y)**2 OVER ALL ATOM PAIRS (RESULT) C U - U(I,J) IS ROTATION MATRIX FOR BEST SUPERPOSITION (RESULT) C T - T(I) IS TRANSLATION VECTOR FOR BEST SUPERPOSITION (RESULT) C IER - 0: A UNIQUE OPTIMAL SUPERPOSITION HAS BEEN DETERMINED(RESULT) C -1: SUPERPOSITION IS NOT UNIQUE BUT OPTIMAL C -2: NO RESULT OBTAINED BECAUSE OF NEGATIVE WEIGHTS W C OR ALL WEIGHTS EQUAL TO ZERO. C C----------------------------------------------------------------------- INTEGER IP(9),IP2312(4),I,J,K,L,M1,M,IER,N,MODE REAL W(*),X(3,*),Y(3,*),U(3,*),T(*),RMS,SIGMA c REAL*16 R(3,3),XC(3),YC(3),WC,A(3,3),B(3,3),E0, c 1 E(3),E1,E2,E3,D,H,G,SPUR,DET,COF,CTH,STH,SQRTH,P,TOL, c 2 RR(6),RR1,RR2,RR3,RR4,RR5,RR6,SS(6),SS1,SS2,SS3,SS4,SS5,SS6, c 3 ZERO,ONE,TWO,THREE,SQRT3 C most UNIX machines know only real*8 C on VAX compile it with /G_Floating DOUBLE PRECISION R(3,3),XC(3),YC(3),WC,A(3,3),B(3,3),E0, 1 E(3),E1,E2,E3,D,H,G,SPUR,DET,COF,CTH,STH,SQRTH,P,TOL, 2 RR(6),RR1,RR2,RR3,RR4,RR5,RR6,SS(6),SS1,SS2,SS3,SS4,SS5,SS6, 3 ZERO,ONE,TWO,THREE,SQRT3 EQUIVALENCE (RR1,RR(1)),(RR2,RR(2)),(RR3,RR(3)), 1 (RR4,RR(4)),(RR5,RR(5)),(RR6,RR(6)), 2 (SS1,SS(1)),(SS2,SS(2)),(SS3,SS(3)), 3 (SS4,SS(4)),(SS5,SS(5)),(SS6,SS(6)), 4 (E1,E(1)),(E2,E(2)),(E3,E(3)) DATA SQRT3,TOL/1.73205080756888D+00, 1.0D-2/ DATA ZERO,ONE,TWO,THREE/0.0D+00, 1.0D+00, 2.0D+00, 3.0D+00/ DATA IP/1,2,4, 2,3,5, 4,5,6/ DATA IP2312/2,3,1,2/ WC=ZERO RMS=0.0 E0=ZERO DO 1 I=1,3 XC(I)=ZERO YC(I)=ZERO T(I)=0.0 DO 1 J=1,3 D=ZERO IF (I.EQ.J)D=ONE U(I,J)=real(D) A(I,J)=D 1 R(I,J)=ZERO IER=-1 IF (N.LT.1)RETURN C**** DETERMINE CENTROIDS OF BOTH VECTOR SETS X AND Y IER=-2 DO 2 M=1,N IF (W(M).LT.0.0)RETURN WC=WC+W(M) DO 2 I=1,3 XC(I)=XC(I)+W(M)*X(I,M) 2 YC(I)=YC(I)+W(M)*Y(I,M) IF (WC.LE.ZERO)RETURN DO 3 I=1,3 XC(I)=XC(I)/WC 3 YC(I)=YC(I)/WC C**** DETERMINE CORRELATION MATRIX R BETWEEN VECTOR SETS Y AND X DO 4 M=1,N DO 4 I=1,3 E0=E0+W(M)*((X(I,M)-XC(I))**2+(Y(I,M)-YC(I))**2) D=W(M)*(Y(I,M)-YC(I)) DO 4 J=1,3 4 R(I,J)=R(I,J)+D*(X(J,M)-XC(J)) C**** CALCULATE DETERMINANT OF R(I,J) DET=R(1,1)*(R(2,2)*R(3,3)-R(2,3)*R(3,2)) 1 -R(1,2)*(R(2,1)*R(3,3)-R(2,3)*R(3,1)) 2 +R(1,3)*(R(2,1)*R(3,2)-R(2,2)*R(3,1)) SIGMA=real(DET) C**** FORM UPPER TRIANGLE OF TRANSPOSED(R)*R M=0 DO 5 J=1,3 DO 5 I=1,J M=M+1 5 RR(M)=R(1,I)*R(1,J)+R(2,I)*R(2,J)+R(3,I)*R(3,J) C***************** EIGENVALUES ***************************************** C**** FORM CHARACTERISTIC CUBIC X**3-3*SPUR*X**2+3*COF*X-DET=0 SPUR=(RR1+RR3+RR6)/THREE COF=(RR3*RR6-RR5*RR5+RR1*RR6-RR4*RR4+RR1*RR3-RR2*RR2)/THREE DET=DET*DET DO 6 I=1,3 6 E(I)=SPUR IF (SPUR.LE.ZERO)GOTO 40 C**** REDUCE CUBIC TO STANDARD FORM Y**3-3HY+2G=0 BY PUTTING X=Y+SPUR D=SPUR*SPUR H=D-COF G=(SPUR*COF-DET)/TWO-SPUR*H C**** SOLVE CUBIC. ROOTS ARE E1,E2,E3 IN DECREASING ORDER IF (H.LE.ZERO)GOTO 8 SQRTH=DSQRT(H) c SQRTH=QSQRT(H) D=H*H*H-G*G IF (D.LT.ZERO)D=ZERO D=DATAN2(DSQRT(D),-G)/THREE CTH=SQRTH*DCOS(D) STH=SQRTH*SQRT3*DSIN(D) c D=QATAN2(QSQRT(D),-G)/THREE c CTH=SQRTH*QCOS(D) c STH=SQRTH*SQRT3*QSIN(D) E1=SPUR+CTH+CTH E2=SPUR-CTH+STH E3=SPUR-CTH-STH IF (MODE)10,50,10 C HANDLE SPECIAL CASE OF 3 IDENTICAL ROOTS 8 IF (MODE)30,50,30 C**************** EIGENVECTORS ***************************************** 10 DO 15 L=1,3,2 D=E(L) SS1=(D-RR3)*(D-RR6)-RR5*RR5 SS2=(D-RR6)*RR2+RR4*RR5 SS3=(D-RR1)*(D-RR6)-RR4*RR4 SS4=(D-RR3)*RR4+RR2*RR5 SS5=(D-RR1)*RR5+RR2*RR4 SS6=(D-RR1)*(D-RR3)-RR2*RR2 J=1 IF (DABS(SS1).GE.DABS(SS3))GOTO 12 c IF (QABS(SS1).GE.QABS(SS3))GOTO 12 J=2 IF (DABS(SS3).GE.DABS(SS6))GOTO 13 c IF (QABS(SS3).GE.QABS(SS6))GOTO 13 11 J=3 GOTO 13 12 IF (DABS(SS1).LT.DABS(SS6))GOTO 11 c12 IF (QABS(SS1).LT.QABS(SS6))GOTO 11 13 D=ZERO J=3*(J-1) DO 14 I=1,3 K=IP(I+J) A(I,L)=SS(K) 14 D=D+SS(K)*SS(K) IF (D.GT.ZERO)D=ONE/DSQRT(D) c IF (D.GT.ZERO)D=ONE/QSQRT(D) DO 15 I=1,3 15 A(I,L)=A(I,L)*D D=A(1,1)*A(1,3)+A(2,1)*A(2,3)+A(3,1)*A(3,3) M1=3 M=1 IF ((E1-E2).GT.(E2-E3))GOTO 16 M1=1 M=3 16 P=ZERO DO 17 I=1,3 A(I,M1)=A(I,M1)-D*A(I,M) 17 P=P+A(I,M1)**2 IF (P.LE.TOL)GOTO 19 P=ONE/DSQRT(P) c P=ONE/QSQRT(P) DO 18 I=1,3 18 A(I,M1)=A(I,M1)*P GOTO 21 19 P=ONE DO 20 I=1,3 IF (P.LT.DABS(A(I,M)))GOTO 20 P=DABS(A(I,M)) c IF (P.LT.QABS(A(I,M)))GOTO 20 c P=QABS(A(I,M)) J=I 20 CONTINUE K=IP2312(J) L=IP2312(J+1) P=DSQRT(A(K,M)**2+A(L,M)**2) c P=QSQRT(A(K,M)**2+A(L,M)**2) IF (P.LE.TOL)GOTO 40 A(J,M1)=ZERO A(K,M1)=-A(L,M)/P A(L,M1)=A(K,M)/P 21 A(1,2)=A(2,3)*A(3,1)-A(2,1)*A(3,3) A(2,2)=A(3,3)*A(1,1)-A(3,1)*A(1,3) A(3,2)=A(1,3)*A(2,1)-A(1,1)*A(2,3) C****************** ROTATION MATRIX ************************************ 30 DO 32 L=1,2 D=ZERO DO 31 I=1,3 B(I,L)=R(I,1)*A(1,L)+R(I,2)*A(2,L)+R(I,3)*A(3,L) 31 D=D+B(I,L)**2 IF (D.GT.ZERO)D=ONE/DSQRT(D) c IF (D.GT.ZERO)D=ONE/QSQRT(D) DO 32 I=1,3 32 B(I,L)=B(I,L)*D D=B(1,1)*B(1,2)+B(2,1)*B(2,2)+B(3,1)*B(3,2) P=ZERO DO 33 I=1,3 B(I,2)=B(I,2)-D*B(I,1) 33 P=P+B(I,2)**2 IF (P.LE.TOL)GOTO 35 P=ONE/DSQRT(P) c P=ONE/QSQRT(P) DO 34 I=1,3 34 B(I,2)=B(I,2)*P GOTO 37 35 P=ONE DO 36 I=1,3 IF (P.LT.DABS(B(I,1)))GOTO 36 P=DABS(B(I,1)) c IF (P.LT.QABS(B(I,1)))GOTO 36 c P=QABS(B(I,1)) J=I 36 CONTINUE K=IP2312(J) L=IP2312(J+1) P=DSQRT(B(K,1)**2+B(L,1)**2) c P=QSQRT(B(K,1)**2+B(L,1)**2) IF (P.LE.TOL)GOTO 40 B(J,2)=ZERO B(K,2)=-B(L,1)/P B(L,2)= B(K,1)/P 37 B(1,3)=B(2,1)*B(3,2)-B(2,2)*B(3,1) B(2,3)=B(3,1)*B(1,2)-B(3,2)*B(1,1) B(3,3)=B(1,1)*B(2,2)-B(1,2)*B(2,1) DO 39 I=1,3 DO 39 J=1,3 39 U(I,J)=real( B(I,1)*A(J,1)+B(I,2)*A(J,2)+B(I,3)*A(J,3) ) C****************** TRANSLATION VECTOR ********************************* 40 DO 41 I=1,3 41 T(I)=real ( YC(I)-U(I,1)*XC(1)-U(I,2)*XC(2)-U(I,3)*XC(3) ) C********************** RMS ERROR ************************************** 50 DO 51 I=1,3 IF (E(I).LT.ZERO)E(I)=ZERO 51 E(I)=DSQRT(E(I)) c51 E(I)=QSQRT(E(I)) IER=0 IF (E2.LE.(E1*1.0D-05))IER=-1 D=E3 IF (SIGMA.GE.0.0)GOTO 52 D=-D IF ((E2-E3).LE.(E1*1.0D-05))IER=-1 52 D=D+E2+E1 RMS=real( E0-D-D ) IF (RMS.LT.0.0)RMS=0.0 C next line added June 1989 by Georg Tuparev RMS=SQRT(RMS/N) RETURN END C END U3B C...................................................................... C...................................................................... C SUB UNTAB SUBROUTINE UNTAB(STRING) C removes 'tabs' from a string PARAMETER (LINESIZE= 300) CHARACTER STRING*(*) CHARACTER TEMPLINE*(LINESIZE) INTEGER LENGTH,I,J,TABSIZE *----------------------------------------------------------------------* TABSIZE=8 J=0 I=1 LENGTH=LEN(STRING) IF (LENGTH .GT. LINESIZE) THEN WRITE(6,*)'*** UNTAB: string truncated' LENGTH=LINESIZE ENDIF DO WHILE(I .LE. LENGTH) J=J+1 IF (J .LE. LINESIZE) THEN IF (STRING(I:I) .NE. CHAR(9) ) THEN TEMPLINE(J:J)=STRING(I:I) ELSE TEMPLINE(J:J)=' ' DO WHILE( MOD(J,TABSIZE) .NE. 0) J=J+1 IF (J .LE. LINESIZE)TEMPLINE(J:J)=' ' ENDDO ENDIF ENDIF I=I+1 ENDDO STRING(1:LENGTH)=TEMPLINE(1:LENGTH) RETURN END C END UNTAB C...................................................................... C...................................................................... C SUB UPTOLOW SUBROUTINE UPTOLOW(STRING,LENGTH) CHARACTER*(*) STRING INTEGER LENGTH cx CHARACTER UPPER*26, LOWER*26, STRING*(*) cx CHARACTER UPPER*26, LOWER*26, STRING*(*) cx DATA UPPER/'ABCDEFGHIJKLMNOPQRSTUVWXYZ'/ cx DATA LOWER/'abcdefghijklmnopqrstuvwxyz'/ DO I=1,LENGTH IF (STRING(I:I).GE.'A' .AND. STRING(I:I).LE.'Z') THEN STRING(I:I)=CHAR( ICHAR(STRING(I:I))+32 ) C X K=INDEX(UPPER,STRING(I:I)) C X IF (K.NE.0) STRING(I:I)=LOWER(K:K) ENDIF ENDDO RETURN END C END UPTOLOW C...................................................................... C...................................................................... C SUB MP_INIT_FARM C init a farmer worker model C VAX/VMS dummy version ; does nothing ; just init the stuff SUBROUTINE MP_INIT_FARM() IMPLICIT NONE C import INTEGER MAXPROC CHARACTER*200 HOST_FILE,HOST_NAME,NODE_NAME C export INTEGER IDPROC,NWORKER,NP,NWORKSET, + IDTOP,LINK(1:100),ID_HOST, + LINK_HOST,LINK_NODE_SENDER,LINK_NODE_RECEIVER, + SENDER_NODE(1:100),RECEIVER_NODE(1:100), + WORKSETSIZE(1:100),WORKSETBEG(1:100),WORKSETEND(1:100) CHARACTER*20 MP_MODEL LOGICAL LMIXED_ARCH C init MP_MODEL='NIX' ID_HOST=0 IDPROC=0 IDTOP=0 LINK_HOST=0 LINK_NODE_SENDER=0 LINK_NODE_RECEIVER=0 NWORKER=0 NWORKSET=0 LINK(1) = 0 LMIXED_ARCH=.FALSE. RETURN END C END MP_INIT_FARM C C...................................................................... C SUB MP_INIT_NODE SUBROUTINE MP_INIT_NODE(NODE_NAME,IDPROC) CHARACTER*(*) NODE_NAME INTEGER IDPROC RETURN END C end mp_init_node C...................................................................... C...................................................................... C sub mp_getmyid C get ID of process C VAX/VMS dummy version ; return id=0 SUBROUTINE MP_GETMYID(ID) INTEGER ID ID=0 RETURN END C END MP_GETMYID C...................................................................... C...................................................................... C SUB MP_NPROCS C get number of processors C VAX/VMS dummy version ; nprocessor=1 SUBROUTINE MP_NPROCS(NPROCESSOR) INTEGER NPROCESSOR NPROCESSOR=1 RETURN END C END MP_NPROCS C...................................................................... C...................................................................... C SUB MP_SELECT C is there somewhere a message for me ? C VAX/VMS dummy version ; SUBROUTINE MP_SELECT(MSGTYPE,WORKSETBEG,WORKSETEND,LINK,IFLAG) C import INTEGER MSGTYPE,WORKSETBEG,WORKSETEND,LINK(*) C export INTEGER IFLAG IFLAG=0 RETURN END C END MP_SELECT C...................................................................... C...................................................................... C SUB MP_SELECT_SUBSET C is there somewhere a message for me ? C VAX/VMS dummy version ; SUBROUTINE MP_SELECT_SUBSET(MSGTYPE,NWORKSET,SENDER_NODE, + LINK,IFLAG) C import INTEGER MSGTYPE,NWORKSET,SENDER_NODE(*),LINK(*) INTEGER IFLAG IFLAG=0 RETURN END C end mp_select_subset C...................................................................... C...................................................................... C sub mp_init_send C dummy version SUBROUTINE MP_INIT_SEND() RETURN END C end mp_init_send C...................................................................... C...................................................................... C sub mp_send_data C dummy version SUBROUTINE MP_SEND_DATA(MSGTYPE,RECEIVER_NAME) C input INTEGER MSGTYPE,LINK CHARACTER*(*) RECEIVER_NAME RETURN END C end mp_send_data C...................................................................... C...................................................................... C sub mp_init_receive C dummy version SUBROUTINE MP_INIT_RECEIVE(MSGTYPE) INTEGER MSGTYPE RETURN END C end mp_init_receive C...................................................................... C...................................................................... C sub mp_receive_data C dummy version SUBROUTINE MP_RECEIVE_DATA(MSGTYPE,LINK) C input INTEGER MSGTYPE C output INTEGER LINK RETURN END C end mp_receive_data C...................................................................... C...................................................................... C sub mp_put_int4 C VAX/VMS dummy version ; does nothing SUBROUTINE MP_PUT_INT4(IDTOP,LINK,DATA,NBYTE) INTEGER IDTOP,LINK,DATA(*),NBYTE RETURN END C end mp_put_int4 C...................................................................... C...................................................................... C sub mp_get_int4 C VAX/VMS dummy version ; does nothing SUBROUTINE MP_GET_INT4(IDTOP,LINK,DATA,NBYTE) INTEGER IDTOP,LINK,NBYTE,DATA(*) RETURN END C end mp_get_int4 C...................................................................... C...................................................................... C sub mp_put_real4 C VAX/VMS dummy version ; does nothing SUBROUTINE MP_PUT_REAL4(IDTOP,LINK,DATA,NBYTE) INTEGER IDTOP,LINK,NBYTE REAL DATA(*) RETURN END C end mp_put_real4 C...................................................................... C...................................................................... C sub mp_get_real4 C VAX/VMS dummy version ; does nothing SUBROUTINE MP_GET_REAL4(IDTOP,LINK,DATA,NBYTE) INTEGER IDTOP,LINK,NBYTE REAL DATA(*) RETURN END C end mp_get_real4 C...................................................................... C...................................................................... C sub mp_put_string_array C dummy version SUBROUTINE MP_PUT_STRING_ARRAY(IDTOP,LINK,DATA,NDIM) INTEGER IDTOP,LINK,NDIM,INFO CHARACTER*(*) DATA(NDIM) RETURN END C end mp_put_string_array C...................................................................... C...................................................................... C sub mp_put_string C dummy version SUBROUTINE MP_PUT_STRING(IDTOP,LINK,DATA,ILEN) INTEGER IDTOP,LINK,ILEN,INFO CHARACTER*(*) DATA RETURN END C end mp_put_string C...................................................................... C...................................................................... C sub mp_get_string_array C dummy version SUBROUTINE MP_GET_STRING_ARRAY(IDTOP,LINK,DATA,NDIM) INTEGER IDTOP,LINK,NDIM,INFO CHARACTER*(*) DATA(NDIM) RETURN END C end mp_get_string_array C...................................................................... C...................................................................... C sub mp_get_string C dummy version SUBROUTINE MP_GET_STRING(IDTOP,LINK,DATA,ILEN) INTEGER IDTOP,LINK,ILEN,INFO CHARACTER*(*) DATA RETURN END C end mp_get_string C...................................................................... C...................................................................... C SUB MP_LEAVE SUBROUTINE MP_LEAVE() RETURN END C END MP_LEAVE C...................................................................... C...................................................................... C sub mp_probe C is there somewhere a message for me ? C if not, return C PVM version SUBROUTINE MP_PROBE(MSGTYPE,IFLAG) C import INTEGER MSGTYPE C export INTEGER IFLAG IFLAG=0 RETURN END C end mp_probe C...................................................................... C...................................................................... C sub mp_get_int4 C PVM version SUBROUTINE MP_GET_INT4_ARRAY(IDTOP,LINK,DATA,NDATA) INTEGER IDTOP,LINK,NDATA,DATA(*) INTEGER INFO RETURN END C end mp_get_int4 C...................................................................... C...................................................................... C sub mp_put_int4 C PVM version SUBROUTINE MP_PUT_INT4_ARRAY(IDTOP,LINK,DATA,NDATA) INTEGER IDTOP,LINK,DATA(*),NDATA INTEGER INFO RETURN END C end mp_put_int4 C...................................................................... C...................................................................... C sub mp_get_real4 C PVM version SUBROUTINE MP_GET_REAL4_ARRAY(IDTOP,LINK,DATA,NDATA) INTEGER IDTOP,LINK,NDATA REAL DATA(*) INTEGER INFO RETURN END C end mp_get_real4 C...................................................................... C...................................................................... C sub mp_put_real4 C PVM version SUBROUTINE MP_PUT_REAL4_ARRAY(IDTOP,LINK,DATA,NDATA) INTEGER IDTOP,LINK,NDATA REAL DATA(*) INTEGER INFO RETURN END C end mp_put_real4 C...................................................................... C...................................................................... C sub mp_cast C PVM version SUBROUTINE MP_CAST(NTASKS,MSGTYPE,LINK) C input INTEGER NTASKS,MSGTYPE,LINK(*) C internal INTEGER INFO RETURN END C end mp_cast C...................................................................... C vim:et:ts=2: profphd-utils-1.0.10/lib-maxhom-node-pvm3.f0000755015075101507510000006613512012371465017720 0ustar lkajanlkajanC============================================ PVM3 C maxhom node interface for PVM3 subroutine node_interface(lh1,lh2) implicit none include 'maxhom.param' include 'maxhom.common' c import real lh1(0:maxmat) integer*2 lh2(0:maxtrace) c internal C on a mixed architecture network the node is reading the alignment C data on request and sends them to the host C local for each node real value,sim,hom,rms,distance,checkval,sdev integer ifir,jfir,jlas,idel,ndel,len1,lenocc integer irecord,nrecord,ialign,ifile,imsgtag integer i,j,k,ipoint,iend integer ilen_name,ilen_compnd,ilen_ACCESSION,ilen_pdbref,ilen_al integer ilen_insseq integer ilconsider,ildssp_2 character*100 coretemp character csymbol,ctemp logical lerror c logical lendbase integer file_buffer(maxqueue),nfile_queue character*80 filename_buffer(maxqueue_list) character*80 filename,tempname character pdbrefline*3000 character*80 chainremark,profilemetric character line*100 integer iseq,ibuf_poi c checkformat character*20 seqformat integer ifirst_round,iflag integer ipos,nchain,kselect c integer kchain real xmaplow,xmaphigh,xsmin,xsmax logical ltruncated c logical ldb_read_one C init C timing total_time=0.0 current_dir=' ' architecture=' ' lfirst_scan=.true. c ldb_read_one=.false. nbuffer_len = 6 + 12 + len(compnd_2) + + len(ACCESSION_2) + len(pdbref_2) c nbuffer_len = 6 + len(name_2) + len(compnd_2) + c + len(ACCESSION_2) + len(pdbref_2) call get_machine_name(machine_name) tempname='ARCH' call get_enviroment_variable(tempname,architecture) tempname='MAXHOM_DEFAULT' call get_enviroment_variable(tempname,maxhom_default) if(maxhom_default .eq. ' ')then write(*,*)'WARN: env.var. MAXHOM_DEFAULT not set; '// + 'now: ./maxhom.default' maxhom_default(1:)='maxhom.default' call flush_unit(6) endif c write(*,'(a,a,a,a,i5)')' maxhom_node : ', c + machine_name(1:15),' ',architecture(1:6),idproc c call flush_unit(6) C receive all the necessary data from host C loop entry for list processing 10 call receive_data_from_host(id_host) call flush_unit(6) c open log file for parameter,warnings, error..... if(lfirst_scan .eqv. .true.)then call concat_string_int('MAXHOM.LOG_',idproc,logfile) tempname=logfile if(corepath .ne. ' ')then call concat_strings(corepath,tempname,logfile) endif c tempname= 'NEW,RECL=200' c call open_file(klog,logfile,tempname,lerror) c call log_file(klog,'**************************** MAXHOM-'// c + 'LOGFILE ***************************',1) endif C end signal if we had a list of sequences if(n1 .eq. -999) goto 900 do i=1,maxqueue file_buffer(i) = 0 enddo do i=1,maxqueue_list filename_buffer(i)=' ' enddo ipoint=1 ialign=0 nrecord=0 irecord=0 ifile=0 ialign_good=0 insseq=' ' sdev=0.0 C overwrite default stuff in case the actual machine is on a different C file system (connected via Ethernet, Internet..) C Caution: maxhom.default has to be on the right place c call get_default() ldssp_2=.false. n2in=0 csymbol=' ' csq_2=' ' header_2=' ' compnd_2=' ' author_2=' ' source_2=' ' do i=1,maxsq cresid(i)=' ' pdbno_2(i)=0 bp1_2(i)=0 bp2_2(i)=0 lacc_2(i)=0 lstruc_2(i)=0 consweight_2(i)=1.0 enddo call init_char_array(1,maxsq,cols_2,csymbol) call init_char_array(1,maxsq,chainid_2,csymbol) call init_char_array(1,maxsq,sheetlabel_2,csymbol) call init_char_array(1,maxsq,struc_2,csymbol) call concat_string_int(corefile,link(idproc),tempname) coretemp= tempname if(corepath .ne. ' ')then call concat_strings(corepath,tempname,coretemp) endif call concat_string_int('UNFORMATTED,DIRECT,NEW,RECL=', + maxrecordlen,tempname) c write(*,*)idproc,' : ',coretemp(1:60) c call flush_unit(6) call open_file(kcore,coretemp,tempname,lerror) c$$$ if (ldb_read_one .eqv. .true.)then c$$$ lfirst_scan=.false. c$$$ msgtype=8000 c$$$ call mp_receive_data(msgtype,link(id_host)) c$$$ call mp_get_int4(msgtype,id_host,ipoint,N_one) c$$$ call mp_get_int4(msgtype,id_host,nseq_warm_start,N_one) c$$$ msgtype=9000 c$$$ call mp_receive_data(msgtype,link(id_host)) c$$$ call mp_get_string_array(msgtype,id_host,cdatabase_buffer, c$$$ + ipoint) c$$$ msgtype=10000 c$$$ call mp_receive_data(msgtype,link(id_host)) c$$$ call mp_get_int4(msgtype,id_host,i,N_one) c$$$ write (*,*)' got buffer and start signal: ',idproc,ipoint, c$$$ + nseq_warm_start c$$$ call flush_unit(6) c$$$ ipoint=1 c$$$ endif call get_cpu_time('time init:',idproc, + itime_old,itime_new,total_time,logstring) call log_file(klog,logstring,1) C====================================================================== C NO warm-start C====================================================================== if(lfirst_scan .eqv. .true.)then lfirst_scan=.false. nfile_queue=0 ifirst_round=0 c fill queue when first communication if( listofseq_2 .eqv. .true.)then msgtype=9000 do i=1,maxqueue_list call mp_receive_data(msgtype,link(id_host)) call mp_get_string(msgtype,id_host, + filename_buffer(i),len(filename_buffer(i))) write(*,'(a,a,i6)')'1. got: ', + filename_buffer(i)(1:30),idproc call flush_unit(6) if (filename_buffer(i) .eq. 'STOP')then nfile_queue=i goto 180 endif enddo ifirst_round=1 nfile_queue=maxqueue_list goto 180 endif c check if there is a (are) message(s) from the host to C fill our work-queue C if not, work on the next file on the stack 100 msgtype=3000 call mp_probe(msgtype,iflag) if ( iflag .gt. 0)then nfile_queue=nfile_queue+1 call mp_receive_data(msgtype,link(id_host)) if( listofseq_2 .eqv. .false.)then call mp_get_int4(msgtype,id_host, + file_buffer(nfile_queue),N_one) c write(logstring,*)'fill file stack: ',idproc,nfile_queue, c + file_buffer(nfile_queue) c call log_file(klog,logstring,1) if (file_buffer(nfile_queue) .lt. 0)goto 180 goto 100 else call mp_get_string(msgtype,id_host, + filename_buffer(nfile_queue), + len(filename_buffer(1)) ) if (filename_buffer(nfile_queue) .eq. 'STOP')goto 180 goto 100 endif endif C send work-request to host msgtype=2000 call mp_init_send() call mp_put_int4(msgtype,id_host,idproc,N_one) call mp_send_data(msgtype,link(id_host)) c write(logstring,'(a,i4,i4)')'send work_message:',idproc, c + link(id_host) c call log_file(klog,logstring,1) msgtype=3000 c fill queue when first communication if (ifirst_round .eq. 0) then if( listofseq_2 .eqv. .false.)then call mp_receive_data(msgtype,link(id_host)) do i=1,maxqueue call mp_get_int4(msgtype,id_host,file_buffer(i), + N_one) enddo ifirst_round=1 nfile_queue=maxqueue c else c do i=1,maxqueue_list c call mp_receive_data(msgtype,link(id_host)) c call mp_get_string(msgtype,id_host, c + filename_buffer(i),len(filename_buffer(i))) c write(*,'(a,a,i6)')'1. got: ', c + filename_buffer(i)(1:30),idproc c call flush_unit(6) c if (filename_buffer(i) .eq. 'STOP')then c nfile_queue=i c goto 180 c endif c enddo c ifirst_round=1 c nfile_queue=maxqueue_list endif c either the master is too slow, so we have to wait :-( c or we have finished the work and are waiting for C the "finish-signal" :-) else if (nfile_queue .le. 0)then write(logstring,'(a,i8,2x,i6)')'WARNING stack empty: ', + idproc,file_buffer(1) call log_file(klog,logstring,1) call flush_unit(6) call mp_receive_data(msgtype,link(id_host)) if( listofseq_2 .eqv. .false.)then call mp_get_int4(msgtype,id_host,file_buffer(1),N_one) else call mp_get_string(msgtype,id_host, + filename_buffer(1), + len(filename_buffer(1))) endif nfile_queue=1 endif C ====================================================================== C database C ====================================================================== 180 if(listofseq_2 .eqv. .false.)then C next file is always the first on the stack ifile=file_buffer(1) C now the real work starts if(ifile .gt. 0)then c write(logstring,'(a,i8,i4)')'work on file:',idproc, c + file_buffer(1) c call log_file(klog,logstring,1) call open_sw_data_file(kbase,lbinary,ifile, + split_db_data,split_db_path,hostname) if(lwarm_start .eqv. .false.)ipoint=1 if(lbinary .eqv. .true.)then read(kbase)ibuf_poi,iseq else read(kbase,'(i10,i10)')ibuf_poi,iseq endif if(ipoint + ibuf_poi .lt. maxdatabase_buffer)then if(lbinary .eqv. .true.)then read(kbase) + (cdatabase_buffer(i),i=ipoint,ipoint+ibuf_poi-1) else read(kbase,'(a)') + (cdatabase_buffer(i),i=ipoint,ipoint+ibuf_poi-1) endif close(kbase) do k=1,iseq do ipos=1,nbuffer_len cbuffer_line(ipos:ipos)= + cdatabase_buffer(ipoint) ipoint=ipoint+1 enddo read(cbuffer_line(1:),111)n2in,name_2, + ACCESSION_2,pdbref_2,compnd_2 c read(cbuffer_line(1:),111)n2in,name_2, c + compnd_2,ACCESSION_2,pdbref_2 111 format(i6,a12,a,a,a) iend=n2in if(n2in .gt. maxsq)iend=maxsq do ipos=1,iend csq_2(ipos:ipos)=cdatabase_buffer(ipoint) ipoint=ipoint+1 enddo if(n2in .gt. maxsq)then ipoint=ipoint + (n2in-iend) n2in=maxsq endif call do_align(lh1,lh2,idproc,ialign,nrecord,sdev) c ialign_processed=ialign_processed+1 enddo else write(logstring,*)' ** FATAL ERROR **/n'// + ' database_buffer overflow increase/n'// + ' dimension of MAXDATABASE_BUFFER' call log_file(klog,logstring,1) STOP endif nseq_warm_start=ialign c refresh the content of the work-queue nfile_queue=nfile_queue-1 do i=1,nfile_queue file_buffer(i) = file_buffer(i+1) enddo c write(logstring,'(a,i4,i4)')'on stack:',idproc,nfile_queue c call log_file(klog,logstring,1) if ( nfile_queue .gt. 0)then if (file_buffer(nfile_queue) .lt. 0)then goto 180 endif endif goto 100 endif C======================================================================= C list of filenames C======================================================================= else C next file is always the first on the stack filename=filename_buffer(1) C now the real work starts if(filename .ne. 'STOP')then name_2=filename call checkformat(kbase,name_2,seqformat,lerror) if(index(seqformat,'PROFILE') .ne. 0)lprofile_2=.true. if(index(seqformat,'DSSP' ) .ne. 0)ldssp_2=.true. if(lprofile_2)then c write(logstring,'(a,a)')'read PROFILE 2: ',name_2 c call log_file(klog,logstring,1) call readprofile(kprof,name_2,maxsq,ntrans,trans, + ldssp_2,n2in,nchain,hsspid_2,header_2,compnd_2, + source_2,author_2,xsmin,xsmax,xmaplow,xmaphigh, + profilemetric,pdbno_2,chainid_2,csq_2_array, + struc_2,nsurf_2,cols_2,sheetlabel_2,bp1_2, + bp2_2,nocc_2,gapopen_2,gapelong_2,consweight_2, + simmetric_2,maxbox,nbox_2,profilebox_2) do i=1,n2in csq_2(i:i)=csq_2_array(i) enddo caution C cstrstates,simorg and lsq_2 not known here C pass simorg and set lsq_2 if(metricfile .ne. 'PROFILE')then write(*,*)' option not possible, ask reinhard' stop endif if(smin_answer .eq. 'PROFILE')then smin=xsmin smax=xsmax maplow=xmaplow maphigh=xmaphigh else if(lprofile_2 .and. smin_answer .ne. + 'PROFILE')then maplow=xmaplow maphigh=xmaphigh endif if(openweight_answer .ne. 'PROFILE')then do i=1,maxsq gapopen_2(i)=open_1 enddo endif if(elongweight_answer .ne. 'PROFILE')then do i=1,maxsq gapelong_2(i)=elong_1 enddo endif C reset conservation weights for sequence 2 if not wanted if(.not. lconserv_2 )then do i=1,maxsq consweight_2(i)=1.0 enddo endif lnorm_profile=.false. c if(lnorm_profile)then c write(*,*)'CALL NORM_PROFILE ' c smin=0.0 ; smax=0.0 c maplow=0.0 ; maphigh=0.0 c call norm_profile(maxsq,ntrans,trans,n2in,n1, c + lsq_1,simmetric_2,profile_epsilon, c + profile_gamma,smin,smax,maplow, c + maphigh,gapopen_2,gapelong_2,sdev) c else c write(*,*)' call scale_profile disabled' c write(*,'(a,4(2x,f5.2)))')'CALL SCALE_PROFILE 2', c + smin,smax,maplow,maphigh c call scale_profile_metric(maxsq,ntrans,trans, c + simmetric_2,smin,smax,maplow,maphigh) c endif C not profile else C all chains wanted from dssp data set call checkformat(kbase,name_2,seqformat,lerror) if(index(seqformat,'DSSP') .ne. 0 .or. + index(seqformat,'PROFILE-DSSP') .ne.0)then ldssp_2=.true. endif c kchain=0 tempname=' ' i=index(name_2,'_!_') if(i.ne.0)then tempname(1:)=name_2(1:i-1) ctemp(1:)=name_2(i+3:) else tempname(1:)=name_2(1:) ctemp=' ' endif pdbrefline=' ' if(ldssp_2 .eqv. .false.)then call get_seq(kbase,tempname,trans,ctemp, + compnd_2,ACCESSION_2,pdbrefline,pdbno_2,n2in, + csq_2,struc_2_string,nsurf_2,ltruncated, + lerror) C convert cresid to pdb-number and chain identifier, used in 3d C superposition cresid from getseq is : C "1234AB" (number, alternate residue, chain identifier) C here skip alternate residue and append chain_id do i=1,n2in csq_2_array(i)=csq_2(i:i) struc_2(i)=struc_2_string(i:i) read(cresid(i),'(i4,1x,a)') + pdbno_2(i),chainid_2(i) enddo else C all chains wanted from DSSP data set k=0 chainremark=' ' i=index(tempname,'!')-1 if(i .gt. 0)then kselect=1 iend=len(tempname) do j=iend,i+1,-1 if(tempname(j:j) .eq. ',')kselect=kselect+1 enddo write(*,*)' use ',kselect,' chain(s) ', + tempname(i:) chainremark(1:)=tempname else call select_unique_chain(kgetseq, + tempname,line) chainremark= line(1:80) endif j=1 call getdsspforhssp(kgetseq,tempname, + maxsq,chainremark, + brkid_2,header_2,compnd_2,source_2, + author_2,n2in,i,j,k,pdbno_2, + chainid_2,csq_2_array,struc_2,cols_2, + bp1_2,bp2_2,sheetlabel_2,nsurf_2) do i=1,n2in csq_2(i:i)=csq_2_array(i) enddo endif call select_pdb_pointer(kref,dssp_path,pdbrefline, + pdbref_2) endif call do_align(lh1,lh2,idproc,ialign,nrecord,sdev) c refresh the content of the work-queue nfile_queue=nfile_queue-1 do i=1,nfile_queue filename_buffer(i) = filename_buffer(i+1) enddo if ( nfile_queue .gt. 0)then if (filename_buffer(nfile_queue) .eq. 'STOP')then goto 180 endif endif goto 100 endif C===================================================================== C list of filename C===================================================================== endif call get_cpu_time('database scan: ',idproc, + itime_old,itime_new,total_time,logstring) call log_file(klog,logstring,1) write(logstring,'(a,i6,i8,i10)')'internal buffer: ',idproc, + nseq_warm_start,ipoint call log_file(klog,logstring,1) if(listofseq_2 .eqv. .true.)lfirst_scan=.true. C======================================================================= C warm start C======================================================================= else do i=1,nseq_warm_start do ipos=1,nbuffer_len cbuffer_line(ipos:ipos)=cdatabase_buffer(ipoint) ipoint=ipoint+1 enddo read(cbuffer_line(1:),111)n2in,name_2, + compnd_2,ACCESSION_2,pdbref_2 iend=n2in if(n2in .gt. maxsq)iend=maxsq do ipos=1,iend csq_2(ipos:ipos)=cdatabase_buffer(ipoint) ipoint=ipoint+1 enddo if(n2in .gt. maxsq)then ipoint=ipoint + (n2in-iend) n2in=maxsq endif call do_align(lh1,lh2,idproc,ialign,nrecord,sdev) enddo c ialign_processed=nseq_warm_start call get_cpu_time('database scan warm start: ',idproc, + itime_old,itime_new,total_time,logstring) call log_file(klog,logstring,1) C===================================================================== C end warm-start C======================================================================= endif C===================================================================== C send results to host c write (logstring,'(a,i4)')'got end signal: ',idproc c call log_file(klog,logstring,1) msgtype=4000 call mp_init_send() call mp_put_int4(msgtype,id_host,ialign,N_one) call mp_put_int4(msgtype,id_host,ialign_good,N_one) call mp_send_data(msgtype,link(id_host)) if(ialign .gt. 0)then msgtype=5000 call mp_init_send() call mp_put_real4_array(msgtype,id_host,alisortkey,ialign) call mp_put_int4_array(msgtype,id_host,irecpoi,ialign) call mp_put_int4_array(msgtype,id_host,ifilepoi,ialign) call mp_send_data(msgtype,link(id_host)) write (logstring,'(a,i6,2x,i6)') + 'send result OK: ',idproc,ialign call log_file(klog,logstring,1) else write (logstring,'(a,i6)')'nothing found: ',idproc call log_file(klog,logstring,1) endif C on a mixed architecture cluster wait for request from host for C the alignment data and send them c if(lmixed_arch )then ilen_name=len(name_2) ilen_compnd=len(compnd_2) ilen_ACCESSION=len(ACCESSION_2) ilen_pdbref=len(pdbref_2) ilen_al=len(al_2) ilen_insseq=len(insseq) 200 msgtype=6000 call mp_receive_data(msgtype,link(id_host)) call mp_get_int4(msgtype,id_host,irecord,N_one) call mp_get_int4(msgtype,id_host,imsgtag,N_one) call mp_get_real4(msgtype,id_host,checkval,N_one) c write (*,*)' request: ',idproc,irecord ; call flush_unit(6) if(irecord .le. 0)goto 300 call getalign(kcore,irecord,ifir,len1,lenocc,jfir,jlas, + idel,ndel,value,rms,hom, + sim,sdev,distance,checkval) ilconsider=0 ildssp_2=0 if( lconsider )ilconsider=1 if( ldssp_2 )ildssp_2=1 msgtype=imsgtag call mp_init_send() call mp_put_int4(msgtype,id_host,ilconsider,N_one) call mp_put_int4(msgtype,id_host,ildssp_2,N_one) call mp_put_string(msgtype,id_host,name_2,ilen_name) call mp_put_string(msgtype,id_host,compnd_2,ilen_compnd) call mp_put_string(msgtype,id_host,ACCESSION_2,ilen_ACCESSION) call mp_put_string(msgtype,id_host,pdbref_2,ilen_pdbref) call mp_put_real4(msgtype,id_host,value,N_one) call mp_put_int4(msgtype,id_host,ifir,N_one) call mp_put_int4(msgtype,id_host,len1,N_one) call mp_put_int4(msgtype,id_host,lenocc,N_one) call mp_put_int4(msgtype,id_host,jfir,N_one) call mp_put_int4(msgtype,id_host,jlas,N_one) call mp_put_int4(msgtype,id_host,n2in,N_one) call mp_put_int4(msgtype,id_host,idel,N_one) call mp_put_int4(msgtype,id_host,ndel,N_one) call mp_put_int4(msgtype,id_host,nshifted,N_one) call mp_put_real4(msgtype,id_host,rms,N_one) call mp_put_real4(msgtype,id_host,hom,N_one) call mp_put_real4(msgtype,id_host,sim,N_one) call mp_put_real4(msgtype,id_host,sdev,N_one) call mp_put_real4(msgtype,id_host,distance,N_one) c call mp_put_string(msgtype,id_host,al_1,ilen_al) call mp_put_string(msgtype,id_host,al_2,ilen_al) call mp_put_string(msgtype,id_host,sal_2,ilen_al) call mp_put_int4(msgtype,id_host,iins,N_one) if(iins .gt. 0)then call mp_put_int4_array(msgtype,id_host,inslen_local,iins) call mp_put_int4_array(msgtype,id_host,insbeg_1_local,iins) call mp_put_int4_array(msgtype,id_host,insbeg_2_local,iins) call mp_put_string(msgtype,id_host,insseq,ilen_insseq) endif call mp_send_data(msgtype,link(id_host)) c write (*,*)' request send: ',idproc ; call flush_unit(6) goto 200 c endif 300 close(kcore) c if(lmixed_arch )then call del_oldfile(kcore,coretemp) c endif if ( (l3way .eqv. .true. ) .and. + (l3waydone .eqv. .false.) )then l3waydone=.true. write(*,*)' second scan; go back to start:', idproc goto 10 endif if(lwarm_start)then write(*,*)' warm-start; go back to start:', idproc goto 10 endif 900 call get_cpu_time('time finish: ',idproc, + itime_old,itime_new,total_time,logstring) call log_file(klog,logstring,1) return end C END NODE_INTERFACE.................................................... C SUBROUTINE REPORT_TIME............................................... c subroutine report_time(klog,maxstep,text,idproc,itime,istep, c + total_time) C import c integer klog,maxstep,idproc,itime(0:maxstep,*),istep c real total_time c character*(*) text(*) C internal c character*200 logstring c integer i c real xtime1,xtime2,xtime3 c real one_sec c parameter (one_sec=1000000.0) c write(logstring,*)' timing: ' c call log_file(klog,logstring,2) c do i=1,istep c xtime1= float(itime(i,1)) / one_sec c xtime2= float(itime(i,2)) / one_sec c xtime3= float(itime(i,3)) / one_sec c c write(logstring,'(a,i6,3(f10.2))')text(i),idproc, c + xtime1,xtime2,xtime3 c call log_file(klog,logstring,2) c enddo c write(logstring,*)' total: ',total_time c call log_file(klog,logstring,2) c return c end C END REPORT_TIME....................................................... C============================================ C maxhom host working interface for PVM3 subroutine host_interface(lh1,lh2,ifile,filename,ialign, + nrecord,ipoint) implicit none include 'maxhom.param' include 'maxhom.common' c import character*(*) filename real lh1(0:maxmat) integer*2 lh2(0:maxtrace) c real lh(0:maxmat*2) integer ipoint c internal C local for each node real sdev integer nrecord,ialign,ifile,iseq,k c logical lendbase integer ipos,i,ibuf_poi,iend C init nbuffer_len = 6 + 12 + len(compnd_2) + + len(ACCESSION_2) + len(pdbref_2) c nbuffer_len = 6 + len(name_2) + len(compnd_2) + c + len(ACCESSION_2) + len(pdbref_2) sdev=0.0 111 format(i6,a12,a,a,a) if(lfirst_scan .eqv. .true.)then call open_sw_data_file(kbase,lbinary,ifile,split_db_data, + split_db_path,hostname) if(lwarm_start .eqv. .false.)ipoint=1 if(lbinary .eqv. .true.)then read(kbase)ibuf_poi,iseq else read(kbase,'(i10,i10)')ibuf_poi,iseq endif if(ipoint + ibuf_poi .lt. maxdatabase_buffer)then if(lbinary .eqv. .true.)then read(kbase) + (cdatabase_buffer(i),i=ipoint,ipoint+ibuf_poi-1) else read(kbase,'(a)') + (cdatabase_buffer(i),i=ipoint,ipoint+ibuf_poi-1) endif close(kbase) do k=1,iseq do ipos=1,nbuffer_len cbuffer_line(ipos:ipos)=cdatabase_buffer(ipoint) ipoint=ipoint+1 enddo read(cbuffer_line(1:),111)n2in,name_2, + ACCESSION_2,pdbref_2,compnd_2 c read(cbuffer_line(1:),111)n2in,name_2, c + compnd_2,ACCESSION_2,pdbref_2 iend=n2in if(n2in .gt. maxsq)iend=maxsq do ipos=1,iend csq_2(ipos:ipos)=cdatabase_buffer(ipoint) ipoint=ipoint+1 enddo if(n2in .gt. maxsq)then ipoint=ipoint + (n2in-iend) n2in=maxsq endif call do_align(lh1,lh2,idproc,ialign,nrecord,sdev) nseq_warm_start=nseq_warm_start+1 enddo else write(logstring,*)' ** FATAL ERROR **/n'// + ' database_buffer overflow increase/n'// + ' dimension of MAXDATABASE_BUFFER' call log_file(klog,logstring,1) STOP endif c lendbase=.false. c do while(.not. lendbase) c call get_swiss_entry(maxsq,kbase,lbinary,n2in,name_2, c + compnd_2,ACCESSION_2,pdbref_2,csq_2,lendbase) c if(.not. lendbase)then c if (lwarm_start) then c if( (ipoint + nbuffer_len + n2in) .gt. c + maxdatabase_buffer)then c write(*,*)' **** FATAL ERROR ****' c write(*,*)' database_buffer overflow increase' c write(*,*)' dimension of MAXDATABASE_BUFFER' c STOP c endif c write(cbuffer_line(1:),'(i6,a,a,a,a)')n2in,name_2, c + compnd_2,ACCESSION_2,pdbref_2 c do ipos=1,nbuffer_len c cdatabase_buffer(ipoint)=cbuffer_line(ipos:ipos) c ipoint=ipoint+1 c enddo c do ipos=1,n2in c cdatabase_buffer(ipoint)=csq_2(ipos:ipos) c ipoint=ipoint+1 c enddo c endif c call do_align(lh1,lh2,idproc,ialign,nrecord,sdev) c nseq_warm_start=nseq_warm_start+1 c endif c enddo c close(kbase) else write(*,*)' host warm-start: ',nseq_warm_start do i=1,nseq_warm_start do ipos=1,nbuffer_len cbuffer_line(ipos:ipos)=cdatabase_buffer(ipoint) ipoint=ipoint+1 enddo c write(*,*)cbuffer_line,ipoint read(cbuffer_line(1:),111)n2in,name_2, + compnd_2,ACCESSION_2,pdbref_2 iend=n2in if(n2in .gt. maxsq)iend=maxsq do ipos=1,iend csq_2(ipos:ipos)=cdatabase_buffer(ipoint) ipoint=ipoint+1 enddo if(n2in .gt. maxsq)then ipoint=ipoint + (n2in-iend) n2in=maxsq endif call do_align(lh1,lh2,idproc,ialign,nrecord,sdev) enddo endif return end C END HOST_INTERFACE.................................................... profphd-utils-1.0.10/lib-maxhom.f0000755015075101507510000214354112012400773016104 0ustar lkajanlkajan*----------------------------------------------------------------------* C---- ------------------------------------------------------------------ C---- contains now all previously needed libraries for Schneider stuff C---- ------------------------------------------------------------------ C...................................................................... C FUN EMPTYSTRING(STRING) FUNCTION EMPTYSTRING(STRING) LOGICAL EMPTYSTRING CHARACTER*(*) STRING EMPTYSTRING=.TRUE. DO I=1,LEN(STRING) IF (STRING(I:I).NE.' ') THEN EMPTYSTRING=.FALSE. GOTO 10 ENDIF ENDDO 10 RETURN END C END EMPTYSTRING C...................................................................... C...................................................................... C FUN LCHAINBREAK LOGICAL FUNCTION LCHAINBREAK (CS,IS) C CS March 1988 C Check for '!', which is DSSP chain break CHARACTER CS*1 LCHAINBREAK=CS .EQ. '!' IF (LCHAINBREAK) THEN WRITE (*,*)'INFO: chain break detected at residue',IS ENDIF RETURN END C END LCHAINBREAK C...................................................................... C...................................................................... C FUN LEGALRES LOGICAL FUNCTION LEGALRES(CS,IS,TRANS,NTRANS,PUNCTUATION) C Brigitte Altenberg Dec 1987, changes by CS March 1988 C Check for legal residues. Unknown residues are reported (warning), C except for declared punctation. CHARACTER*(*) PUNCTUATION,TRANS,CS C Punctations are not reported. They are format-specific. C PIR: PUNCTUATION=' ,.:;()+' LEGALRES=.TRUE. L=INDEX(TRANS(1:NTRANS),CS) IF (L.EQ.0 .AND. PUNCTUATION .NE.' ') THEN M=INDEX(PUNCTUATION,CS) IF (M.EQ.0) THEN WRITE (*,*)'LEGALRES: unknown RESIDUE:',CS, + ': with ASCIIcode: ',ICHAR(CS), + ' after sequence position', IS WRITE (*,*)'CAUTION: GETAASEQ will replace this by "-"' ENDIF LEGALRES=.FALSE. ENDIF RETURN END C END LEGALRES C...................................................................... C...................................................................... C SUB ACC_TO_INT SUBROUTINE ACC_TO_INT(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES,NIOSTATES,IORANGE,NRES,LSQ,LSTR,NACC,LACC) IMPLICIT NONE C import INTEGER NTRANS CHARACTER*(*) TRANS INTEGER MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES,NIOSTATES REAL IORANGE(MAXSTRSTATES,MAXIOSTATES) INTEGER NRES,NACC(*),LSQ(*),LSTR(*) C export INTEGER LACC(*) C internal INTEGER MAXAA PARAMETER (MAXAA= 26) INTEGER ACCMAX(MAXAA),I,IOSTATE,ISTR REAL PER C max. Acc. in order of TRANS (VLIMFWYGAPSTCHRKQENDBZX!-.) C V L I M F W Y G A P S T C 142,164,169,188,197,227,222,84,106,136,130,142 C C H R K Q E N D B Z X ! - . C 135,184,248,205,198,194,157,163,157,194,0,0,0 0 DATA ACCMAX /142,164,169,188,197,227,222,84,106,136,130,142, + 135,184,248,205,198,194,157,163,157,194,0,0,0,0/ IF (TRANS .NE. 'VLIMFWYGAPSTCHRKQENDBZX!-.' ) THEN WRITE(6,*)'*** ERROR: TRANS NOT IN RIGHT ORDER in ACC_TO_INT' STOP ENDIF IF (NTRANS .GT. MAXAA) THEN WRITE(6,*)'*** ERROR: NTRANS .GT. MAXAA IN ACC_TO_INT' STOP ENDIF IF (NIOSTATES .EQ. 1) THEN CALL INIT_INT_ARRAY(1,NRES,LACC,1) RETURN ENDIF DO I=1,NRES IF (LSQ(I) .EQ. 0) THEN LACC(I)=0 ELSE ISTR=LSTR(I) IF (NSTRSTATES .EQ. 1)ISTR=1 IF (ISTR .EQ. 0) THEN WRITE(6,*)'*** ERROR: LSTR .EQ. 0 IN ACC_TO_INT' STOP ENDIF IF (ACCMAX(LSQ(I)) .NE. 0) THEN PER=(NACC(I)*100.0) / ACCMAX(LSQ(I)) IF (PER .GE. 100.0)PER=100.0 IOSTATE=1 DO IOSTATE=1,NIOSTATES IF (PER .LE. IORANGE(ISTR,IOSTATE) ) THEN LACC(I)=IOSTATE GOTO 100 ENDIF ENDDO ELSE LACC(I)=1 ENDIF 100 CONTINUE c100 if (i .le. 10) then c WRITE(6,*)' acctoint I,LSTR,LACC : ',i,iSTR, c + lacc(i) c WRITE(6,*)accmax(lsq(i)),nacc(i),per c endif ENDIF ENDDO RETURN END C END ACC_TO_INT C...................................................................... C...................................................................... C SUB ALISEQENVIRONMENT SUBROUTINE ALISEQENVIRONMENT(MAXRES,MAXALIGNS, 1 NRES,NALIGN,IFIR,ILAS,INSNUMBER,INSALI,INSLEN, 2 INSAP,LINS,NINS,TOTALINSLEN,ERROR) C 21.6.93 IMPLICIT NONE C IMPORT INTEGER MAXRES,MAXALIGNS,NRES,NALIGN,INSNUMBER INTEGER IFIR(*),ILAS(*),INSALI(*),INSLEN(*),INSAP(*) C EXPORT INTEGER TOTALINSLEN(MAXRES) INTEGER*2 NINS(MAXRES,0:MAXALIGNS) LOGICAL ERROR,LINS(MAXRES) C INTERNAL INTEGER*2 INT2_TEMP INTEGER MAXALIGNS_LOC PARAMETER (MAXALIGNS_LOC= 8765) C PARAMETER (MAXALIGNS_LOC= 12345) C PARAMETER (MAXALIGNS_LOC= 21987) INTEGER IALIGN,IPOS,IAP,IINS,TIL *----------------------------------------------------------------------* IF ( NALIGN .GT. MAXALIGNS .OR. 1 NALIGN .GT. MAXALIGNS_LOC ) THEN WRITE(6,'(1X,A)') 1 'MAXALIGNS overflow in AliseqEnvironment !' ERROR = .TRUE. RETURN ENDIF IF ( NRES .GT. MAXRES ) THEN WRITE(6,'(1X,A)') 'MAXRES overflow in AliseqEnvironment !' ERROR = .TRUE. RETURN ENDIF IPOS = 0 IINS = 1 DO IAP = 1,NRES TOTALINSLEN(IAP) = 0 NINS(IAP,0) = 0 ENDDO DO IALIGN = 1,NALIGN DO WHILE ( IINS .LT. INSNUMBER .AND. 1 INSALI(IINS) .LT. IALIGN ) IINS = IINS + 1 ENDDO DO IAP = IFIR(IALIGN),ILAS(IALIGN) IPOS = IPOS + 1 IF ( INSALI(IINS) .EQ. IALIGN .AND. 1 INSAP(IINS) .EQ. IAP ) THEN NINS(IAP,IALIGN) = IINS C CONVERSION INT*4 TO INT*2 INT2_TEMP = INSLEN(IINS) NINS(IAP,0) = MAX(NINS(IAP,0),INT2_TEMP) LINS(IAP) = .TRUE. IF ( INSALI(IINS+1) .EQ. IALIGN ) IINS = IINS + 1 ENDIF ENDDO ENDDO TIL = 0 DO IAP = 1,NRES IF ( LINS(IAP) ) TIL = TIL + NINS(IAP,0) TOTALINSLEN(IAP) = TIL ENDDO RETURN END C END ALISEQENVIRONMENT C...................................................................... C...................................................................... C SUB ALITOSTRUCRMS SUBROUTINE ALITOSTRUCRMS(MAXALSQ,MAXSQ,BRKFILE_1,BRKFILE_2, + KBRK,PDBNO_1,CHAINID_1,PDBNO_2,CHAINID_2, + ALI_1,ALI_2,LENALI,IFIR,ILAS,JFIR,JLAS,LCALPHA,RMS) C RS 89 C import an alignment, cut it in pieces (if necessary) and C calculate the RMS between pieces C use routines SETPIECES,GETCOOR,COMPALISTRUC C IMPORT : C BRKFILE_1,BRKFILE_2 : filename of coordinate files C KBRK : unit for coordinate files C ALI_1,ALI_2 : alignment string (see remark in SETPIECES) C LENALI : length of alignment including insertions C IFIR,ILAS : first and last position of seq 1 C JFIR,JLAS : first and lasr position of seq 2 C LCALPHA : compare only C-alpha atoms if true C OUTPUT: C RMS C IMPLICIT NONE C---- import INTEGER MAXALSQ,MAXSQ INTEGER KBRK,LENALI,IFIR,ILAS,JFIR,JLAS CHARACTER*(*) BRKFILE_1,BRKFILE_2 CHARACTER*1 ALI_1(MAXALSQ),ALI_2(MAXALSQ) CHARACTER*1 CHAINID_1(MAXSQ),CHAINID_2(MAXSQ) INTEGER PDBNO_1(MAXSQ),PDBNO_2(MAXSQ) REAL RMS C---- internal parameters INTEGER MXRES,MXATM C PARAMETER (MXRES= 10000) C PARAMETER (MXRES= 19876) PARAMETER (MXRES= 9999) PARAMETER (MXATM=10*MXRES) C---- internal variables C REAL RMS C INTEGER LENALI,IFIR,ILAS,JFIR,JLAS,KBRK C C if true compare only C-alpha LOGICAL LCALPHA CHARACTER*200 BRKBEFORE1,BRKBEFORE2 c alignment C CHARACTER*1 ALI_1(MAXALSQ),ALI_2(MAXALSQ) C CHARACTER*(*) CHAINID_1(*),CHAINID_2(*) C INTEGER PDBNO_1(*),PDBNO_2(*) C very long sequences are cut in pieces INTEGER NSHIFTED COMMON/CSHIFT1/NSHIFTED LOGICAL LSHIFTED COMMON/CSHIFT2/LSHIFTED c molecule attributes C CHARACTER*(*) BRKFILE_1,BRKFILE_2 CHARACTER NAMMOL1(5)*200,NAMMOL2(5)*200 INTEGER NRES_1,NRES_2,NATM1,NATM2 c residue attributes ; number and chain CHARACTER*6 CIDRES_1(MXRES),CIDRES_2(MXRES) C points to first, last and CEN atom. center residue coors INTEGER IPATM1RES(3,MXRES),IPATM2RES(3,MXRES) REAL RRES1(3,MXRES),RRES2(3,MXRES) C atom attributes C atom belongs to res number IPRESATM C atom coors C superposition weights. CHARACTER*4 NAMATM1(MXATM),NAMATM2(MXATM) INTEGER IPRES1ATM(MXATM),IPRES2ATM(MXATM) REAL RATM1(3,MXATM),RATM2(3,MXATM) REAL WSUP1(MXATM),WSUP2(MXATM) c piece attributes INTEGER MXPIECES PARAMETER (MXPIECES=50) INTEGER IRESPIE,NPIECES,NRESPIE,NATMPIE COMMON /CPIECE/IRESPIE(2,2,MXPIECES),NPIECES,NRESPIE(2), + NATMPIE(2) C compare only if sequences of BRK and DSSP are the same LOGICAL LCHECK *----------------------------------------------------------------------* C---- ------------------------------------------------------------------ C C---- ------------------------------------------------------------------ c get pieces from alignment IF (LSHIFTED) THEN RMS=-1.0 RETURN ENDIF CALL SETPIECES(MAXALSQ,ALI_1,ALI_2,LENALI,IFIR,ILAS,JFIR, + JLAS,IRESPIE,MXPIECES,NPIECES) c.get coordinates c if coordinates are still in memory dont read them again IF (BRKFILE_1 .NE. BRKBEFORE1) THEN CALL GETCOORFORHSSP(BRKFILE_1,KBRK,NAMMOL1,NRES_1,NATM1,MXRES, + MXATM,CIDRES_1,IPATM1RES,RRES1, + NAMATM1,IPRES1ATM,RATM1) ENDIF IF (BRKFILE_2.NE.BRKBEFORE2) THEN CALL GETCOORFORHSSP(BRKFILE_2,KBRK,NAMMOL2,NRES_2,NATM2,MXRES, + MXATM,CIDRES_2,IPATM2RES,RRES2, + NAMATM2,IPRES2ATM,RATM2) ENDIF IF (NRES_1.EQ.0 .OR. NRES_2.EQ.0) THEN WRITE(6,*)'**** IN ALITOSTRUCRMS *****' WRITE(6,*)' READ ERROR IN FILE: ',BRKFILE_1,' OR ',BRKFILE_2 WRITE(6,*)' STRUCTURE ALIGNMENT SKIPPED ' RMS=-1.0 RETURN ELSE BRKBEFORE1=BRKFILE_1 BRKBEFORE2=BRKFILE_2 ENDIF CALL CHECKPOSITION(PDBNO_1,CHAINID_1,PDBNO_2,CHAINID_2, + CIDRES_1,CIDRES_2,NRES_1,NRES_2,LCHECK) IF (LCHECK) THEN CALL COMPALISTRUC(BRKFILE_1,BRKFILE_2, + NRES_1,NRES_2,NATM1,NATM2, + IPATM1RES,IPATM2RES,RRES1,RRES2, + RATM1,RATM2,WSUP1,WSUP2,LCALPHA,RMS) ELSE RMS=-1.0 ENDIF RETURN END C END ALITOSTRUCRMS C...................................................................... C...................................................................... C SUB ASCIIFILTER SUBROUTINE ASCIIFILTER(LINE) C Chris Sander, May 1986 (changed by RS 92) C replaces non-printable characters by blanks. C specification in terms of ASCII-table integers C system and choice dependent PARAMETER (LOWLIMIT= 32) PARAMETER (HILIMIT= 126) c import CHARACTER*(*) LINE *----------------------------------------------------------------------* CALL STRPOS(LINE,IBEG,IEND) DO I=IBEG,IEND IASCII=ICHAR(LINE(I:I)) IF ( IASCII .LT. LOWLIMIT .OR. IASCII .GT. HILIMIT ) THEN LINE(I:I)=' ' WRITE(6,*)'* ASCIIFILTER: funny character replaced by blank' WRITE(6,*)' integer value is: ',IASCII ENDIF ENDDO RETURN END C END ASCII-FILTER C...................................................................... C...................................................................... C SUB CALC_PROF SUBROUTINE CALC_PROF(MAXRES,MAXAA,NRES,PDBSEQ,NALIGN, + EXCLUDEFLAG,IDE,IFIR,ILAS,ALISEQ,ALIPOINTER,TRANS, + SEQPROF,NOCC,NDEL,NINS,ENTROPY,RELENT) IMPLICIT NONE C import REAL IDE(*) INTEGER MAXRES,MAXAA,NRES,NALIGN, + IFIR(*),ILAS(*),ALIPOINTER(*) CHARACTER PDBSEQ(*),ALISEQ(*),EXCLUDEFLAG(*) CHARACTER*(*) TRANS C export INTEGER SEQPROF(MAXRES,MAXAA),RELENT(*), + NDEL(*),NINS(*),NOCC(*) REAL ENTROPY(*) C internal INTEGER NASCII,MAXALIGNS_LOC PARAMETER (NASCII= 256) PARAMETER (MAXALIGNS_LOC= 8765) C PARAMETER (MAXALIGNS_LOC= 12345) C PARAMETER (MAXALIGNS_LOC= 21987) REAL SUMENTROPY,X,XENTROPY,XMAXENTROPY INTEGER IRES,IALIGN,IPOS,I,J, + LOWERPOS(NASCII),ITEST INTEGER*2 INS_START(MAXALIGNS_LOC) CHARACTER C1,LOWER*26 *----------------------------------------------------------------------* WRITE(6,*)' CALC_PROF' IF (NALIGN .GT. MAXALIGNS_LOC) THEN WRITE(6,*)' CALC_PROF: MAXALIGNS_LOC overflow' STOP ELSE IF (NALIGN .LE. 0) THEN RETURN ENDIF C used to convert lower case characters from the DSSP-seq to 'C' (Cys) LOWER='abcdefghijklmnopqrstuvwxyz' CALL GETPOS(LOWER,LOWERPOS,NASCII) C initialize DO I=1,MAXRES DO J=1,MAXAA SEQPROF(I,J)=0 ENDDO NOCC(I)=0 NDEL(I)=0 NINS(I)=0 ENTROPY(I)=0 RELENT(I)=0 ENDDO DO IALIGN=1,NALIGN INS_START(IALIGN)=0 ENDDO C CALCULATE SEQUENCE PROFILE AND ENTROPY SUMENTROPY=0.0 DO IRES=1,NRES C residue of DSSP-sequence (SEQ1) C1=PDBSEQ(IRES) C convert lower case character in DSSP to 'Cys' I=LOWERPOS(ICHAR(C1)) IF (I.NE.0) C1='C' CALL GETSEQPROF(C1,TRANS,IRES,NOCC,SEQPROF,MAXRES,MAXAA) C residues of aligned sequences DO IALIGN=1,NALIGN IF ( EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN IF (IRES.GE.IFIR(IALIGN).AND. + IRES.LE.ILAS(IALIGN)) THEN IPOS=ALIPOINTER(IALIGN)+IRES-IFIR(IALIGN) C1=ALISEQ(IPOS) ELSE C1=' ' ENDIF I=LOWERPOS(ICHAR(C1)) C if lower case character: insertions IF (I.NE.0 .AND. INS_START(IALIGN) .EQ. 0) THEN NINS(IRES)=NINS(IRES)+1 CALL LOWTOUP(C1,1) INS_START(IALIGN)=1 ELSE IF (INS_START(IALIGN) .EQ. 1) THEN INS_START(IALIGN)=0 ENDIF IF (C1 .NE.' ' ) THEN IF (C1.NE.'.') THEN CALL GETSEQPROF(C1,TRANS,IRES,NOCC, + SEQPROF,MAXRES,MAXAA) ELSE C if '.' : deletion NDEL(IRES)=NDEL(IRES)+1 ENDIF ENDIF ENDIF ENDDO ENDDO C calculate ENTROPY DO IRES=1,NRES SUMENTROPY=0.0 IF (NOCC(IRES).GT.1) THEN DO I=1,MAXAA IF (SEQPROF(IRES,I).NE. 0) THEN X=FLOAT (SEQPROF(IRES,I)) / FLOAT (NOCC(IRES)) XENTROPY=X * (-LOG(X)) SUMENTROPY=SUMENTROPY+XENTROPY ENDIF ENDDO ENTROPY(IRES)=SUMENTROPY IF (NOCC(IRES).LE.20) THEN XMAXENTROPY = -LOG (1 / FLOAT(NOCC(IRES))) ELSE C log(0.05) = ln (1/20) XMAXENTROPY = -LOG(0.05) ENDIF RELENT(IRES)=NINT(SUMENTROPY*100/ XMAXENTROPY) ENDIF ENDDO C normalize sequence profile DO IRES=1,NRES DO I=1,MAXAA IF (NOCC(IRES).GE.1) THEN X=FLOAT(SEQPROF(IRES,I)) *100.0 / FLOAT(NOCC(IRES)) SEQPROF(IRES,I)=NINT(X) ENDIF ENDDO C ITEST=0 C DO I=1,MAXAA C ITEST=ITEST+SEQPROF(IRES,I) C ENDDO C IF (ITEST .NE. 100) THEN C WRITE(6,*)'calc_prof: itest .ne. 100: ',itest C WRITE(6,*)ires,nocc(ires) C ENDIF ENDDO RETURN END C END CALCPROFILE C...................................................................... C...................................................................... C SUB CALC_VAR SUBROUTINE CALC_VAR(NALIGN,NRES,PDBSEQ,IDE,IFIR,ILAS, + ALIPOINTER,ALISEQ,EXCLUDEFLAG, + MAXSTRSTATES,MAXIOSTATES,NTRANS,MATSEQ, + MATRIX,VAR) C---- import IMPLICIT NONE INTEGER NALIGN,NRES,NTRANS,MAXSTRSTATES,MAXIOSTATES, + IFIR(*),ILAS(*),ALIPOINTER(*) CHARACTER PDBSEQ(*), ALISEQ(*),EXCLUDEFLAG(*) REAL IDE(*) C used for variability CHARACTER*(*) MATSEQ REAL MATRIX(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) C---- export INTEGER VAR(*) C---- internal INTEGER MAXRES,NASCII PARAMETER (NASCII= 256) PARAMETER (MAXRES= 9999) C PARAMETER (MAXRES= 10000) C PARAMETER (MAXRES= 19876) INTEGER I,J,IALIGN,JALIGN,ILEN,IRES, + IPOS,JPOS,IBEG,IEND,IAGR,ICYS,MALIGN,KALIGN, + IPDB_SEQ(MAXRES), + IALIGN_SEQ(MAXRES),JALIGN_SEQ(MAXRES) REAL SUMVAR(MAXRES),SUMDIST(MAXRES), + TMPVAL(MAXRES),SEQDIST LOGICAL LEGALRES(MAXRES) C value of best match REAL VALMAX C only used to get rid of INDEX command (CPU time) INTEGER MATPOS(NASCII),LOWERPOS(NASCII) CHARACTER LOWER*26 *----------------------------------------------------------------------* C---- ------------------------------------------------------------------ WRITE(6,*)' calc_var' C used to convert lower case characters from the DSSP-seq to 'C' (Cys) LOWER='abcdefghijklmnopqrstuvwxyz' CALL GETPOS(LOWER,LOWERPOS,NASCII) C calculate variability only for the 22 (BZ) amino acids DO I=1,NASCII MATPOS(I)=0 ENDDO CALL GETPOS(MATSEQ(1:22),MATPOS,NASCII) IF (NRES .GT. MAXRES) THEN WRITE(6,*)'ERROR: nres.gt.maxres in calc_var' WRITE(6,*)'**** increase maxres ****' STOP ENDIF C---- initialise VALMAX=0.0 DO I=1,NTRANS DO J=1,NTRANS IF (MATRIX(J,I,1,1,1,1) .GT. VALMAX) THEN VALMAX=MATRIX(J,I,1,1,1,1) ENDIF ENDDO ENDDO DO I=1,NRES VAR(I)=0 SUMVAR(I)=0.0 SUMDIST(I)=0.0 IALIGN_SEQ(I)=0 JALIGN_SEQ(I)=0 ENDDO IF (NALIGN .LE. 0) RETURN C..................................................... C CALCULATE VARIABILITY C variability= distance(k,l) * matrix(i,j,1,1,1,1) C k,l = sequence C i,j = residue C distance= 1-(matches/length) C length=length of alignment - gaps C C convert DSSP-seq and first 'good' alignment seq to integers ICYS=MATPOS(ICHAR('C')) DO I=1,NRES IPDB_SEQ(I)=MATPOS( ICHAR(PDBSEQ(I) ) ) IF ( IPDB_SEQ(I) .EQ. 0) THEN J=LOWERPOS( ICHAR(PDBSEQ(I)) ) IF (J .NE. 0) IPDB_SEQ(I)=ICYS ENDIF ENDDO C find last alignment to be considered and store sequence of last C alignment in ialign_seq for first iteration of next loop MALIGN=0 DO IALIGN=1,NALIGN IF ( EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN MALIGN=IALIGN ENDIF ENDDO IALIGN=MALIGN C---- BR 99.09: correct if none found IF (IALIGN .EQ. 0) RETURN IPOS=ALIPOINTER(IALIGN)-IFIR(IALIGN) DO IRES=IFIR(IALIGN),ILAS(IALIGN) IF (IRES .GT. 0) THEN IALIGN_SEQ(IRES)=MATPOS( ICHAR( ALISEQ(IPOS+IRES) ) ) IF ( IALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(IPOS+IRES) .GE. 'a' .AND. + ALISEQ(IPOS+IRES) .LE. 'z') THEN IALIGN_SEQ(IRES)=MATPOS(ICHAR(ALISEQ(IPOS+IRES))-32) ENDIF ENDIF END IF ENDDO C loop from last 'good' alignment till first DO IALIGN=MALIGN,1,-1 IF ( IALIGN .GT. 0 .AND. + EXCLUDEFLAG(IALIGN) .EQ. ' ' .AND. + IFIR(IALIGN) .GT. 0) THEN C distance between PDBseq and alignment SEQDIST=1.0-IDE(IALIGN) C accumulate distance etc. DO IRES=IFIR(IALIGN),ILAS(IALIGN) IF (IPDB_SEQ(IRES).NE.0.AND.IALIGN_SEQ(IRES).NE.0) THEN SUMVAR(IRES)=SUMVAR(IRES) + + (SEQDIST * + MATRIX(IPDB_SEQ(IRES),IALIGN_SEQ(IRES),1,1,1,1)) SUMDIST(IRES)=SUMDIST(IRES)+SEQDIST ENDIF ENDDO ENDIF C pairwise comparison of alignend sequences from first to "ialign" C store last 'good' alignment before "ialign" in "kalign" so we can C use the last "jalign"-seq for the next iteration of the "ialign"-seq KALIGN=0 DO JALIGN=1,IALIGN-1 IF ( EXCLUDEFLAG(JALIGN) .EQ. ' ') THEN KALIGN=JALIGN JPOS=ALIPOINTER(JALIGN)-IFIR(JALIGN) DO IRES=IFIR(JALIGN),ILAS(JALIGN) IF (IRES .GT. 0) THEN JALIGN_SEQ(IRES)= + MATPOS( ICHAR( ALISEQ(JPOS+IRES) ) ) IF ( JALIGN_SEQ(IRES) .EQ. 0) THEN IF (ALISEQ(JPOS+IRES) .GE. 'a' .AND. + ALISEQ(JPOS+IRES) .LE. 'z') THEN JALIGN_SEQ(IRES)= + MATPOS(ICHAR(ALISEQ(JPOS+IRES))-32) ENDIF ENDIF ENDIF ENDDO SEQDIST=0.0 IAGR=0 ILEN=0 C get distance between overlap of alignend seqs IBEG= MAX(IFIR(IALIGN),IFIR(JALIGN)) IEND= MIN(ILAS(IALIGN),ILAS(JALIGN)) DO IRES= IBEG,IEND IF (IRES .GT. 0) THEN LEGALRES(IRES)=.FALSE. IF ( IALIGN_SEQ(IRES) .NE. 0 .AND. + JALIGN_SEQ(IRES) .NE. 0) THEN LEGALRES(IRES)=.TRUE. IF (IALIGN_SEQ(IRES) .EQ. JALIGN_SEQ(IRES)) + IAGR=IAGR+1 TMPVAL(IRES)= + MATRIX(IALIGN_SEQ(IRES), + JALIGN_SEQ(IRES),1,1,1,1) ILEN=ILEN+1 ENDIF ENDIF ENDDO IF (ILEN .NE. 0) THEN SEQDIST=1.0-(FLOAT(IAGR)/ILEN) DO IRES=IBEG,IEND IF (LEGALRES(IRES)) THEN SUMDIST(IRES)=SUMDIST(IRES)+SEQDIST SUMVAR(IRES)=SUMVAR(IRES)+(SEQDIST*TMPVAL(IRES)) ENDIF ENDDO ENDIF ENDIF ENDDO IF (KALIGN .GT. 0) THEN DO I=IFIR(KALIGN),ILAS(KALIGN) IF (I .GT. 0) THEN IALIGN_SEQ(I)=JALIGN_SEQ(I) ENDIF ENDDO ENDIF ENDDO C calculate variability DO IRES=1,NRES IF (SUMDIST(IRES) .NE. 0.0) THEN VAR(IRES)=NINT((VALMAX- (SUMVAR(IRES)/SUMDIST(IRES)) )*100) ENDIF ENDDO RETURN END C END CALC_VAR C...................................................................... C...................................................................... C SUB CHARARRAYREPL c subroutine CHARARRAYREPL(string,length,c1,c2) c Implicit None C replaces all occurences of c1 by c2 C Import c integer length c character*1 c1, c2 C Import/Export c character string(*) C Internal c integer ipos c do ipos = 1,length c if ( string(ipos) .eq. c1 ) string(ipos) = c2 c enddo c return c end C END CHARARRAYREPL C...................................................................... C...................................................................... C SUB CHECKFORMAT SUBROUTINE CHECKFORMAT(IN,INNAME,FORMATNAME,ERRFLAG) C CHECK IF FORMAT ONE OF :DSSP,PIR,EMBL,GCG OR SOMETHING NOT SPECIFIED LOGICAL ERRFLAG CHARACTER*(*) FORMATNAME,INNAME CHARACTER*1000 FILENAME,LINE FORMATNAME='UNK' LINE=' ' FILENAME=' ' I=INDEX(INNAME,'_!_') J=0 K=0 C J=INDEX(INNAME,'hssp_') C K=INDEX(INNAME,'dssp_') L=INDEX(INNAME,'dssp_ca_') M=INDEX(INNAME,'dssp_mod') IF (I.NE.0) THEN FILENAME=INNAME(:I-1) ELSE IF (J .NE. 0) THEN FILENAME=INNAME(:J+3) ELSE IF ( (K .NE. 0) .AND. (L .LE. 0) .AND. (M .EQ. 0) ) THEN FILENAME=INNAME C FILENAME=INNAME(:K+3) ELSE FILENAME=INNAME ENDIF CALL OPEN_FILE(IN,FILENAME,'READONLY,OLD',ERRFLAG) IF (ERRFLAG) THEN WRITE(6,*)' open file error in CHECKFORMAT' GOTO 11 ENDIF I=0 LENGTH=LEN(LINE) DO WHILE(.TRUE.) I=I+1 READ(IN,'(A)',END=99) LINE IF (INDEX(LINE(:2),'ID').NE.0) THEN DO WHILE (.TRUE.) C LOOK FOR DIFF:EMBL,GCG READ (IN,'(A)',END=10)LINE IF (INDEX(LINE,'..').NE.0) THEN C there are still some swissprot files with '..' CALL LOWTOUP(LINE,LENGTH) IF (INDEX(LINE,'CHECK:').NE.0 .AND. + INDEX(LINE,'MSF:').NE.0) THEN FORMATNAME='MSF' GOTO 99 ELSE IF (INDEX(LINE,'CHECK:').NE.0) THEN FORMATNAME='GCG' GOTO 99 ENDIF ENDIF ENDDO 10 FORMATNAME='EMBL' ENDIF IF (INDEX(LINE,'PROGRAM DSSP,').NE.0) THEN FORMATNAME='DSSP' GOTO 99 ELSE IF (INDEX(LINE,'program DSSP,').NE.0) THEN FORMATNAME='DSSP' GOTO 99 ELSE IF (INDEX(LINE,'program dssp,').NE.0) THEN FORMATNAME='DSSP' GOTO 99 ELSE IF (INDEX(LINE,'-PROFILE').NE.0) THEN FORMATNAME='PROFILE' IF (INDEX(LINE,'SECONDARY').NE.0) THEN FORMATNAME='PROFILE-DSSP' ENDIF GOTO 99 ELSE IF (INDEX(LINE(1:5),'HSSP ').NE.0) THEN FORMATNAME='HSSP' GOTO 99 ELSE IF ( (I.EQ.1) .AND. (LINE(1:6) .EQ. 'HEADER') ) THEN FORMATNAME='BRK' GOTO 99 ELSE IF (INDEX(LINE,'..').NE.0) THEN CALL LOWTOUP(LINE,LENGTH) IF (INDEX(LINE,'CHECK:').NE.0 .AND. + INDEX(LINE,'MSF:').NE.0) THEN FORMATNAME='MSF' GOTO 99 ELSE IF (INDEX(LINE,'CHECK:').NE.0) THEN FORMATNAME='GCG' GOTO 99 ENDIF ELSE IF ( LINE(1:1) .EQ. '>') THEN FORMATNAME='FASTA' READ(IN,'(A)',END=99)LINE IF (LINE .NE. ' ') THEN CALL LOWTOUP(LINE,LEN(LINE)) CALL STRPOS(LINE,ISTART,ISTOP) DO I=ISTART,ISTOP IASCII=ICHAR(LINE(I:I)) IF ( (IASCII .EQ. 85) .OR. + (IASCII .EQ. 79) .OR. + (IASCII .EQ. 74) .OR. + (IASCII .GE. 33 .AND. IASCII .LE. 64) .OR. + (IASCII .GE. 91 ) ) THEN FORMATNAME='PIR' GOTO 20 ENDIF ENDDO ELSE FORMATNAME='PIR' ENDIF 20 DO WHILE(.TRUE.) READ(IN,'(A)',END=99)LINE IF ( LINE(1:1) .EQ. '>') THEN FORMATNAME='FASTA-DB' GOTO 99 ENDIF ENDDO GOTO 99 ELSE IF (LINE(1:1) .EQ. '*') THEN FORMATNAME='STAR' GOTO 99 ENDIF ENDDO 99 CLOSE (IN) RETURN 11 RETURN END C END CHECKFORMAT C...................................................................... C...................................................................... C SUB CHECKHSSPCUT SUBROUTINE CHECKHSSPCUT(LEN,IDENTITY,ISOLEN,ISOIDE,NSTEP, + LFORMULA,LALL,ISAFE,LCONSIDER,DISTANCE) C RS 89 C check if sequence identity <==> length of alignment are in the 'good' C part of the HSSP-PLOT C if OK : LCONSIDER= TRUE IMPLICIT NONE INTEGER I,LEN,IRANGE,JRANGE REAL IDENTITY,DISTANCE,Y LOGICAL LCONSIDER,LFORMULA,LALL INTEGER ISOLEN(*),NSTEP,ISAFE REAL ISOIDE(*) LCONSIDER=.FALSE. DISTANCE=0.0 C equation from cutoffs in the HSSP-plot IF (LFORMULA .OR. LALL) THEN IF (LEN.LT.10) THEN IF (.NOT. LFORMULA)LCONSIDER=.TRUE. RETURN ENDIF IF (LEN.GT.200) THEN Y= 24.767 + ISAFE C DISTANCE IS ALWAYS DISTANCE FROM ORIGINAL CURVE DISTANCE=IDENTITY - (290.15* (200**(-0.56158)) ) ELSE Y=( 290.15* (LEN**(-0.56158)) ) + ISAFE C distance is always distance from original curve DISTANCE=IDENTITY - (290.15* (LEN**(-0.56158)) ) ENDIF IF (IDENTITY .GE. Y)LCONSIDER=.TRUE. IF (.NOT. LFORMULA)LCONSIDER=.TRUE. RETURN ELSE C dont consider alignments less than smallest length in datafile IF (LEN .GE. ISOLEN(1)) THEN DO I=1,NSTEP IRANGE=ISOLEN(I) C if length is longer than longest specified set upper range to LENGTH+1 IF (I.NE.NSTEP) THEN JRANGE=ISOLEN(I+1) ELSE JRANGE=LEN+1 ENDIF IF (LEN .GE. IRANGE .AND. LEN .LT. JRANGE) THEN C if identity .GE. than ISOSIG-data IF (IDENTITY.GE.ISOIDE(I)) THEN LCONSIDER=.TRUE. DISTANCE=IDENTITY-ISOIDE(I) CD WRITE(6,*)len,identity,isolen(i),isoide(i) GOTO 10 ENDIF ENDIF ENDDO ELSE LCONSIDER=.FALSE. ENDIF 10 RETURN ENDIF END C END CHECKHSSPCUT C...................................................................... C...................................................................... C SUB CHECKHSSPCUT99 SUBROUTINE CHECKHSSPCUT99(LEN,IDENTITY,ISOLEN,ISOIDE,NSTEP, + LFORMULA,LALL,ISAFE,LCONSIDER,DISTANCE) C RS 1989 C BR 2003 C check if sequence identity <==> length of alignment are in the 'good' C part of the HSSP-PLOT C now the new one taken: C C pide= 480 * L ^ { -0.32 (1 + e ^-(L/1000)) } C C if OK (ie the particular pair IS taken): LCONSIDER= TRUE C---- local variables IMPLICIT NONE INTEGER I,LEN,IRANGE,JRANGE REAL IDENTITY,DISTANCE,Y,X1,X2 LOGICAL LCONSIDER,LFORMULA,LALL INTEGER ISOLEN(*),NSTEP,ISAFE REAL ISOIDE(*) ******------------------------------*-----------------------------****** C---- defaults LCONSIDER= .FALSE. DISTANCE= 0.0 C equation from cutoffs in the HSSP-plot IF (LFORMULA .OR. LALL) THEN Cbr- <= 11 is too short! IF (LEN.LE.11) THEN IF (.NOT. LFORMULA) LCONSIDER=.TRUE. RETURN ENDIF Cbr-- > 450 saturation at 19.5 IF (LEN.GT.450) THEN Y= 19.5 + ISAFE Cbr-- distance is NOT ALWAYS distance from curve DISTANCE=IDENTITY - 19.5 ELSE Cbr-- exponential function X1=-1*0.32*( 1 + EXP(-1*(REAL(LEN))/1000) ) X2=480 * (LEN**(X1)) C Y=(480 * (LEN**(-0.32* (1+EXP(-1*(LEN/1000))) )) )+ISAFE Y= X2 + ISAFE C distance is always distance from original curve C DISTANCE=IDENTITY - C + (480 * (LEN**(-0.32* (1+EXP(-1*(LEN/1000))) )) ) DISTANCE=IDENTITY - X2 ENDIF IF (IDENTITY .GE. Y) LCONSIDER=.TRUE. IF (.NOT. LFORMULA) LCONSIDER=.TRUE. RETURN ELSE C dont consider alignments less than smallest length in datafile IF (LEN .GE. ISOLEN(1)) THEN DO I=1,NSTEP IRANGE=ISOLEN(I) C if length is longer than longest specified set upper range to LENGTH+1 IF (I.NE.NSTEP) THEN JRANGE=ISOLEN(I+1) ELSE JRANGE=LEN+1 ENDIF IF (LEN .GE. IRANGE .AND. LEN .LT. JRANGE) THEN C if identity .GE. than ISOSIG-data IF (IDENTITY.GE.ISOIDE(I)) THEN LCONSIDER=.TRUE. DISTANCE=IDENTITY-ISOIDE(I) CD WRITE(6,*)len,identity,isolen(i),isoide(i) GOTO 10 ENDIF ENDIF ENDDO ELSE LCONSIDER=.FALSE. ENDIF 10 RETURN ENDIF END C END CHECKHSSPCUT99 C...................................................................... C...................................................................... C SUB CHECKPOSITION SUBROUTINE CHECKPOSITION(PDBNO_1,CHAINID_1,PDBNO_2,CHAINID_2, + CBRKID_1,CBRKID_2,NRES_1,NRES_2,LMATCH) C RS 89 C check if pieces from DSSP-alignment match the position in the C Brookhaven coordinate file C if not this routine tries to find the right position C piece attributes INTEGER MXPIECES PARAMETER (MXPIECES= 50) COMMON /CPIECE/IRESPIE(2,2,MXPIECES),NPIECES,NRESPIE(2), + NATMPIE(2) C ALIGNMENT AND SEQUENCES C BRK-NUMBER FROM DSSP INTEGER PDBNO_1(*),PDBNO_2(*) CHARACTER*(*) CHAINID_1(*),CHAINID_2(*) C BRK-NUMBER FROM BRK CHARACTER*(*) CBRKID_1(*),CBRKID_2(*) C INTERNAL C TRUE IF PIECES ARE THE SAME LOGICAL LMATCH CHARACTER*6 CTEST *----------------------------------------------------------------------* LMATCH=.FALSE. C CHECK PIECES DO IPIECE=1,NPIECES C CHECK PIECE FROM TEST SEQUENCE IB=IRESPIE(1,1,IPIECE) IE=IRESPIE(2,1,IPIECE) C put chain identifier of BRK at first position; in DSSP last position WRITE(CTEST,'(A,I4,A)')CHAINID_1(IB),PDBNO_1(IB),' ' IF (CTEST .NE. CBRKID_1(IB)) THEN WRITE(6,*)' CHECKPOSITION: DSSP/BRK pieces are '// + 'different try to find right positions in piece 1' DO IPOS=-NRES_1,NRES_1 IF (IB+IPOS .GT. 0 .AND. IB+IPOS .LT. NRES_1) THEN IF (CTEST .EQ. CBRKID_1(IB+IPOS)) THEN IRESPIE(1,1,IPIECE)=IB+IPOS IRESPIE(2,1,IPIECE)=IE+IPOS LMATCH=.TRUE. WRITE(6,*)' CHECKPOSITION: right position found ' WRITE(6,*)' IPIECE : ',ipiece WRITE(6,*)' DSSP-piece is: ',ib,ie WRITE(6,*)' BRK-piece is: ',ib+ipos,ie+ipos GOTO 100 ENDIF ENDIF ENDDO ELSE LMATCH=.TRUE. ENDIF 100 CONTINUE IF (.NOT. LMATCH) THEN WRITE(6,*)'CHECKPOSITION : NO MATCH, 3D COMPARISON SKIPPED' RETURN ENDIF c check piece of comparison sequence LMATCH=.FALSE. IB=IRESPIE(1,2,IPIECE) IE=IRESPIE(2,2,IPIECE) WRITE(CTEST,'(A,I4,A)')CHAINID_2(IB),PDBNO_2(IB),' ' IF (CTEST .NE. CBRKID_2(IB)) THEN WRITE(6,*)' CHECKPOSITION: DSSP/BRK pieces are different'// + ' try to find right positions in piece 2' DO IPOS=-NRES_2,NRES_2 IF (IB+IPOS .GT. 0 .AND. IB+IPOS .LT. NRES_2) THEN WRITE(6,*)':',CTEST,':',CBRKID_2(IB+IPOS),':' IF (CTEST .EQ. CBRKID_2(IB+IPOS) ) THEN IRESPIE(1,2,IPIECE)=IB+IPOS IRESPIE(2,2,IPIECE)=IE+IPOS LMATCH=.TRUE. WRITE(6,*)' CHECKPOSITION: right position found ' WRITE(6,*)' IPIECE : ',ipiece WRITE(6,*)' DSSP-piece is: ',ib,ie WRITE(6,*)' BRK-piece is: ',ib+ipos,ie+ipos GOTO 200 ENDIF ENDIF ENDDO ELSE LMATCH=.TRUE. ENDIF IF (.NOT. LMATCH) THEN WRITE(6,*)'CHECKPOSITION : NO MATCH, 3D COMPARISON SKIPPED' RETURN ENDIF 200 CONTINUE ENDDO RETURN END C END CHECKPOSITION C...................................................................... C...................................................................... C SUB CHECKRANGE SUBROUTINE CHECKRANGE(N,NLOWER,NUPPER,VARIABLE,ROUTINE) CHARACTER*(*) ROUTINE, VARIABLE IF (N .LT. NLOWER .OR. N .GT. NUPPER ) THEN WRITE(6,*)'*** fatal error in ',routine WRITE(6,*) ' integer ',variable,' out of range ' WRITE(6,*) ' legal limits are: ',nlower, nupper WRITE(6,*) ' current value is: ',n STOP 'IN CHECKRANGE' ENDIF RETURN END C END CHECKRANGE C...................................................................... C...................................................................... C SUB CHECKINEQUALITY SUBROUTINE CHECKINEQUALITY(N,M,VARIABLE,ROUTINE) CHARACTER*(*) ROUTINE, VARIABLE INTEGER N,M IF (N .EQ. M) THEN WRITE(6,*)'*** fatal error in ',routine WRITE(6,*)variable,' are equal but should be uneq' WRITE(6,*) ' current value is: ',n,m STOP 'IN CHECKINEQUALITY' ENDIF RETURN END C END CHECKINEQUALITY C...................................................................... C...................................................................... C SUB CHECKREALEQUALITY SUBROUTINE CHECKREALEQUALITY(X1,X2,EPSILON,VARIABLE,ROUTINE) CHARACTER*(*) ROUTINE, VARIABLE REAL X1,X2,EPSILON IF (EPSILON .LT. 0.0) THEN WRITE(6,*)' *** negative epsilon in checkrealequality' ENDIF IF (ABS(X1-X2) .GT. EPSILON) THEN WRITE(6,*)'*** fatal error in ',routine WRITE(6,*)' real nums ',variable,' are not eq within',epsilon WRITE(6,*)' values are: ',x1,x2 STOP 'IN CHECKREALEQUALITY' ENDIF RETURN END C END CHECKREALEQUALITY C...................................................................... C...................................................................... C SUB CHECKSEQ SUBROUTINE CHECKSEQ(STRAND,BEGIN,END,CHECK) IMPLICIT NONE C sub version of gcg function CheckSeq 18 C Changes: C - return value now additional parameter "check" C - additional parameters "begin","end" : Strand is now read C from begin to end, no longer from 1 to first occurence of char(0) C IMPORT CHARACTER*(*) STRAND C UG INTEGER BEGIN, END C INTERNAL INTEGER CHECKTMP, COUNT, I INTEGER TABLE(0:255) CHARACTER C C EXPORT INTEGER CHECK DO I = 0, 255 C = CHAR(I) CALL LOWTOUP(C,1) TABLE(I) = ICHAR(C) END DO CHECKTMP = 0 COUNT = 0 DO I = BEGIN, END COUNT = COUNT + 1 CHECKTMP = CHECKTMP + COUNT * TABLE(ICHAR(STRAND(I:I))) IF ( COUNT.EQ.57 ) COUNT = 0 END DO CHECK = MOD(CHECKTMP, 10000) C CHECK = MOD(CHECKTMP, 10000) RETURN END C END CHECKSEQ C...................................................................... C...................................................................... C SUB COMPALISTRUC C COMPARE-PROTEIN-STRUCTURES. C C.SANDER MAY 1983, as CELLO subroutine July 1985. C calcs best overlap of two protein pieces CP pass storage for spliced molecule as argument RRES1SPL RATM1SPL etc CP then remove parameter here - should only exist in GRAFIX-MOLEC:COMM c subroutine compalistruc() SUBROUTINE COMPALISTRUC(FILCOO1,FILCOO2,NRES_1,NRES_2,NATM1, + NATM2,IPATM1RES,IPATM2RES,RRES1, + RRES2,RATM1,RATM2,WSUP1,WSUP2,LCALPHA, + RMS) IMPLICIT NONE INTEGER MXRESMOL,MXATMMOL PARAMETER (MXRESMOL= 600) PARAMETER (MXATMMOL=10*MXRESMOL) c molecule attributes CHARACTER*(*) FILCOO1, FILCOO2 INTEGER NRES_1,NRES_2,NATM1,NATM2 C+++++variables shared with GETCOOR/S3TOS1 - from GET-PROTEIN-LIB C points to first, last and CEN atom INTEGER IPATM1RES(3,*), IPATM2RES(3,*) C center residue coors REAL RRES1(3,*),RRES2(3,*) C atom coors REAL RATM1(3,*), RATM2(3,*) C superposition weights. REAL WSUP1(*), WSUP2(*) LOGICAL LCALPHA C compare 3-d structure piece by piece LOGICAL LPIEBYPIE C result variables C BEST TRANSROT FROM SUPERPOSE REAL TRANS(3), ROT(3,3), RMS C piece attributes INTEGER MXPIECES PARAMETER (MXPIECES= 50) INTEGER IPRESPIE,NPIECES,NRESPIE,NATMPIE COMMON /CPIECE/IPRESPIE(2,2,MXPIECES),NPIECES,NRESPIE(2), + NATMPIE(2) C local atom storage for spliced coordinates REAL RRES1SPL(3,MXRESMOL), RRES2SPL(3,MXRESMOL) REAL RATM1SPL(3,MXATMMOL), RATM2SPL(3,MXATMMOL) C internal INTEGER I,K,IATM,IPIECE,LMOL,NRES,IRESPIE,IATMPIE, + IRES,IRES1,IRES2,IPIE1,IER REAL TOTALLEN,XRMSTOTAL,XRMS C [mol1 mol1] C [mol2 mol2] C C pointers: relative to beginning of each molecule C C molecule 1,NRESMOL residues C 1,NATMMOL atoms C C piece IPRESPIE(2,2,MXPIECES) C (2,2,MXPIECES)=(beg-end,mol1-mol2,IPIECE) C NATMPIE(2) C NRESPIE(2) (2)=(mol1-mol2) C C----------------------------------------------------------------------- WRITE(6,*)' enter COMPARE-STRUCS for molecules: ' WRITE(6,'(a,a,i6,a,i6)')FILCOO1(1:40), + ' NRES=',NRES_1,' NATM= ',NATM1 WRITE(6,'(a,a,i6,a,i6)')FILCOO2(1:40), + ' NRES=',NRES_2,' NATM= ',NATM2 C Set defaults LPIEBYPIE=.FALSE. DO I=1,NATM1 WSUP1(I)=1.0 ENDDO DO I=1,NATM2 WSUP2(I)=1.0 ENDDO GOTO 200 C COMPARE STRUCS 200 CONTINUE C get compare limits WRITE(6,*) WRITE(6,*)' ---------------------------------' WRITE(6,*)' mol A is: ',FILCOO1(1:50) WRITE(6,*)' mol B is: ',FILCOO2(1:50) C reset upper limit if needed DO IPIECE=1,NPIECES DO LMOL=1,2 IF (LMOL.EQ.1) THEN NRES=NRES_1 ENDIF IF (LMOL.EQ.2) THEN NRES=NRES_2 ENDIF IF (IPRESPIE(1,LMOL,IPIECE) .LT. 1) THEN IPRESPIE(1,LMOL,IPIECE)=1 ENDIF IF (IPRESPIE(2,LMOL,IPIECE) .GT. NRES) THEN IPRESPIE(2,LMOL,IPIECE)=NRES ENDIF ENDDO ENDDO C=============================================================== C GET RMS FOR EACH PIECE AND ADD RMSS IF (LPIEBYPIE) THEN WRITE(6,*)' compare structure piece by piece ' RMS=0.0 TOTALLEN=0.0 XRMSTOTAL=0.0 DO IPIECE=1,NPIECES XRMS=0.0 DO LMOL=1,2 IRESPIE=0 IATMPIE=0 IRES1=IPRESPIE(1,LMOL,IPIECE) IRES2=IPRESPIE(2,LMOL,IPIECE) DO IRES=IRES1,IRES2 IRESPIE=IRESPIE+1 IF (LMOL.EQ.1) THEN DO K=1,3 RRES1SPL(K,IRESPIE)=RRES1(K,IRES) ENDDO C first atom of residue to last atom of residue IRES DO IATM=IPATM1RES(1,IRES),IPATM1RES(2,IRES) IATMPIE=IATMPIE+1 IF (IATMPIE .GT. MXATMMOL) THEN WRITE(6,*)' MXATMMOL overflow ' STOP ENDIF DO K=1,3 RATM1SPL(K,IATMPIE)=RATM1(K,IATM) ENDDO ENDDO ENDIF IF (LMOL.EQ.2) THEN DO K=1,3 RRES2SPL(K,IRESPIE)=RRES2(K,IRES) ENDDO C first atom of residue to last atom of residue IRES DO IATM=IPATM2RES(1,IRES),IPATM2RES(2,IRES) IATMPIE=IATMPIE+1 IF (IATMPIE .GT. MXATMMOL) THEN WRITE(6,*)' MXATMMOL overflow ' STOP ENDIF DO K=1,3 RATM2SPL(K,IATMPIE)=RATM2(K,IATM) ENDDO ENDDO ENDIF C FOR IRES=IRES1,IRES2 ENDDO NRESPIE(LMOL)=IRESPIE NATMPIE(LMOL)=IATMPIE C FOR LMOL=1,2 ENDDO WRITE(6,*) ' IPIECE : ',IPIECE WRITE(6,*)' MOL1: from ',IPRESPIE(1,1,IPIECE),' to ', + IPRESPIE(2,1,IPIECE) WRITE(6,*)' MOL2: from ',IPRESPIE(1,2,IPIECE),' to ', + IPRESPIE(2,2,IPIECE) C superpose using U3B of Wolfgang Kabsch IPIE1=1 C first atom and number of residues of piece 1 and 2 WRITE(6,*)' # of residues ' WRITE(6,'(2I10)') ( NRESPIE(K),K=1,2 ) WRITE(6,*)'----------------------------' WRITE(6,*)' CALL U3B' CALL U3B(WSUP2,RRES1SPL(1,1),RRES2SPL(1,1),NRESPIE(IPIE1), + 0,XRMS,ROT,TRANS,IER) cx XN=FLOAT(NRESPIE(IPIE1)) CX XRMS=SQRT(XRMS/XN) IS NOW IN U3B WRITE(6,'('' RMS '',F18.7)') XRMS TOTALLEN=TOTALLEN+NRESPIE(IPIE1) XRMSTOTAL=XRMSTOTAL+NRESPIE(IPIE1)*XRMS C FOR IPIECE=1,NPIECES ENDDO RMS=XRMSTOTAL/TOTALLEN WRITE(6,*)' TOTAL RMS ',RMS C C end block: splice-coors (piece by piece) C================================================================== ELSE WRITE(6,*)' compare structures: splice-coors' C...block: splice-coors DO LMOL=1,2 IRESPIE=0 IATMPIE=0 DO IPIECE=1,NPIECES IRES1=IPRESPIE(1,LMOL,IPIECE) IRES2=IPRESPIE(2,LMOL,IPIECE) DO IRES=IRES1,IRES2 IRESPIE=IRESPIE+1 IF (LMOL.EQ.1) THEN DO K=1,3 RRES1SPL(K,IRESPIE)=RRES1(K,IRES) ENDDO C....first atom of residue to last atom of residue IRES DO IATM=IPATM1RES(1,IRES),IPATM1RES(2,IRES) IATMPIE=IATMPIE+1 IF (IATMPIE .GT. MXATMMOL) THEN WRITE(6,*)' MXATMMOL overflow ' STOP ENDIF DO K=1,3 RATM1SPL(K,IATMPIE)=RATM1(K,IATM) ENDDO ENDDO ENDIF IF (LMOL.EQ.2) THEN DO K=1,3 RRES2SPL(K,IRESPIE)=RRES2(K,IRES) ENDDO C.... first atom of residue to last atom of residue IRES DO IATM=IPATM2RES(1,IRES),IPATM2RES(2,IRES) IATMPIE=IATMPIE+1 IF (IATMPIE .GT. MXATMMOL) THEN WRITE(6,*)' MXATMMOL overflow ' STOP ENDIF DO K=1,3 RATM2SPL(K,IATMPIE)=RATM2(K,IATM) ENDDO ENDDO ENDIF C FOR IRES=IRES1,IRES2 ENDDO C FOR IPIECE=1,NPIECES ENDDO NRESPIE(LMOL)=IRESPIE NATMPIE(LMOL)=IATMPIE C FOR LMOL=1,2 ENDDO C C end block: splice-coors C CALL REPORTPIECES RMS=0.0 C superpose using U3B of Wolfgang Kabsch IPIE1=1 C first atom and number of residues of piece 1 and 2 IF (LCALPHA) THEN WRITE(6,*)' # of residues ' WRITE(6,'(2I10)') ( NRESPIE(K),K=1,2 ) WRITE(6,*)'----------------------------' WRITE(6,*)' CALL U3B' CALL U3B(WSUP2,RRES1SPL(1,1),RRES2SPL(1,1),NRESPIE(IPIE1), + 0,RMS,ROT,TRANS,IER) ELSE WRITE(6,*)' # of atoms ' WRITE(6,'(2I10)') ( NATMPIE(K),K=1,2 ) WRITE(6,*)'----------------------------' CALL U3B(WSUP2,RATM1SPL(1,1),RATM2SPL(1,1),NATMPIE(IPIE1), + 0,RMS,ROT,TRANS,IER) ENDIF WRITE(6,'('' RMS '',F18.7)') RMS C LPIEBYPIE ENDIF WRITE(6,*) RETURN END C END COMPALISTRUC C...................................................................... C...................................................................... C SUB CONCAT_STRINGS SUBROUTINE CONCAT_STRINGS(STRING1,STRING2,RESULT) C concatenate "string1" and "string2" into "result" CHARACTER*(*) STRING1,STRING2,RESULT INTEGER IBEG,IEND,JBEG,JEND,ILEN RESULT=' ' CALL STRPOS(STRING1,IBEG,IEND) CALL STRPOS(STRING2,JBEG,JEND) ILEN= (IEND-IBEG+1) + (JEND-JBEG+1) IF (ILEN .GT. LEN(RESULT) ) THEN ILEN=LEN(RESULT) WRITE(6,*)' WARNING: in concat_strings: length overflow' WRITE(6,*)' cut string at: ',ilen ENDIF RESULT(1:ILEN)=STRING1(IBEG:IEND)//STRING2(JBEG:JEND) RETURN END C END CONCAT_STRINGS C...................................................................... C...................................................................... C SUB CONCAT_3STRINGS SUBROUTINE CONCAT_3STRINGS(STRING1,STRING2,STRING3,RESULT) C concatenate "string1" and "string2" and "string3" into "result" CHARACTER*(*) STRING1,STRING2,STRING3,RESULT INTEGER IBEG,IEND,JBEG,JEND,KBEG,KEND,ILEN RESULT=' ' CALL STRPOS(STRING1,IBEG,IEND) CALL STRPOS(STRING2,JBEG,JEND) CALL STRPOS(STRING3,KBEG,KEND) ILEN= (IEND-IBEG+1) + (JEND-JBEG+1) + (KEND-KBEG+1) IF (ILEN .GT. LEN(RESULT) ) THEN ILEN=LEN(RESULT) WRITE(6,*)' WARNING: IN CONCAT_STRINGS: LENGTH OVERFLOW' WRITE(6,*)' cut string at: ',ilen ENDIF RESULT(1:ILEN)=STRING1(IBEG:IEND)//STRING2(JBEG:JEND)// + STRING3(KBEG:KEND) RETURN END C END CONCAT_3STRINGS C...................................................................... C...................................................................... C SUB CONCAT_INT_STRING SUBROUTINE CONCAT_INT_STRING(INUMBER,STRING,RESULT) C concatenate "inumber" and "string2" into "result" C import/export CHARACTER*(*) STRING,RESULT INTEGER INUMBER C internal CHARACTER TEMP*64,CFORMAT*100 INTEGER IBEG,IEND,JBEG,JEND,ILEN,ILOG C init TEMP=' ' RESULT=' ' ILOG=1 C get size of number C CAUTION can produce wrong results with very high opt-levels c xnumber=float( inumber ) c if (xnumber .gt. 0.0) then c ilog = nint( log10(xnumber) + 0.5 ) c else if (xnumber .lt. 0.0) then c ilog = nint( log10( abs(xnumber) ) + 1.5 ) c endif IF (INUMBER .GT. 0) THEN IF (INUMBER .LT. 10) THEN ILOG=1 ELSE IF (INUMBER .LT. 100) THEN ILOG=2 ELSE IF (INUMBER .LT. 1000) THEN ILOG=3 ELSE IF (INUMBER .LT. 10000) THEN ILOG=4 ELSE IF (INUMBER .LT. 100000) THEN ILOG=5 ELSE IF (INUMBER .LT. 1000000) THEN ILOG=6 ELSE IF (INUMBER .LT. 10000000) THEN ILOG=7 C too big for INT4 ? c else if (inumber .lt. 100000000) then c ilog=8 ELSE WRITE(6,*)' ERROR in CONCAT_INT_STRING: update plus' CALL FLUSH_UNIT(6) ENDIF ELSE IF (INUMBER .LT. 0) THEN IF (INUMBER .GT. -10) THEN ILOG=2 ELSE IF (INUMBER .GT. -100) THEN ILOG=3 ELSE IF (INUMBER .GT. -1000) THEN ILOG=4 ELSE IF (INUMBER .GT. -10000) THEN ILOG=5 ELSE IF (INUMBER .GT. -100000) THEN ILOG=6 ELSE IF (INUMBER .GT. -1000000) THEN ILOG=7 c else if (inumber .gt. -10000000) then c ilog=8 c else if (inumber .gt. -100000000) then c ilog=9 ELSE WRITE(6,*)' ERROR in CONCAT_INT_STRING: update minus' CALL FLUSH_UNIT(6) ENDIF ENDIF CALL CONCAT_STRING_INT('(I',ILOG,TEMP) CALL CONCAT_STRINGS(TEMP,')',CFORMAT) TEMP=' ' WRITE(TEMP(1:),CFORMAT)INUMBER CALL STRPOS(TEMP,IBEG,IEND) CALL STRPOS(STRING,JBEG,JEND) IEND=IBEG+ILOG-1 ILEN= (IEND-IBEG+1) + (JEND-JBEG+1) IF (ILEN .GT. LEN(RESULT) ) THEN ILEN=LEN(RESULT) WRITE(6,*)' WARNING: in concat_int_string: length overflow' WRITE(6,*)' cut string at: ',ilen ENDIF RESULT(1:ILEN)=TEMP(IBEG:IEND)//STRING(JBEG:JEND) RETURN END C END CONCAT_INT_STRING C...................................................................... C...................................................................... C SUB CONCAT_STRING_INT SUBROUTINE CONCAT_STRING_INT(STRING,INUMBER,RESULT) C concatenate "inumber" and "string2" into "result" C import/export CHARACTER*(*) STRING,RESULT INTEGER INUMBER C internal CHARACTER TEMP*64,CFORMAT*100 INTEGER IBEG,IEND,JBEG,JEND,ILEN,ILOG C init TEMP=' ' RESULT=' ' ILOG=1 C get size of number c with some agressive optimizations, this can go wrong c xnumber=float( inumber ) c if (xnumber .gt. 0.0) then c ilog = nint( log10(xnumber) ) + 1 c else if (xnumber .lt. 0.0) then c ilog = nint( log10( abs(xnumber) ) ) + 2 c endif IF (INUMBER .GT. 0) THEN IF (INUMBER .LT. 10) THEN ILOG=1 ELSE IF (INUMBER .LT. 100) THEN ILOG=2 ELSE IF (INUMBER .LT. 1000) THEN ILOG=3 ELSE IF (INUMBER .LT. 10000) THEN ILOG=4 ELSE IF (INUMBER .LT. 100000) THEN ILOG=5 ELSE IF (INUMBER .LT. 1000000) THEN ILOG=6 ELSE IF (INUMBER .LT. 10000000) THEN ILOG=7 C too big for INT4 ? c else if (inumber .lt. 100000000) then c ilog=8 c else if (inumber .lt. 1000000000) then c ilog=9 c else if (inumber .lt. 10000000000) then c ilog=10 ELSE WRITE(6,*)' ERROR in CONCAT_STRING_INT: update plus' CALL FLUSH_UNIT(6) ENDIF ELSE IF (INUMBER .LT. 0) THEN IF (INUMBER .GT. -10) THEN ILOG=2 ELSE IF (INUMBER .GT. -100) THEN ILOG=3 ELSE IF (INUMBER .GT. -1000) THEN ILOG=4 ELSE IF (INUMBER .GT. -10000) THEN ILOG=5 ELSE IF (INUMBER .GT. -100000) THEN ILOG=6 ELSE IF (INUMBER .GT. -1000000) THEN ILOG=7 c else if (inumber .gt. -10000000) then c ilog=8 c else if (inumber .gt. -100000000) then c ilog=9 c else if (inumber .gt. -1000000000) then c ilog=10 ELSE WRITE(6,*)' ERROR in CONCAT_STRING_INT: update minus' CALL FLUSH_UNIT(6) ENDIF ENDIF CALL MAKE_FORMAT_INT(ILOG,CFORMAT) WRITE(TEMP(1:),CFORMAT)INUMBER CALL STRPOS(TEMP,IBEG,IEND) CALL STRPOS(STRING,JBEG,JEND) IEND=IBEG+ILOG-1 ILEN= (IEND-IBEG+1) + (JEND-JBEG+1) IF (ILEN .GT. LEN(RESULT) ) THEN ILEN=LEN(RESULT) WRITE(6,*)' WARNING: in concat_int_string: length overflow' WRITE(6,*)' cut string at: ',ilen ENDIF RESULT(1:ILEN)=STRING(JBEG:JEND)//TEMP(IBEG:IEND) RETURN END C END CONCAT_STRING_INT C...................................................................... C...................................................................... C SUB DAMP_GAPWEIGHT SUBROUTINE DAMP_GAPWEIGHT(IBEG,IEND,VALUE,NDAMP,PUNISH) C damp the gap-open weights by taking the mean of the range +- ndamp C CAUTION set "punish" high enough C NOT true anymore: if indels in sec-struc are not allowed these C positions are not taken into account (punish) IMPLICIT NONE INCLUDE 'maxhom.param' REAL PUNISH C INPUT REAL VALUE(*) INTEGER IBEG,IEND,NDAMP,NPOS C INTERNAL INTEGER I,J REAL SUM DO I=IBEG,IEND SUM=0.0 NPOS=0 DO J=MAX(I-NDAMP,IBEG),MIN(I+NDAMP,IEND) SUM=SUM + VALUE(J) NPOS=NPOS+1 ENDDO VALUE(I)= SUM / FLOAT(NPOS) ENDDO RETURN END C END DAMP_GAPWEIGHT C...................................................................... C...................................................................... C SUB DO_ALIGN SUBROUTINE DO_ALIGN(LH1,LH2,ISET,IALIGN,NRECORD,SDEV) IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C C import implicit C LPASS2=(from maxhom) true if protein IALIGN to take for 2nd pass C C import C ISET= (from maxhom) number of processor (=0 if not parallel) C IALIGN=(from maxhom) number of proteins aligned before, i.e. C current protein is (IALIGN+1)! C INTEGER ISET,IALIGN,NRECORD REAL SDEV C internal REAL LH1(0:MAXMAT) INTEGER*2 LH2(0:MAXTRACE) C REAL LH(0:MAXMAT*2) LOGICAL LERROR INTEGER I,IBEG,IEND,ND1,ND2,NDMAT,N2,N2NEW,N2REST INTEGER NTEST,BESTIIPOS,BESTJJPOS,NREGION,IBREAK,JBREAK INTEGER IPOSBEG,IPOSEND,JPOSBEG,JPOSEND REAL BESTVAL CHARACTER CSYMBOL LOGICAL LDBG_LOCAL C---- ------------------------------------------------------------------ C INIT C---- ------------------------------------------------------------------ CSYMBOL= CHAR(0); LTRACEOUT= .FALSE. C BR 99.09: just to write out dbg LDBG_LOCAL=.FALSE. C LDBG_LOCAL=.TRUE. IF (LDSSP_2) THEN CALL LOWER_TO_CYS(CSQ_2,N2IN) ENDIF CALL SEQ_TO_INTEGER(CSQ_2,LSQ_2,N2IN,TRANSPOS) C get position of chain breaks CALL GETCHAINBREAKS(N2IN,LSQ_2,STRUC_2,TRANS,NBREAK_2,IBREAKPOS_2) IF (LDSSP_2) THEN CALL STR_TO_INT(N2IN,STRUC_2,LSTRUC_2,STRTRANS ) CALL STR_TO_CLASS(MAXSTRSTATES,STR_CLASSES,N2IN,STRUC_2, + STRCLASS_2,LSTRCLASS_2) CALL ACC_TO_INT(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_2,NIOSTATES_2,IORANGE,N2IN, + LSQ_2,LSTRCLASS_2,NSURF_2,LACC_2) C not DSSP ELSE I=INDEX(STRTRANS,'U') CALL INIT_INT_ARRAY(1,N2IN,LSTRUC_2,I) DO I=1,MAXSTRSTATES IF ( INDEX(STR_CLASSES(I),STRUC_2(1)) .NE. 0) THEN CALL INIT_INT_ARRAY(1,N2IN,LSTRCLASS_2,I) CSYMBOL=STR_CLASSES(I)(1:1) ENDIF ENDDO DO I=1,N2IN STRCLASS_2(I:I)=CSYMBOL ENDDO CALL INIT_INT_ARRAY(1,N2IN,LACC_2,1) ENDIF C set gap-open to a high value in SECONDARY STRUCTURE SEGMENTS IF (.NOT. LINSERT_2 .AND. LDSSP_2) THEN CALL PUNISH_GAP(N2IN,STRUC_2,'HE',PUNISH,GAPOPEN_2 ) ENDIF IEND= 0 LSHIFTED=.FALSE. N2= N2IN N2REST= N2IN NSHIFTED=0 C ATTEMPT TO USE N2 FOR ALIGNMENT C RESET N2 TO A VALUE SMALLER THAN N2REST IF NEEDED C SET ND1 AND ND2, THE MATRIX DIMENSION TO BE USED CAUTION LH(O:ND1,0:ND2) 350 ND1= N1+1 ND2= N2+1 NDMAT= (1+ND1)*(1+ND2) LSHIFTED=(NDMAT.GT.MAXTRACE) IF (LSHIFTED) THEN ND2= (INT(MAXTRACE/(ND1+1)) )-1 N2=ND2-1 CALL OPEN_FILE(KWARN,WARNFILE,'UNKNOWN,APPEND',LERROR) CALL STRPOS(NAME_2,IBEG,IEND) WRITE(LOGSTRING(1:),'(A,I10,I10,I10,A,I8,A,A)') + ' *** WARN: MAXTRACE or MAXMAT OVERFLOW: ', + MAXTRACE,MAXMAT,NDMAT, + ' TRUNCATED TO:',N2,' FOR: ',name_2(ibeg:iend) CALL LOG_FILE(KLOG,LOGSTRING,1) CALL LOG_FILE(KWARN,LOGSTRING,0) NDMAT=(1+ND1)*(1+ND2) CALL CLOSE_FILE(KWARN,WARNFILE) ENDIF C======================================================================= C TRACE-FILE C HEADERS TO PLOT FILE..(after the second run ) C======================================================================= LTRACEOUT=.FALSE. IF (LTRACE .AND. .NOT. LPASS2) THEN LTRACEOUT=.TRUE. ENDIF C======================================================================= C THE MEAT C======================================================================= NTEST=0 BESTVAL=1000000.0 C======================================================================= C the NBEST alignments are selected via TRACE C======================================================================= NREGION=(NBREAK_1+1) * (NBREAK_2+1) DO WHILE (NTEST .LT. NREGION*NBEST .AND. BESTVAL.GT.0.0) DO IBREAK=1,NBREAK_1+1 IF (IBREAK .GT. NBREAK_1) THEN IPOSEND=N1 ELSE IPOSEND=IBREAKPOS_1(IBREAK)-1 ENDIF IF (IBREAK .EQ. 1) THEN IPOSBEG=1 ELSE IPOSBEG=IBREAKPOS_1(IBREAK-1)+1 ENDIF DO JBREAK=1,NBREAK_2+1 IF (JBREAK .GT. NBREAK_2) THEN JPOSEND=N2 ELSE JPOSEND=IBREAKPOS_2(JBREAK)-1 ENDIF IF (JBREAK .EQ. 1) THEN JPOSBEG=1 ELSE JPOSBEG=IBREAKPOS_2(JBREAK-1)+1 ENDIF C check if the 2 sequences are identical LSAMESEQ=.FALSE. IF (.NOT. LSHOW_SAMESEQ) THEN IF (IPOSEND-IPOSBEG .EQ. JPOSEND-JPOSBEG) THEN LSAMESEQ=.TRUE. I=1 DO WHILE (I .LT. (IPOSEND-IPOSBEG+1) + .AND. LSAMESEQ) IF (CSQ_1(I:I) .NE. CSQ_2(I:I) ) THEN LSAMESEQ=.FALSE. ENDIF I=I+1 ENDDO IF (LSAMESEQ) WRITE(6,*)' identical sequences ' ENDIF else ENDIF c default trace is diagonal IF (LBACKWARD) THEN DO I=0,NDMAT LH2(I)=1 ENDDO c do i=ndmat,ndmat*2 ; lh(i)=20000.0 ; enddo cwrong call init_real_array(ndmat,ndmat*2,lh,20000.0) CALL SETMATRIX(IPOSBEG,IPOSEND,JPOSBEG, + JPOSEND,N2,LH1,LH2) CALL GETBEST(IPOSBEG+1,IPOSEND+1,JPOSBEG+1, + JPOSEND+1,1,NTEST,LH1,LH2,ND1,ND2, + BESTVAL,BESTIIPOS,BESTJJPOS) WRITE(6,*)BESTVAL,BESTIIPOS,BESTJJPOS SUBOPT_VAL=BESTVAL-((FILTER_VAL*BESTVAL)/100.0) CALL SETBACK(IPOSBEG,IPOSEND,JPOSBEG, + JPOSEND,N2,LH1,LH2,BESTVAL) ELSE CALL INIT_INT2_ARRAY(0,NDMAT,LH2,1) CALL SETMATRIX_FAST(IPOSBEG,IPOSEND,JPOSBEG, + JPOSEND,N2,LH2,BESTVAL,BESTIIPOS, + BESTJJPOS) ENDIF C NOTE: TRACE will aplpy threshold, and return LCONSIDER=.FALSE. C if below threshold! IF (BESTVAL.GT.0.0) THEN CALL TRACE(ISET,ND1,ND2,LH2,IPOSBEG,JPOSBEG, + BESTVAL,BESTIIPOS,BESTJJPOS,NTEST,SDEV, + IALIGN,NRECORD) ENDIF ENDDO ENDDO ENDDO C======================================================================= IF (.NOT. LPASS2 .AND. LTRACE) THEN LTRACE=.FALSE. LTRACEOUT=.FALSE. CLOSE(KPLOT) ENDIF C======================================================================= C ENTRY FOR SHIFTED REPEAT OF TOO LONG SEQUENCE C N2 was used in previous alignment IF (LSHIFTED) THEN IEND=N2-1 IF (IEND.EQ.0) THEN STOP' MAXMAT, MAXTRACE OR MAXSQ TOO SMALL, IEND=0' ENDIF DO I=1,N2REST-IEND CSQ_2(I:I)=CSQ_2(I+IEND:I+IEND) STRUC_2(I)=STRUC_2(I+IEND) LSQ_2(I)=LSQ_2(I+IEND) NSURF_2(I)=NSURF_2(I+IEND) ENDDO DO I=N2REST-IEND+1,N2REST CSQ_2(I:I)=' ' STRUC_2(I)=' ' LSQ_2(I)=0 NSURF_2(I)=0 ENDDO N2NEW=N2REST-IEND C NEW LENGTH TO USE IS N2NEW N2REST=N2NEW NSHIFTED=NSHIFTED+IEND c WRITE(6,'(a,i6)')'>>REPEAT PASS, TOTAL SHIFT:',nshifted N2=N2REST GOTO 350 ENDIF C======================================================================= C calculate conservation weights C then next sequence in file list or global sort C======================================================================= IF ( LALIOVERFLOW .EQV. .FALSE.) THEN IF (LPASS2 .EQV. .TRUE. .AND. + LCONSERV_1 .EQV. .TRUE. .AND. + LCONSIMPORT .EQV. .FALSE. .AND. + IALIGN .GT. 0) THEN C WRITE(6,*)' CALL GETCONSWEIGHT i=',IALIGN CALL GETCONSWEIGHT(N1,IALIGN,LSQ_1) ENDIF IALIGNOLD=IALIGN ENDIF C======================================================================= C debug C======================================================================= C IF (LDBG_LOCAL) THEN C DO I=1,N1 C WRITE(6,'(I,F7.2)')I,SMIN C ENDDO C DO I=1,N1 C WRITE(6,'(I,F7.2)')I,SMAX C ENDDO C DO I=1,N1 C WRITE(6,'(I,F7.2)')I,OPEN_1 C ENDDO C DO I=1,N1 C WRITE(6,'(I,F7.2)')I,SMIN*CONSWEIGHT_1(I) C ENDDO C DO I=1,N1 C WRITE(6,'(I,F7.2)')I,SMAX*CONSWEIGHT_1(I) C ENDDO C DO I=1,N1 C WRITE(6,'(I,F7.2)')I,OPEN_GAP_1(I) C ENDDO C END IF C end dbg C======================================================================= RETURN END C END DO_ALIGN C...................................................................... C...................................................................... C SUB EXTRACT_INTEGER SUBROUTINE EXTRACT_INTEGER(LINE,CDIVIDE,KEYWORD,INTVAL) C extract an integer from a line beginning with a keyword ; cdivide C indicates the border between keyword and value for keyword C like: THIS_IS_A_KEYWORD : this_is_the_value_for_keyword IMPLICIT NONE C import CHARACTER*(*) LINE,KEYWORD,CDIVIDE c export INTEGER INTVAL c internal INTEGER LENKEY,I,J,IBEG c====================================================================== CALL STRPOS(KEYWORD,I,J) LENKEY=J-I+1 IF ( LINE(1:LENKEY) .EQ. KEYWORD(I:J) ) THEN CALL STRPOS(LINE,I,J) IBEG=INDEX(LINE,CDIVIDE) IF (IBEG .EQ. 0) THEN WRITE(6,'(A,A,A)') + 'ERROR IN EXTRACT_INTEGER: no ',cdivide,'in line' STOP ENDIF CALL STRPOS(LINE(IBEG+1:J),I,J) CALL READ_INT_FROM_STRING(LINE(IBEG+I:IBEG+J),INTVAL) c WRITE(6,'(A,A,I6)')line(1:lenkey),' is: ',intval ENDIF RETURN END C END EXTRACT_INTEGER C...................................................................... C...................................................................... C SUB EXTRACT_REAL SUBROUTINE EXTRACT_REAL(LINE,CDIVIDE,KEYWORD,REALVAL) C extract an integer from a line beginning with a keyword ; cdivide C indicates the border between keyword and value for keyword C like: THIS_IS_A_KEYWORD : this_is_the_value_for_keyword IMPLICIT NONE C import CHARACTER*(*) LINE,KEYWORD,CDIVIDE c export REAL REALVAL c internal INTEGER LENKEY,I,J,IBEG c====================================================================== CALL STRPOS(KEYWORD,I,J) LENKEY=J-I+1 IF ( LINE(1:LENKEY) .EQ. KEYWORD(I:J) ) THEN CALL STRPOS(LINE,I,J) IBEG=INDEX(LINE,CDIVIDE) IF (IBEG .EQ. 0) THEN WRITE(6,'(A,A,A)') + 'ERROR IN EXTRACT_REAL: no ',cdivide,'in line' STOP ENDIF CALL STRPOS(LINE(IBEG+1:J),I,J) CALL READ_REAL_FROM_STRING(LINE(IBEG+I:IBEG+J),REALVAL) c WRITE(6,'(A,A,F7.2)')line(1:lenkey),' is: ',realval ENDIF RETURN END C END EXTRACT_REAL C...................................................................... C...................................................................... C SUB EXTRACT_INTEGER_RANGE SUBROUTINE EXTRACT_INTEGER_RANGE(LINE,CDIVIDE1,CDIVIDE2,INTVAL) C extract two integers from a line ; C cdivide1 indicates the border between keyword and values for keyword C cdivide2 seperetes the two values C like: THIS_IS_A_KEYWORD : first_value_for_keyword - second_value IMPLICIT NONE C import CHARACTER*(*) LINE,CDIVIDE1,CDIVIDE2 c export INTEGER INTVAL(1,2) c internal INTEGER I,J,IBEG1,IBEG2 c====================================================================== IBEG1=INDEX(LINE,CDIVIDE1) IBEG2=INDEX(LINE,CDIVIDE2) IF (IBEG1.EQ.0 .OR. IBEG2 .EQ. 0) THEN WRITE(6,'(A,A,A,A)') + 'ERROR IN EXTRACT_INTEGER_RANGE: no ',cdivide1,' or ', + cdivide2 STOP ENDIF CALL STRPOS(LINE(IBEG1+1:IBEG2-1),I,J) CALL READ_INT_FROM_STRING(LINE(IBEG1+I:IBEG1+J),INTVAL(1,1) ) CALL STRPOS(LINE(IBEG2+1:),I,J) CALL READ_INT_FROM_STRING(LINE(IBEG2+I:IBEG2+J),INTVAL(1,2) ) RETURN END C END EXTRACT_INTEGER_RANGE C...................................................................... C...................................................................... C SUB EXTRACT_STRING SUBROUTINE EXTRACT_STRING(LINE,CDIVIDE,KEYWORD,STRING) C extract a string from a line beginning with a keyword ; cdivide C indicates the border between keyword and value for keyword C like: THIS_IS_A_KEYWORD : this_is_the_string_for_keyword IMPLICIT NONE C import CHARACTER*(*) LINE,KEYWORD,CDIVIDE C export CHARACTER*(*) STRING C internal INTEGER LENKEY,I,J,IBEG C====================================================================== CALL STRPOS(KEYWORD,I,J) LENKEY=J-I+1 IF ( LINE(1:LENKEY) .EQ. KEYWORD(I:J) ) THEN CALL STRPOS(LINE,I,J) IBEG=INDEX(LINE,CDIVIDE) IF (IBEG.EQ.0) THEN WRITE(6,'(A,A,A)') + 'ERROR IN EXTRACT_STRING: no ',CDIVIDE,'in line' STOP ENDIF IF (J .GT. IBEG+1) THEN CALL STRPOS(LINE(IBEG+1:J),I,J) STRING=LINE(IBEG+I:IBEG+J) ELSE STRING=' ' ENDIF c WRITE(6,*)LINE(1:LENKEY)//' is: '//LINE(IBEG+I:IBEG+J) ENDIF RETURN END C END EXTRACT_STRING C...................................................................... C...................................................................... C SUB EVALPRED SUBROUTINE EVALPRED(PROTEIN,METHOD,PRED,STRUC,NRES,LDSSP, + KOUT,KSTA) C EXTERNAL LOGICAL LDSSP CHARACTER*1 STRUC(*),PRED(*) CHARACTER*(*) METHOD, PROTEIN INTEGER NRES, KOUT, KSTA C files KOUT and KSTA must be open for write C INTERNAL PARAMETER (MSTATES= 3) C *10 ALIASES CHARACTER*10 STATES(MSTATES) C (PREDICTED,OBSERVED) sub=0 means undefined symbol. DIMENSION NC(0:MSTATES,0:MSTATES),NCOBS(0:MSTATES) DIMENSION NCPRE(0:MSTATES),MPERPRE(MSTATES),MPEROBS(MSTATES) CAUTION - ANY CHANGE IN THE ORDER OF C STATES MUST BE MADE IN PRED-STAT AS WELL C SHEET LOOP HELIX DATA STATES/'EBAPMebapm','TCLS tcls ','HGI..hgi..'/ *----------------------------------------------------------------------* C PROCEDURE DO NP=0,MSTATES DO NS=0,MSTATES NC(NP,NS)=0 ENDDO ENDDO NUNPRED=0 NPRED=0 DO I=1,NRES C FIND STRUCTURE INDEX NP=0 NS=0 DO LS=1,MSTATES IF (INDEX(STATES(LS), PRED(I)) .NE. 0) NP=LS IF (INDEX(STATES(LS), STRUC(I)) .NE. 0) NS=LS ENDDO C OBS only via DSSP IF (LDSSP) THEN IF (NS .EQ. 0) THEN WRITE(6,*)'UNKNOWN DSSP STATE AT RES',I, struc(i) c STOP'*** error in EVALPRED ' ENDIF ELSE NS=0 ENDIF C INCREMENT COUNTER NC(NP,NS)=NC(NP,NS)+1 IF (NP .NE. 0) THEN NPRED=NPRED+1 ELSE NUNPRED=NUNPRED+1 ENDIF ENDDO C (I,J) = (PREDICTED,OBSERVED) C SUCCESS RATES: NCII=SUM(OVER I.NE.0) NC(I,I) C NCOBS(J)=SUM(OVER I=1..3) NC(I,J) C of those predicted C NCPRE(I)=SUM(OVER J=0..3) NC(I,J) of all C PREDICTED RES : NPRED=SUM(OVER I=1..3) NCPRE(I) C UNPREDICTED NUNPRED=NCPRE(0) NCII=0 DO I=0,MSTATES NCOBS(I)=0 NCPRE(I)=0 DO J=0,MSTATES IF (I .EQ. J .AND. I .NE. 0) NCII=NCII+NC(I,J) C not the unpredicted IF (J .NE. 0) NCOBS(I)=NCOBS(I)+NC(J,I) C all (not) observed NCPRE(I)=NCPRE(I)+NC(I,J) ENDDO ENDDO IF (NRES.NE.0) THEN PERPRED=NINT(100.*NPRED/FLOAT(NRES)) ELSE PERPRED=0.0 WRITE(6,*)'***EVALPRED: NRES=0' ENDIF C check for consistency IF (NUNPRED .NE. NCPRE(0)) THEN WRITE(6,*) NUNPRED,NCPRE(0) STOP '*** EVALPRED: NUNPRED.NE.NCPRE(0), you idiot ' ENDIF IF (NPRED.NE.NRES-NUNPRED) THEN WRITE(6,*) NPRED, NUNPRED, NRES WRITE(6,*) + '*** EVALPRED ERROR: NPRED,NUNPRED,NRES do not add up' ENDIF C print IF (LDSSP) THEN IF (NPRED.NE.0) THEN CORRECT=NCII/FLOAT(NPRED)*100 ELSE CORRECT=0.0 WRITE(6,*)'***EVALPRED: NPRED=0' ENDIF IF (KOUT.NE.0) THEN WRITE(KOUT,110) PROTEIN,METHOD,NRES,NPRED,PERPRED,CORRECT ENDIF WRITE( *,110) PROTEIN,METHOD,NRES,NPRED,PERPRED,CORRECT ELSE CORRECT=0.0 IF (KOUT.NE.0) THEN WRITE(KOUT,110) PROTEIN,METHOD,NRES,NPRED,PERPRED ENDIF WRITE( *,110) PROTEIN,METHOD,NRES,NPRED,PERPRED 110 FORMAT(1X,A4,1X,A10,I5,' residues',I5,' predicted.',/, + ' Result: ',F5.1,'% predicted',F7.1,'% correct') C LDSSP ENDIF C percentage in the universe of predicted (NPRED.LE.NRES) DO I=1,MSTATES IF (NPRED.NE.0) THEN MPERPRE(I)=NINT(NCPRE(I)/FLOAT(NPRED)*100.0) ELSE MPERPRE(I)=0 ENDIF IF (NPRED.NE.0) THEN MPEROBS(I)=NINT(NCOBS(I)/FLOAT(NPRED)*100.0) ELSE MPEROBS(I)=0 ENDIF ENDDO C IF (KOUT.NE.0) THEN WRITE(KOUT,113) 113 FORMAT(40X,'P R E D I C T E D ') WRITE(KOUT,114) (STATES(J),J=1,MSTATES),'total',' %' 114 FORMAT(40X,10(1X,A5)) IF (LDSSP) THEN DO J=1,MSTATES WRITE(KOUT,112)'OBS',STATES(J),(NC(I,J),I=1,MSTATES), + NCOBS(J),MPEROBS(J) ENDDO ENDIF C DSSP or no DSSP: WRITE(KOUT,112)' ',' ',(NCPRE(I), I=1,MSTATES) 112 FORMAT(1X,30X,A3,1X,A5,10I6) WRITE(KOUT,112)' ','!',(MPERPRE(I), I=1,MSTATES) ENDIF C output for prediction statistics WRITE(KSTA,111) PROTEIN,METHOD,NRES,NPRED,CORRECT, + ((NC(I,J),I=1,MSTATES),J=1,MSTATES) 111 FORMAT(A4,1X,A10,2I5,F5.1,'%',20I5) RETURN END C END EVALPRED C...................................................................... C================================================================ c$$$ subroutine fetch_sw_seq(path,indexfile,datafile,kindex,kdat, c$$$ + MAXSQ,nres,name,compnd,ACCESSION,pdbref, c$$$ + seq,lend) c$$$ c$$$ implicit none c$$$C import c$$$ integer MAXSQ,kindex,kdat c$$$ character*(*) path,indexfile,datafile c$$$C export c$$$ integer nres c$$$ character*(*) name,compnd,ACCESSION,pdbref,seq c$$$ logical lend,lbinary c$$$C internal c$$$ integer maxchar,indexreclen,nsize c$$$ parameter (maxchar=38,indexreclen=40,nsize=12) c$$$ c$$$ integer i,j,ipos,jpos,irec,idatindex,ifile c$$$ logical lfound,lerror c$$$ character*132 templine,filename c$$$ character alphabet*(maxchar) c$$$ character testline*(indexreclen) c$$$ c$$$ alphabet='0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_.' c$$$ idatindex=0 c$$$ lbinary=.true. c$$$ c$$$ call concat_strings(path,indexfile,filename) c$$$ call open_file(kindex,filename, c$$$ + 'OLD,DIRECT,FORMATTED,READONLY,RECL=40',lerror) c$$$ c$$$ call strpos(name,i,j) c$$$ c$$$ lfound=.false. c$$$ ipos=index(alphabet,name(i:i)) c$$$ jpos=index(alphabet,name(i+1:i+1)) c$$$ irec= ( ( (ipos-1) * maxchar) + jpos) + 1 c$$$ read(kindex,'(2x,i8)',rec=irec)irec c$$$ if (irec .eq. 0)goto 900 c$$$ c$$$ do while(.not. lfound) c$$$ read(kindex,'(a)',rec=irec)testline c$$$ if (index (testline,name(i:j)) .ne. 0) then c$$$ read(testline,'(12x,a,i8,i8)') c$$$ + ACCESSION(1:nsize),idatindex,ifile c$$$ lfound=.true. c$$$ endif c$$$ irec=irec+1 c$$$ enddo c$$$ close(kindex) c$$$ if (idatindex .ge. 1) then c$$$ call concat_int_string(ifile,datafile,filename) c$$$ call concat_strings(path,filename,templine) c$$$ call open_file(kdat,templine,'OLD',lerror) c$$$ do i=1,idatindex-1 c$$$ read(kdat,'(a)')testline(1:1) c$$$ enddo c$$$ call get_swiss_entry(MAXSQ,kdat,lbinary,nres,name,compnd, c$$$ + ACCESSION,pdbref,seq,lend) c$$$ close(kdat) c$$$ return c$$$ endif c$$$900 WRITE(6,*)'*** ERROR: index in fetch_sw_seq is 0 ;' c$$$ WRITE(6,*)' or nothing found' c$$$ nres=0 c$$$ name=' ' c$$$ compnd=' ' c$$$ ACCESSION=' ' c$$$ pdbref=' ' c$$$ seq=' ' c$$$ return c$$$ end C====================================================================== C...................................................................... C...................................................................... C SUB FILLSIMMETRIC SUBROUTINE FILLSIMMETRIC(MAXRES,NTRANS,MAXSTRSTATES,maxiostates, + NSTRSTATES_1,NSTRSTATES_2,CSTRSTATES,SIMMETRIC,NRES, + LSEQ,LSTR,LACC,POSSIMMETRIC) IMPLICIT NONE INTEGER NTRANS,MAXRES,NRES INTEGER MAXSTRSTATES,maxiostates INTEGER NSTRSTATES_1,NSTRSTATES_2 CHARACTER*(*) CSTRSTATES REAL SIMMETRIC(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) INTEGER LSEQ(*),LACC(*),LSTR(*) REAL POSSIMMETRIC(MAXRES,*) C internal INTEGER I,J,ISTR C IF (NSTRSTATES_2 .GT. 1) THEN WRITE(6,*)' **** ERROR: nstrstates_2 .gt. 1' WRITE(6,*)' not possible to fill position dependend metric' STOP ENDIF DO I=1,NRES IF (NSTRSTATES_1 .GT.1) THEN ISTR=LSTR(I) IF (ISTR .EQ. 0)ISTR=1 ELSE ISTR=1 ENDIF IF (LSEQ(I) .EQ. 0) THEN DO J=1,NTRANS WRITE(6,*)'fillsimmetric: lseq unknown: ',lseq(i) POSSIMMETRIC(I,J)=0.0 ENDDO ELSE DO J=1,NTRANS c WRITE(6,'(a)')'fill i,j,lseq,lstr,lacc: ' c WRITE(6,'(5(i4))')i,j,lseq(i),istr,lacc(i) POSSIMMETRIC(I,J)=SIMMETRIC(LSEQ(I),J,ISTR,LACC(I),1,1) ENDDO ENDIF ENDDO RETURN END C END FILLSIMMETRIC C...................................................................... C...................................................................... C SUB FINDBRKFILE SUBROUTINE FINDBRKFILE(PDBFILE,PDBPATH,PID,KPDB,KLOG,LERROR) IMPLICIT NONE CHARACTER*(*) PDBFILE,PDBPATH,PID CHARACTER CEXT*30 LOGICAL LERROR INTEGER KPDB,KLOG C internal CHARACTER*200 LOGSTRING LERROR=.FALSE. cext='.brk' c cext='.pdb' IF (PDBPATH.EQ.' ') THEN CALL CONCAT_STRINGS(PID,CEXT,PDBFILE) ELSE CALL CONCAT_STRINGS(PID,CEXT,LOGSTRING) CALL CONCAT_STRINGS(PDBPATH,LOGSTRING,PDBFILE) ENDIF CALL OPEN_FILE(KPDB,PDBFILE,'OLD,READONLY',LERROR) IF (LERROR) THEN CALL CONCAT_STRINGS('PDB-FILE NOT FOUND: ',PDBFILE,LOGSTRING) CALL LOG_FILE(KLOG,LOGSTRING,1) ENDIF CLOSE(KPDB) RETURN END C END FINDBRKFILE C...................................................................... C...................................................................... C SUB GET_DEFAULT SUBROUTINE GET_DEFAULT() C get the system specific location of files C MAXHOM_DEFAULT is a logical name pointing to the maxhom.default file C VMS : assign $1:[schneider.public]maxhom.default C UNIX: setenv maxhom_default /home/schneider/public/maxhom.default C a file "maxhom.default" in the current directory has higher priority C METRIC_PATH : location of exchange metrices C SWISSPROT_SEQ : location of swissprot files C RELEASE_NOTES : release notes of EMBL/SWISSPROT C PDB_PATH : location of Brookhaven files C DSSP_PATH : location of DSSP files C COREPATH : directory path for corefile C COREFILE : where to put the temporary binary file to store the C alignments IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' c internal INTEGER IBEG,IEND LOGICAL LEXIST,LERROR CHARACTER*200 LINE CHARACTER*1 CDIVIDE C INIT LEXIST=.FALSE. METRICPATH=' ' SWISSPROT_SEQ=' ' SW_CURRENT=' ' SPLIT_DB_NAMES=' ' C SW_DATA=' ' ; SW_INDEX=' ' ; SW_PATH=' ' RELNOTES=' ' PDBPATH=' ' DSSP_PATH=' ' COREPATH=' ' COREFILE=' ' FILTER_FASTA_EXE=' ' FASTA_EXE=' ' FILTER_BLASTP_EXE=' ' BLASTP_EXE=' ' CONVERTSEQ_EXE=' ' CDIVIDE=':' C check existence of default file and open IF (MAXHOM_DEFAULT .EQ. ' ') THEN MAXHOM_DEFAULT= 'maxhom.default' ENDIF IF (MAXHOM_DEFAULT .NE. ' ') THEN INQUIRE(FILE=MAXHOM_DEFAULT,EXIST=LEXIST) ENDIF IF (LEXIST) THEN CALL STRPOS(MAXHOM_DEFAULT,IBEG,IEND) WRITE(6,*)' default file is: ',maxhom_default(ibeg:iend) CALL FLUSH_UNIT(6) LINE='OLD,READONLY' CALL OPEN_FILE(KDEF,MAXHOM_DEFAULT,LINE,LERROR) ELSE WRITE(6,*)' ERROR: can not find default file ' WRITE(6,*)' Check environment variable MAXHOM_DEFAULT or ' WRITE(6,*)' specify default file with option -d=filename ' call flush_unit(6) STOP ENDIF C read defaults DO WHILE(.TRUE.) c read(kdef,'(a)',end=999)line READ(KDEF,'(A)',END=999,ERR=999)LINE c WRITE(6,*)line(1:40) IF (LINE(1:2) .EQ. '##') THEN GOTO 999 ENDIF IF (LINE(1:1) .NE. '#' .AND. LINE .NE.' ') THEN CALL EXTRACT_STRING(LINE,CDIVIDE,'MACHINE',CMACHINE) CALL EXTRACT_STRING(LINE,CDIVIDE,'COREPATH',COREPATH) CALL EXTRACT_STRING(LINE,CDIVIDE,'COREFILE',COREFILE) CALL EXTRACT_STRING(LINE,CDIVIDE,'METRIC_PATH',METRICPATH) CALL EXTRACT_STRING(LINE,CDIVIDE,'SWISSPROT_SEQ', + SWISSPROT_SEQ) CALL EXTRACT_STRING(LINE,CDIVIDE,'SWISSPROT_CURRENT', + SW_CURRENT) CALL EXTRACT_STRING(LINE,CDIVIDE,'SPLIT_DB',SPLIT_DB_NAMES) c call extract_string(line,cdivide,'SWISSPROT_INDEX',sw_index) c call extract_string(line,cdivide,'SWISSPROT_PATH',sw_path) c call extract_string(line,cdivide,'SWISSPROT_DATA',sw_data) CALL EXTRACT_STRING(LINE,CDIVIDE,'RELEASE_NOTES', + RELNOTES) CALL EXTRACT_STRING(LINE,CDIVIDE,'PDB_PATH',PDBPATH) CALL EXTRACT_STRING(LINE,CDIVIDE,'DSSP_PATH',DSSP_PATH) CALL EXTRACT_STRING(LINE,CDIVIDE,'FILTER_FASTA_EXE', + FILTER_FASTA_EXE) CALL EXTRACT_STRING(LINE,CDIVIDE,'FASTA_EXE',FASTA_EXE) CALL EXTRACT_STRING(LINE,CDIVIDE,'FILTER_BLASTP_EXE', + FILTER_BLASTP_EXE) CALL EXTRACT_STRING(LINE,CDIVIDE,'BLASTP_EXE',BLASTP_EXE) CALL EXTRACT_STRING(LINE,CDIVIDE,'CONVERTSEQ_EXE', + CONVERTSEQ_EXE) ENDIF ENDDO 999 CLOSE(KDEF) IF (INDEX(CMACHINE,'UNIX').NE.0) THEN CMACHINE='UNIX' ELSE IF (INDEX(CMACHINE,'VMS').NE.0) THEN CMACHINE='VMS' ELSE WRITE(6,*)' *** MACHINE type UNKNOWN (assume UNIX) ***' CMACHINE='UNIX' ENDIF IF (COREFILE .EQ. ' ' ) THEN WRITE(6,*)' ERROR: COREFILE UNDEFINED' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' STOP ELSE IF (COREPATH .EQ. ' ' ) THEN WRITE(6,*)' WARNING: COREPATH UNDEFINED' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (METRICPATH .EQ. ' ') THEN WRITE(6,*)' ERROR: METRIC_PATH undefined' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' STOP ELSE IF (SWISSPROT_SEQ .EQ. ' ') THEN WRITE(6,*)' WARNING: SWISSPROT_SEQ undefined ' WRITE(6,*)' no search against database possible ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (SPLIT_DB_NAMES .EQ. ' ') THEN WRITE(6,*)' WARNING: SPLIT_DB undefined ' WRITE(6,*)' no parallel search against database possible ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' c else if (sw_data .eq. ' ') then c WRITE(6,*)' WARNING: SW_DATA undefined ' c WRITE(6,*)' no search against database possible ' c WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' c else if (sw_index .eq. ' ') then c WRITE(6,*)' WARNING: SW_INDEX undefined ' c WRITE(6,*)' no search against database possible ' c WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' c else if (sw_path .eq. ' ') then c WRITE(6,*)' WARNING: SW_PATH undefined ' c WRITE(6,*)' no search against database possible ' c WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (SW_CURRENT .EQ. ' ') THEN WRITE(6,*)' WARNING: SWISSPROT_CURRENT undefined ' WRITE(6,*)' no search with blastp possible ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (RELNOTES .EQ. ' ') THEN WRITE(6,*)' WARNING: RELEASE_NOTES undefined ' WRITE(6,*)' no information about database ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (PDBPATH .EQ. ' ') THEN WRITE(6,*)' WARNING: PDB_PATH undefined ' WRITE(6,*)' no superposition in 3-D possible ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (DSSP_PATH .EQ. ' ') THEN WRITE(6,*)' WARNING: DSSP_PATH undefined ' WRITE(6,*)' no check of pdb-pointers from SwissProt ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (FILTER_FASTA_EXE .EQ. ' ') THEN WRITE(6,*)' WARNING: FILTER_FASTA_EXE undefined ' WRITE(6,*)' no pre-filtered run against database ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (FASTA_EXE .EQ. ' ') THEN WRITE(6,*)' WARNING: FASTA_EXE undefined ' WRITE(6,*)' no FASTA-pre-filtered run against database ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (FILTER_BLASTP_EXE .EQ. ' ') THEN WRITE(6,*)' WARNING: FILTER_BLASTP_EXE undefined ' WRITE(6,*)' no pre-filtered run against database ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (BLASTP_EXE .EQ. ' ') THEN WRITE(6,*)' WARNING: BLASTP_EXE undefined ' WRITE(6,*)' no BLASTP-pre-filtered run against database ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ELSE IF (CONVERTSEQ_EXE .EQ. ' ') THEN WRITE(6,*)' WARNING: CONVERTSEQ_EXE undefined' WRITE(6,*)' at least no FASTA pre-filter possible ' WRITE(6,*)' PLEASE CHECK MAXHOM.DEFAULT FILE ' ENDIF c WRITE(6,*)' get_default end' RETURN END C END GET_DEFAULT C...................................................................... C...................................................................... C SUB GET_FASTA_DB_ENTRY SUBROUTINE GET_FASTA_DB_ENTRY(MAXSQ,KUNIT,NRES,NAME,COMPOUND, + ACCESSION,PDBREF,SEQ,LEND) IMPLICIT NONE C import INTEGER MAXSQ,KUNIT C export CHARACTER*(*) SEQ,NAME,COMPOUND,ACCESSION,PDBREF INTEGER NRES LOGICAL LEND C internal INTEGER I,J,K,LINELEN PARAMETER (LINELEN= 500) CHARACTER LINE*(LINELEN) C====================================================================== LEND=.FALSE. NRES=0 C===================================================================== READ(KUNIT,'(A)',END=900,ERR=999)LINE J=INDEX(LINE,'|') IF (J .GT. 0) THEN K=INDEX(LINE(J+1:),'|') IF (K .GT. 0) THEN I=INDEX(LINE,' ') NAME(1:)=LINE(2:I-1) COMPOUND(1:)=LINE(I+1:) LINE(J:J)=' ' K=INDEX(LINE,'|') ACCESSION(1:)=LINE(J+1:K-1) ELSE J=INDEX(LINE,' ') NAME(1:)=LINE(2:J) COMPOUND(1:)=LINE(2:) ACCESSION(1:)=' ' WRITE(6,*)'WARNING from get_fasta_db: '// + 'entry line looks strange (no |)' WRITE(6,*)LINE(1:60) ENDIF ELSE J=INDEX(LINE,' ') NAME(1:)=LINE(2:J) COMPOUND(1:)=LINE(2:) ACCESSION(1:)=' ' WRITE(6,*)'WARNING from get_fasta_db: '// + 'entry line looks strange (no |)' WRITE(6,*)LINE(1:60) ENDIF SEQ=' ' c sequences starts in next line 100 READ(KUNIT,'(A)',ERR=999,END=900) LINE IF (LINE(1:1) .EQ. '>' .AND. NRES .NE. 0) THEN BACKSPACE(KUNIT) ELSE DO I=1,LINELEN IF ( LINE(I:I) .NE. ' ' ) THEN NRES=NRES+1 IF (NRES .LE. MAXSQ ) THEN SEQ(NRES:NRES)=LINE(I:I) ELSE c truncate if needed WRITE(6,*)' SEQ CUT TO MAXSQ: ',MAXSQ CALL FLUSH_UNIT(6) NRES=MAXSQ 200 READ(KUNIT,'(A)',ERR=999,END=900) LINE IF (LINE(1:1) .EQ. '>' ) THEN BACKSPACE(KUNIT) RETURN ENDIF GOTO 200 ENDIF ENDIF ENDDO GOTO 100 ENDIF C====================================================================== 900 IF (NRES .EQ. 0)LEND=.TRUE. RETURN 999 WRITE(6,*)' ERROR in get_fasta_db_entry ',name,nres c call flush_unit(6) STOP END C END GET_FASTA_DB_ENTRY C...................................................................... C...................................................................... C SUB GET_LDIREC SUBROUTINE GET_LDIREC(ND1,ND2,LH2,II,JJ,LDEL_DIREC) IMPLICIT NONE INTEGER ND1,ND2,II,JJ,LDEL_DIREC INTEGER*2 LH2(0:ND1,0:ND2) c real lh(0:nd1,0:nd2,2) LDEL_DIREC =ABS( LH2(II,JJ) ) c scratch once used trace LH2(II,JJ)=-1 RETURN END C END GET_LDIREC C...................................................................... C...................................................................... C SUB GET_LDIREC_FAST SUBROUTINE GET_LDIREC_FAST(ND1,ND2,LH2,II,JJ,LDEL_DIREC) IMPLICIT NONE INTEGER ND1,ND2,II,JJ,LDEL_DIREC INTEGER*2 LH2(0:ND1,0:ND2) c real lh(0:nd1,0:nd2) LDEL_DIREC =ABS( LH2(II,JJ) ) c scratch once used trace LH2(II,JJ)=-1 RETURN END C END GET_LDIREC_FAST C====================================================================== C NOTE: ONLY TEMPRARY TO REDUCE MEMORY REQUIREMENTS FOR MAXHOM C MIXED ROUTINES FROM: C SYSTEM-LIB C UTILITY-LIB C PROTEIN-LIB C HSSP-LIB C====================================================================== C...................................................................... C SUB GET_SEQ SUBROUTINE GET_SEQ(KIN,FILENAME,TRANS,CHAINS,COMPND,ACCESSION, + PDBREF,PDBNO,NRES,SEQ,STRUC,ACC,TRUNCATED,ERROR) C 13.5.93 IMPLICIT NONE C Import INTEGER KIN CHARACTER*(*) CHAINS CHARACTER*(*) TRANS, FILENAME, COMPND, ACCESSION, PDBREF C Export INTEGER NRES INTEGER PDBNO(*), ACC(*) CHARACTER*(*) SEQ, STRUC LOGICAL TRUNCATED, ERROR C Internal INTEGER I,J, RLEN CHARACTER*20 FORMATNAME LOGICAL LACCZERO C====================================================================== ACCESSION=' ' PDBREF=' ' COMPND=' ' TRUNCATED=.FALSE. CALL CHECKFORMAT(KIN,FILENAME,FORMATNAME,ERROR) IF ( ERROR ) THEN WRITE(6,*)'GET_SEQ: FILE OPEN ERROR, SET NRES=0 AND RETURN' WRITE(6,*)'FILENAME: ', FILENAME RETURN ENDIF CALL STRPOS(FILENAME,I,J) C..initialize NRES = 0 PDBREF = ' ' DO I = 1,LEN(SEQ) SEQ(I:I) = '-' STRUC(I:I) = 'U' ACC(I) = 0 ENDDO INQUIRE(KIN,RECL=RLEN) IF (FORMATNAME .EQ. 'BRK') THEN CALL READ_BRK(KIN,FILENAME,CHAINS,TRANS,RLEN,NRES, 1 COMPND,SEQ,PDBNO,TRUNCATED,ERROR) ELSE IF (FORMATNAME .EQ. 'FASTA') THEN CALL READ_FASTA(KIN,FILENAME,TRANS,RLEN,NRES,ACCESSION, 1 COMPND,SEQ,TRUNCATED,ERROR) ELSE IF (FORMATNAME .EQ. 'PIR') THEN CALL READ_PIR(KIN,FILENAME,TRANS,RLEN,NRES,ACCESSION, 1 COMPND,SEQ,TRUNCATED,ERROR) ELSE IF (FORMATNAME .EQ. 'EMBL') THEN CALL READ_EMBL(KIN,FILENAME,TRANS,RLEN,NRES, 1 COMPND,ACCESSION,PDBREF,SEQ,TRUNCATED,ERROR) ELSE IF (FORMATNAME .EQ. 'GCG') THEN CALL READ_GCG(KIN,FILENAME,TRANS,RLEN,NRES, 1 COMPND,SEQ,TRUNCATED,ERROR) ELSE IF (FORMATNAME .EQ. 'STAR') THEN COMPND = ' ' CALL READ_STAR(KIN,FILENAME,TRANS,RLEN,NRES, 1 SEQ,TRUNCATED,ERROR) ELSE IF (FORMATNAME .EQ. 'DSSP') THEN CALL READ_SEQ_FROM_DSSP(KIN,FILENAME,CHAINS,TRANS,RLEN, 1 SEQ,STRUC,ACC,PDBNO,COMPND,NRES,LACCZERO,TRUNCATED,ERROR) IF (LACCZERO) THEN WRITE(6,*)'***************************************' WRITE(6,*)'* WARNING: accessibility values are 0 *' WRITE(6,*)'***************************************' ENDIF ELSE IF (FORMATNAME .EQ. 'HSSP') THEN CALL READ_SEQ_FROM_HSSP(KIN,FILENAME,CHAINS,TRANS,RLEN, 1 SEQ,STRUC,ACC,PDBNO,COMPND,NRES,LACCZERO,TRUNCATED,ERROR ) ENDIF IF ( ERROR ) RETURN CALL STRPOS(FILENAME,I,J) WRITE(6,'(A,A10,A,A,A,I5)')'GET_SEQ: ',FORMATNAME,':', + FILENAME(1:J),' ',NRES IF ( TRUNCATED ) THEN WRITE(6,*)'TRUNCATED TO ',len(seq),nres,' RESIDUES' WRITE(6,*)'!!! INCREASE DIMENSION !!!' NRES=LEN(SEQ) ENDIF RETURN END C END GET_SEQ C...................................................................... C...................................................................... C SUB GET_SEQ_FROM_ALISEQ SUBROUTINE GET_SEQ_FROM_ALISEQ(ALISEQ,IFIR,ILAS,ALIPOINTER, 1 ALILEN,ALINO,SEQUENCE,NRES,ERROR) C 8.7.93 IMPLICIT NONE C Import INTEGER ALILEN, ALINO INTEGER IFIR(*),ILAS(*),ALIPOINTER(*) CHARACTER ALISEQ(*) C EXPORT INTEGER NRES CHARACTER*(*) SEQUENCE LOGICAL ERROR C INTERNAL INTEGER IPOS CHARACTER CGAPCHAR CGAPCHAR = '.' IF ( ALILEN .GT. LEN(SEQUENCE) ) THEN ERROR = .TRUE. WRITE(6,'(A)') 1 ' MAXRES overflow in get_seq_from_aliseq !' RETURN ENDIF NRES = 0 DO WHILE ( NRES .LT. IFIR(ALINO)-1 ) NRES = NRES + 1 SEQUENCE(NRES:NRES) = CGAPCHAR ENDDO DO IPOS = ALIPOINTER(ALINO), 1 ALIPOINTER(ALINO)+ILAS(ALINO)-IFIR(ALINO) NRES = NRES + 1 SEQUENCE(NRES:NRES) = ALISEQ(IPOS) ENDDO DO WHILE ( NRES .LT. ALILEN ) NRES = NRES + 1 SEQUENCE(NRES:NRES) = CGAPCHAR ENDDO RETURN END C END GET_SEQ_FROM_ALISEQ C...................................................................... C...................................................................... C SUB GETALIGN SUBROUTINE GETALIGN(KFILE,IRECORD,IFIR,LEN1,LENOCC,JFIR,JLAS, + IDEL,NDEL,VALUE,RMS,HOM,SIM,SDEV,DISTANCE,CHECKVAL) C GET ONE ALIGNMENT AS WRITTEN BY TRACE C an alignment is: C * LDSSP_2 NAME_2 COMPOUND ACCESSION PDBREF VALUE IFIR LEN1 LENOCC C C JFIR JLAS N2IN IDEL NDEL NSHIFTED RMS HOM SIM DISTANCE C C AL_2 [ SAL_2 (if ldssp_2 ] C C====================================================================== IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' c input INTEGER KFILE,IRECORD REAL CHECKVAL c output C CHARACTER AL_1*(*) C CHARACTER AL_2*(*),SAL_2*(*) INTEGER IFIR,JFIR,JLAS,IDEL,NDEL,LEN1,LENOCC REAL VALUE,SIM,SDEV,HOM,RMS,DISTANCE C INSERTIONS IN SEQ 2 C INTEGER IINS,INSLEN_LOCAL(*),INSBEG_1_LOCAL(*),INSBEG_2_LOCAL(*) C CHARACTER INSSEQ*(*) C INTERNAL INTEGER INSPOINTER_LOCAL CHARACTER LINE(4)*(MAXRECORDLEN) CHARACTER C*1 INTEGER K,IALIPOS,JALIPOS,IPOS,I,NLINE,IBEG,IEND REAL XCHECK C INIT C AL_1= ' ' C= ' ' LDSSP_2= .FALSE. NAME_2= ' ' C OMPND_2=' ' ACCESSION_2=' ' PDBREF_2= ' ' AL_2= ' ' SAL_2= ' ' LINE(1)= ' ' LINE(2)= ' ' LINE(3)= ' ' INSSEQ= ' ' IFIR= 0 LEN1= 0 LENOCC= 0 JFIR= 0 JLAS= 0 N2IN= 0 IDEL= 0 NDEL= 0 NSHIFTED= 0 VALUE= 0.0 RMS= 0.0 HOM= 0.0 SIM= 0.0 DISTANCE= 0.0 IINS= 0 INSLEN_LOCAL(1)= 0 INSBEG_1_LOCAL(1)=0 INSBEG_2_LOCAL(1)=0 SDEV= 0.0 LCONSIDER= .TRUE. READ(KFILE,REC=IRECORD)C,LCONSIDER,VALUE IF (C .NE. '*') THEN WRITE(6,*)C,IRECORD WRITE(LOGSTRING,'(A)') + '*** ERROR: INCORRECT RECORD BOUNDARY IN GETALIGN' CALL LOG_FILE(KLOG,LOGSTRING,1) STOP ENDIF C WRITE(6,*)LCONSIDER,VALUE ; CALL FLUSH_UNIT(6) C---- -------------------------------------------------- C---- only for alignments to take! C---- -------------------------------------------------- IF (LCONSIDER) THEN IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)NAME_2 IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)COMPND_2 IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)ACCESSION_2,PDBREF_2,LDSSP_2 IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)IFIR,LEN1,LENOCC,JFIR,JLAS,N2IN, + IDEL,NDEL,NSHIFTED,RMS,HOM,SIM,SDEV, + DISTANCE,IINS XCHECK=0.0 IF (CSORTMODE .EQ. 'DISTANCE' ) THEN XCHECK = DISTANCE ELSE IF (CSORTMODE .EQ.'VALUE' .OR. CSORTMODE.EQ.'ZSCORE') THEN XCHECK = VALUE ELSE IF (CSORTMODE .EQ. 'WSIM' ) THEN XCHECK = SIM ELSE IF (CSORTMODE .EQ. 'SIM' ) THEN XCHECK = SIM ELSE IF (CSORTMODE .EQ. 'SIGMA' ) THEN XCHECK = VALUE / SDEV ELSE IF (CSORTMODE .EQ. 'IDENTITY' ) THEN XCHECK = HOM ELSE IF (CSORTMODE .EQ. 'VALPER' ) THEN XCHECK = VALUE/FLOAT(LENOCC) ELSE IF (CSORTMODE .EQ. 'VALFORM' ) THEN XCHECK=VALUE*(LENOCC**(-0.56158)) ENDIF IF (CSORTMODE .NE. 'ZSCORE' .AND. CSORTMODE .NE. 'NO' ) THEN IF ( ABS (XCHECK-CHECKVAL) .GT. 0.01 ) THEN LOGSTRING=' ' WRITE(LOGSTRING,'(A,F7.2,A,F7.2,A,A)') + '** ERROR: XCHECK.NE.CHECKVAL ',XCHECK,' ', + CHECKVAL,' ',CSORTMODE CALL LOG_FILE(KLOG,LOGSTRING,1) STOP ENDIF ENDIF C IF (.NOT. LDSSP_2) THEN DO K=1,LEN1 SAL_2(K:K)='U' ENDDO ENDIF IALIPOS=1 JALIPOS=MIN(LEN1,MAXRECORDLEN) DO WHILE(IALIPOS .LE. LEN1) IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)LINE(2) IF (LDSSP_2) THEN IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)LINE(3) IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)LINE(4) ENDIF IPOS=1 DO I=IALIPOS,JALIPOS AL_2(I:I)=LINE(2)(IPOS:IPOS) IF (LDSSP_2) THEN SAL_2(I:I)=LINE(3)(IPOS:IPOS) READ(LINE(4)(IPOS:IPOS),'(I1)')LACC_2(I) ENDIF IPOS=IPOS+1 ENDDO IALIPOS=JALIPOS+1 JALIPOS=MIN(LEN1,JALIPOS+MAXRECORDLEN) ENDDO C READ INSERTIONS IF (IINS .GT. 0) THEN INSPOINTER_LOCAL=1 DO I=1,IINS IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)INSLEN_LOCAL(I),INSBEG_1_LOCAL(I), + INSBEG_2_LOCAL(I) INSPOINTER_LOCAL=INSPOINTER_LOCAL+INSLEN_LOCAL(I)+3 ENDDO IF ( MOD(FLOAT(INSPOINTER_LOCAL),FLOAT(MAXRECORDLEN)) .EQ. + 0.0) THEN NLINE= INSPOINTER_LOCAL/MAXRECORDLEN ELSE NLINE=(INSPOINTER_LOCAL/MAXRECORDLEN ) +1 ENDIF IBEG=1 IEND=MAXRECORDLEN DO I=1,NLINE IRECORD=IRECORD+1 READ(KFILE,REC=IRECORD)INSSEQ(IBEG:IEND) IBEG=IEND+1 IEND=IEND+MAXRECORDLEN ENDDO ENDIF ENDIF C end of LCONSIDER RETURN END C END GETALIGN C...................................................................... C...................................................................... ***** ------------------------------------------------------------------ ***** SUB GETARRAYINDEX ***** ------------------------------------------------------------------ C---- C---- NAME : GETARRAYINDEX C---- ARG : 1 CARRAY(1:NMAX) = array with strings C---- ARG : 2 CSTRING = string to find in array C---- ARG : 3 NMAX = maximal number of elements of carray C---- ARG : 4 INDEX = index of element matching C---- DES : Checks whether or not the string CSTRING equals C---- DES : any of the strings in CARRAY. C---- DES : if yes: returns the number of the array element matching C---- DES : if not: returns 0 C---- *----------------------------------------------------------------------* SUBROUTINE GETARRAYINDEX(CARRAY,CSTRING,NMAX,IINDEX) IMPLICIT NONE C does not contain CSTRING C Import INTEGER NMAX C---- br 99.03: watch hard_coded here, see maxhom.param C---- lkajan: let us not force that size - it caused problems with the C---- following CARRAY(IINDEX).EQ.CSTRING - that did not match when C---- it should have. CHARACTER*(*) CARRAY(NMAX) C---- --> REASON: the following produces warnings on SGI C CHARACTER*(*) CARRAY(*) C CHARACTER*(*) CARRAY(NMAX) CHARACTER*(*) CSTRING C internal INTEGER IINDEX,i LOGICAL LNOT ******------------------------------*-----------------------------****** C---- ini IINDEX= 1 LNOT= .TRUE. C---- count up until ctest matches DO WHILE (LNOT) C---- leave when at end of string IF (LNOT .AND. IINDEX.GT.NMAX) LNOT=.FALSE. C---- leave when match C---- hack br 99.03: SGI compiler crashes if IINDEX too high C---- (if in one if (not and carry=string)!) IF (LNOT) THEN C WRITE(6,*)'>',CARRAY(IINDEX),'<>',CSTRING,'<' IF (CARRAY(IINDEX).EQ.CSTRING) LNOT=.FALSE. ENDIF C------- count up IF (LNOT) IINDEX=IINDEX+1 ENDDO C---- none found -> return 0 IF (IINDEX .GT. NMAX ) IINDEX = 0 RETURN END C END GETARRAYINDEX C...................................................................... C...................................................................... C SUB GETBEST SUBROUTINE GETBEST(IPOSBEG,IPOSEND,JPOSBEG,JPOSEND,NREGION, + NTEST,LH1,LH2,ND1,ND2,BESTVAL,BESTIIPOS,BESTJJPOS) C search the LH matrix for the best value, where the trace was not C used in a previous alignment IMPLICIT NONE INCLUDE 'maxhom.param' c import INTEGER IPOSBEG,IPOSEND,JPOSBEG,JPOSEND,NREGION,NTEST INTEGER ND1,ND2 REAL LH1(0:ND1,0:ND2) INTEGER*2 LH2(0:ND1,0:ND2) C REAL LH(0:ND1,0:ND2,*) C EXPORT INTEGER BESTIIPOS,BESTJJPOS REAL BESTVAL C INTERNAL INTEGER I,J,II,JJ,LDIREC LOGICAL LDONE_BEFORE REAL BEST,BEST_II(0:MAXSQ+1) INTEGER ITEMP,JTEMP,TEMP_II(0:MAXSQ+1), + TEMP_JJ(0:MAXSQ+1) *----------------------------------------------------------------------* C INIT BESTVAL=0.00000000 BESTIIPOS=0 BESTJJPOS=0 C horizontal path : ldirec=40000 ; ldel<=MAXSQ C vertical path : ldirec=30000 ; ldel<=MAXSQ C diagonal match : ldirec=20000 ; ldel=0 C unmatched terminal sequence : ldirec=10000 ; ldel=0 IF (NTEST .LT. NREGION) THEN C GET BEST VALUE DO I=IPOSBEG,IPOSEND BEST_II(I)=0.0 TEMP_II(I)=0 TEMP_JJ(I)=0 ENDDO DO J=JPOSEND,JPOSBEG,-1 DO I=IPOSBEG,IPOSEND IF (LH1(I,J) .GT. BEST_II(I)+0.0001 ) THEN BEST_II(I)= LH1(I,J) TEMP_II(I) = I TEMP_JJ(I) = J ENDIF ENDDO ENDDO DO I=IPOSEND,IPOSBEG,-1 IF (BEST_II(I) .GT. BESTVAL+0.0001) THEN BESTVAL=BEST_II(I) BESTIIPOS=TEMP_II(I) BESTJJPOS=TEMP_JJ(I) C WRITE(6,*)BESTVAL,BEST_II(I),BESTIIPOS,BESTJJPOS ENDIF ENDDO ELSE C TRACE BACK TILL END FOR EACH NEW BEST VALUE DO J=JPOSEND,JPOSBEG,-1 DO I=IPOSEND,IPOSBEG,-1 IF ( LH1(I,J) .GT. BESTVAL+0.0001 ) THEN LDONE_BEFORE=.FALSE. BEST=LH1(I,J) ITEMP=I JTEMP=J II=I JJ=J DO WHILE ( .NOT. LDONE_BEFORE .AND. + LH2(II,JJ) .NE. 0 .AND. + II .GT. IPOSBEG .AND. JJ .GT. JPOSBEG) LDIREC= ABS( LH2(II,JJ) ) IF (LDIREC .GT. 20000 ) THEN II=II - ( LDIREC - 20000 ) ELSE IF (LDIREC .GT. 10000 ) THEN JJ=JJ - ( LDIREC - 10000 ) ELSE IF (LH2(II,JJ) .EQ. -1) THEN LDONE_BEFORE=.TRUE. ELSE IF (LDIREC .EQ. 1) THEN II=II-1 JJ=JJ-1 ELSE WRITE(6,*)'GETBEST: LDIREC UNKNOWN: ',LDIREC ENDIF ENDDO IF (.NOT. LDONE_BEFORE) THEN BESTVAL=BEST BESTIIPOS=ITEMP BESTJJPOS=JTEMP ENDIF ENDIF ENDDO ENDDO ENDIF RETURN END C END GETBEST C...................................................................... C...................................................................... C SUB GETCHAINBREAKS SUBROUTINE GETCHAINBREAKS(NRES,LSQ,STRUC,TRANS,NBREAK,IBREAKPOS) C RS 89 C search for chain break(s) and store position(s) in array IBREAKPOS C total number of breaks in protein are in NBREAK C used to disallow alignments over chain breaks C and to check pieces from DSSP and BRK if superpositon in 3-D wanted C import INTEGER LSQ(*) CHARACTER TRANS*(*) C EXPORT INTEGER IBREAKPOS(*),NBREAK CHARACTER*(*) STRUC(*) C INTERNAL INTEGER ILEN ILEN=LEN(TRANS) NBREAK=0 IBREAKPOS(1)=0 IND=INDEX(TRANS(1:ILEN),'!') DO IRES=1,NRES IF (LSQ(IRES) .EQ. IND) THEN NBREAK=NBREAK+1 IBREAKPOS(NBREAK)=IRES STRUC(IRES)='!' C WRITE(6,*)' CHAINBREAK : ',IRES ENDIF ENDDO RETURN END C END GETCHAINBREAKS C...................................................................... C...................................................................... C SUB GETCHAR SUBROUTINE GETCHAR(KCHAR,CHARARR,CTEXT) C prompts for characters CHARACTER*(*) CTEXT,CHARARR CHARACTER*(KCHAR) LINE INTEGER IMAX IMAX=LEN(CHARARR) WRITE(6,*)'================================================='// + '==============================' CALL WRITELINES(CTEXT) 10 CONTINUE WRITE(6,*) WRITE(6,'(a,i3,a)')' Enter string of length < ',imax, + ' [CR=default]' WRITE(6,*)' ' CALL STRPOS(CHARARR,IBEG,IEND) IF (IBEG .GT. 0 .AND. IEND .GT. 0) THEN WRITE(6,'(a,a)')' Default: ',chararr(ibeg:iend) ELSE WRITE(6,'(a,a)')' Default: ',chararr ENDIF WRITE(6,*)' ' LINE=' ' READ(*,'(A)',ERR=10,END=11) LINE IF ( LINE .NE. ' ' ) THEN C assuming default values were set outside .... CALL STRPOS(LINE,IBEG,IEND) c do i=1,iend c iascii=ichar(line(i:i)) c if (iascii .lt. 32 .or. iascii .gt. 126) then c WRITE(6,*)'*** Characters only, NOT: ',line(1:iend) c GOTO 10 c endif c enddo c iend=min(iend,imax) CHARARR(1:)=LINE(1:IEND) ENDIF iend=min(iend,len(chararr)) 11 WRITE(6,'(a,a)')' echo: ',chararr(1:iend) RETURN END C END GETCHAR C...................................................................... C...................................................................... C SUB GETCONSWEIGHT SUBROUTINE GETCONSWEIGHT(NRES,IALIGN,LSEQ_1) C conservation weights: C fix weights between 1.0 and 0.1 C where 0.0 means random distribution, because of moise its possible C that cons-weights have small negative values C so cons-weights <0.1 are set to 0.1 C ISAFE is here +5 C======================================================================= IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C import INTEGER NRES,IALIGN,LSEQ_1(*) C internal INTEGER ISMALL,KFILE,ISAFERANGE,I,IRES,IALNEW,IDEL,NDEL,IAL, + IFIR,JFIR,JLAS,LEN1,LENOCC,IBEG,IEND,IAGR,ILEN,JPOS, + NPOS,IPOS,IRECORD,IND REAL SEQDIST,CHECKVAL,RMS,VALUE,HOM,SIM,DISTANCE,SUM, + MEAN,XVAL,SDEV CHARACTER*500 CONSEVOLUTION CHARACTER*200 CTMP LOGICAL LEVOLUTION,LDUMMY,LERROR C---- ------------------------------------------------------------------ C---- C---- defaults, ini C---- C---- FORMULA+ISAFERANGE -> include into averaging ISAFERANGE= 5 C---- BR 99.0x: make 'safer' for weights ISAFERANGE= 5 LDUMMY= .TRUE. DO IRES=1,NRES NOCC(IRES)=0 ENDDO C---- C---- write cons-weights after each alignment (if lconsider=.true.) for C---- inspection of the conservation-weights evolution LEVOLUTION=.FALSE. C LEVOLUTION=.TRUE. IF (LEVOLUTION .AND. LFIRSTWEIGHT) THEN DO I=1,MAXSQ SUMDISTANCE(I)= 0.0 SUMVARIABILITY(I)=0.0 ENDDO CALL CONCAT_STRINGS(HSSPID_1,'_EVOLUTION.DAT',CONSEVOLUTION) CALL OPEN_FILE(KCONS,CONSEVOLUTION,'NEW',LERROR) WRITE(6,*)' BACK OPEN' WRITE(KCONS,'(A,A)')'## ',NAME_1 WRITE(KCONS,'(A)')'## EVOLUTION OF CONSERVATION WEIGHTS' WRITE(KCONS,'(A)')'## IALIGN: Number of alignment above '// + 'threshold (list position )' WRITE(KCONS,'(A)')'## IRES: residue number (test-seq)' WRITE(KCONS,'(A)')'## WEIGHT: conservation weight' WRITE(KCONS,'(A)')'## I4,2X,I4,F7.2' WRITE(kcons,'(a)')'## IALIGN IRES WEIGHT ' DO IRES=1,NRES WRITE(KCONS,'(I4,2X,I4,F7.2)')0,IRES,CONSWEIGHT_1(IRES) ENDDO LFIRSTWEIGHT=.FALSE. ENDIF C---- C---- loop over new alis (depends on NBEST and/or no of chain breaks) CAUTION kfile has to be open CHANGE in future DO IALNEW=IALIGNOLD+1,IALIGN IRECORD=IRECPOI(IALNEW) IF (IRECORD .GT. 0 ) THEN C KFILE=KCORE -ISMALL + IFILEPOI(IALNEW) KFILE=KCORE CHECKVAL=ALISORTKEY(IALNEW) CALL GETALIGN(KFILE,IRECORD,IFIR,LEN1,LENOCC,JFIR,JLAS, + IDEL,NDEL,VALUE,RMS,HOM, + SIM,SDEV,DISTANCE,CHECKVAL) C LDUMMY (=LFORMULA) is true C use formula+ISAFERANGE percent for the calculation of cons-weight C---- new switch: parameter set in maxhom.param IF (LNEWCURVE) THEN CALL CHECKHSSPCUT99(LENOCC,HOM*100.0,ISOLEN,ISOIDE,NSTEP, + LDUMMY,LALL,ISAFERANGE,LCONSIDER,DISTANCE) ELSE CALL CHECKHSSPCUT(LENOCC,HOM*100.0,ISOLEN,ISOIDE,NSTEP, + LDUMMY,LALL,ISAFERANGE,LCONSIDER,DISTANCE) ENDIF IABOVE(IALNEW)=0 C BR 99.09: found a bug (this was missing) IF (.NOT. LCONSIDER) THEN AL_EXCLUDEFLAG(IALNEW)='*' ELSE IABOVE(IALNEW)=1 IFIRST(IALNEW)=IFIR ILAST(IALNEW)= IFIR+LEN1-1 C C FIRST: convert lower case characters in HSSP-alignment C to UPPER CASE C AND convert to INTEGER C IPOS=IFIR IF (ISEQPOS+LEN1+1 .LE. MAXSEQBUFFER) THEN DO IRES=1,LEN1 C write(6,*)'ipos=',ipos,' ires=',ires, C + ' al2=',al_2(ires:ires) IF (AL_2(IRES:IRES) .GE. 'a' .AND. + AL_2(IRES:IRES) .LE. 'z') THEN C hack br 2003-10-13 CTMP=AL_2(IRES:IRES) CALL LOWTOUP(CTMP,1) Cold SEQBUFFER(ISEQPOS+IRES-1)= Cold + CHAR(ICHAR(AL_2(IRES:IRES))-32 ) SEQBUFFER(ISEQPOS+IRES-1)=CTMP(1:1) C hack br 2003-10-13 IND=TRANSPOS(ICHAR(CTMP(1:1))) ELSE SEQBUFFER(ISEQPOS+IRES-1)=AL_2(IRES:IRES) IND=TRANSPOS(ICHAR(AL_2(IRES:IRES))) ENDIF IF (IND .NE. 0) THEN LSEQ_2(IPOS)=IND ELSE LSEQ_2(IPOS)=0 WRITE(6,'(A)')'** UNKNOWN RESIDUE 1: ' + //AL_2(IRES:IRES) ENDIF IPOS=IPOS+1 ENDDO ISEQPOINTER(IALNEW)=ISEQPOS ISEQPOS=ISEQPOS+LEN1+1 SEQBUFFER(ISEQPOS)='/' ELSE WRITE(6,*)' ERROR: MAXSEQBUFFER OVERFLOW' STOP ENDIF C--- accumulate SUMVARIABILITY/SUMDISTANCE C--- for the pair test-seq - new ali SEQDIST=1.0-HOM DO IRES=IFIRST(IALNEW),ILAST(IALNEW) IF (LSEQ_1(IRES).NE.0 .AND. LSEQ_2(IRES).NE.0) THEN SUMVARIABILITY(IRES)= + SUMVARIABILITY(IRES) + + (SEQDIST * SIMCONSERV(LSEQ_1(IRES),LSEQ_2(IRES)) ) ENDIF SUMDISTANCE(IRES)=SUMDISTANCE(IRES) + SEQDIST NOCC(IRES)=NOCC(IRES)+1 ENDDO C If profiles are used, conservation weights are calculated from C the comparison between test sequence and aligned sequences (not C between aligned seqs) IF (.NOT. LPROFILE_1 .AND. .NOT. LPROFILE_2) THEN DO IAL=1,IALIGNOLD IF (IABOVE(IAL) .EQ. 1) THEN C DO the 2 alignments overlap ? C---- C---- 98-10: br C---- correct bug C---- Cbug IF (IFIRST(IAL) .LT. IFIRST(IALNEW) .OR. Cbug + ILAST(IAL) .LT. ILAST(IALNEW)) THEN Cbug SEQDIST=0.0 IF (IFIRST(IAL) .LT. ILAST(IALNEW) .OR. + ILAST(IAL) .LT. IFIRST(IALNEW)) THEN SEQDIST=0.0 ELSE C GET OVERLAP RANGE IBEG=MAX(IFIRST(IAL),IFIRST(IALNEW)) IEND=MIN(ILAST(IAL),ILAST(IALNEW)) IRES=IBEG DO JPOS=IBEG-IFIRST(IAL)+1,IEND-IFIRST(IAL)+1 IND= + TRANSPOS(ICHAR(SEQBUFFER(ISEQPOINTER(IAL)+JPOS-1))) IF (IND .NE. 0) THEN LSEQTEMP(IRES)=IND ELSE LSEQTEMP(IRES)=0 WRITE(6,'(A)')'* UNKNOWN RESIDUE 2:'// + SEQBUFFER(ISEQPOINTER(IAL)+JPOS-1) ENDIF IRES=IRES+1 ENDDO C GET THE IDENTITIES AND LENGTH OF THE OVERLAPPING PART IAGR=0 ILEN=0 DO IRES=IBEG,IEND IF (LSEQ_2(IRES).NE.0 .AND. + LSEQTEMP(IRES).NE.0) THEN ILEN=ILEN+1 IF (LSEQ_2(IRES) .EQ. + LSEQTEMP(IRES))IAGR=IAGR+1 ENDIF IBOTH_LEGAL(IRES)=0 IF (LSEQ_2(IRES).NE.0 .AND. + LSEQTEMP(IRES).NE.0) THEN IBOTH_LEGAL(IRES)=1 SIMVAL(IRES)= + SIMCONSERV(LSEQ_2(IRES),LSEQTEMP(IRES)) ENDIF ENDDO C--- accumulate SUMVARIABILITY/SUMDISTANCE C--- for the pair NEW_ALI OLD_ALI IF (ILEN.NE.0) THEN SEQDIST=1-(FLOAT(IAGR)/FLOAT(ILEN)) DO IRES=IBEG,IEND IF (IBOTH_LEGAL(IRES) .EQ. 1) THEN SUMVARIABILITY(IRES)= + SUMVARIABILITY(IRES)+(SEQDIST*SIMVAL(IRES)) SUMDISTANCE(IRES)= + SUMDISTANCE(IRES)+SEQDIST ENDIF ENDDO ENDIF ENDIF ENDIF C LOOP OVER OLD ALIS ENDDO C .NOT. LPROFILE ENDIF C UPDATE WEIGHTS FOR OVERLAPPING RANGE BETWEEN TEST-SEQ AND NEW ALI DO IRES=IFIRST(IALNEW),ILAST(IALNEW) IF (SUMDISTANCE(IRES).NE.0.0) THEN CONSWEIGHT_1(IRES)= + (SUMVARIABILITY(IRES)/SUMDISTANCE(IRES)) C NO NEGATIVE VALUES FOR CONS-WEIGHT IF (CONSWEIGHT_1(IRES).LT.CONSMIN) THEN CONSWEIGHT_1(IRES)=CONSMIN ENDIF ENDIF ENDDO C WRITE CONSERVATION WEIGHTS TO FILE IF (LEVOLUTION) THEN C CALL CONCAT_STRINGS(HSSPID_1,'_EVOLUTION.DAT', C + CONSEVOLUTION) C CALL OPEN_FILE(KCONS,CONSEVOLUTION,'OLD,APPEND',LERROR) WRITE(KCONS,'(A,A)')'## ',NAME_2(1:50) DO IRES=1,NRES WRITE(KCONS,'(I4,2X,I4,F7.2)')IALIGN,IRES, + CONSWEIGHT_1(IRES) ENDDO CLOSE(KCONS) ENDIF C C else: do NOT take (said CHECKHSSPCUT) -> updata flags! C ENDIF C LCONSIDER ENDIF C LOOP OVER NEW ALIS ENDDO 99 SUM=0.0 NPOS=0 MEAN=1.0 DO I=1,NRES IF (NOCC(I).NE.0) THEN SUM=SUM+CONSWEIGHT_1(I) NPOS=NPOS+1 ENDIF ENDDO IF (NPOS .NE. 0) THEN MEAN=SUM/NPOS ENDIF C WRITE(6,*)'GETCONSWEIGHT: SUM,MEAN ',SUM,MEAN IF (MEAN.GT. 0.99 .AND. MEAN .LT. 1.01) RETURN XVAL=1.0-MEAN DO I=1,NRES IF (NOCC(I).NE.0) CONSWEIGHT_1(I)=CONSWEIGHT_1(I)+XVAL ENDDO GOTO 99 END C END GETCONSWEIGHT C...................................................................... C...................................................................... C SUB GETCOORFORHSSP SUBROUTINE GETCOORFORHSSP(INFILE,INUNIT,CIDPROT,NRES,NATM, + MXRES,MXATM,CIDRES,IPATMRES,RCA,CIDATM,IPRESATM,R) C AUTION HERE 'TER' LINES (CHAIN TERMINATORS) ARE COUNT AS RESIDUES C BECAUSE PIECES COME FROM DSSP-SEQUENCE (CHAIN BREAKS INCREMENT C RESIDUE COUNTER) *RS 89 C C GET-COOR-BROOK:SYMB.....CHRIS SANDER....MAY 1983... C FINAL DEFINITIVE PROTEIN DATA BANK COORDINATE INPUT C ADAPTED FROM GCOOR OF SEGSEG, BUT WIHTOUT ADDED HYDROGENS AND C WITH ALTERED DATA STRUCTURE C FILE ATTRIBUTES CHARACTER*(*) INFILE INTEGER INUNIT C PROTEIN ATTRIBUTES C HEADER,COMPOUND,SOURCE,AUTHOR,RESOLUTION CHARACTER*(*) CIDPROT(*) C NUMBER OF RESIDUES, ATOMS INTEGER NRES,NATM C RESIDUE ATTRIBUTES CHARACTER*(*) CIDRES(*) C POINTS TO FIRST, LAST AND CA ATOM. INTEGER IPATMRES(3,*) C C(ALPHA) COORDINATES REAL RCA(3,*) C ATOM ATTRIBUTES CHARACTER*(*) CIDATM(MXATM) C ATOM BELONGS TO RESIDUE NUMBER IPRESATM INTEGER IPRESATM(*) REAL R(3,*) C LOCAL STORAGE CHARACTER SEQ*3,LINE*200,ALT*1 INTEGER NLIN LOGICAL OVERFLOW,LERROR C EXECUTE NRES=0 NATM=0 DO KI=1,5 CIDPROT(KI)=' ' ENDDO OVERFLOW=.FALSE. WRITE(6,*)'GETCOOR: OPEN ',infile(1:40) CALL OPEN_FILE(INUNIT,INFILE,'OLD,READONLY',LERROR) IF (LERROR) THEN WRITE(6,*)' OPEN FILE ERROR IN GETCOOR: ',infile(1:40) WRITE(6,*)' ....return with NRES=NATM=0 ' RETURN ENDIF C LOOP OVER LINES IA=0 IR=0 C ATOM, RESIDUE AND LINE COUNTERS NLIN=0 10 READ(INUNIT,'(A)',END=999) LINE NLIN=NLIN+1 C ATOMS IF (LINE(1:4) .EQ. 'ATOM') THEN IA=IA+1 IR=IR+1 IF (IA .GT. MXATM) OVERFLOW=.TRUE. IF (IR .GT. MXRES) OVERFLOW=.TRUE. IF (OVERFLOW) THEN IA=IA-1 IR=IR-1 WRITE(6,*)'***GETCOOR: CORE OVERFLOW FOR MXATM OR MXRES' WRITE(6,*)' MXATM,IA, MXRES,IR',MXATM,IA,MXRES,IR WRITE(6,*)' MOLECULE TRUNCATED' GOTO 999 ENDIF C MAIN INPUT C EXAMPLE FROM 3PTI: C REAL FIELDS: 111111112222222233333333 C TOM 101 N PRO 13 12.250 12.909 15.223 1.00 0.00 3PTI 160 C TOM 102 CA PRO 13 11.486 11.965 16.047 1.00 0.00 3PTI 161 C... :....1....:....2....:....3....:....4....:....5....:....6....:....7....:....8 CIDATM(IA)=LINE(13:16) ALT=LINE(17:17) SEQ=LINE(18:20) CIDRES(IR)=LINE(22:27) READ(LINE,'(30X,3F8.3)')(R(K,IA),K=1,3) C SKIP ALTERNATE ATOM POSITIONS IF ( ALT .NE. ' ' .AND. IA .NE. 1 .AND. + CIDATM(IA) .EQ. CIDATM(IA-1) ) THEN WRITE(6,'(A,I5,1X,A4,A1,A3,1X,A6,3X,3F8.3)') + 'GETCOOR ALTERNATE ATOM IGNORED: ', + IA,CIDATM(IA),ALT,SEQ,CIDRES(IR),(R(K,IA),K=1,3) IA=IA-1 IR=IR-1 GOTO 10 ENDIF calt ignore ace residue IF (SEQ .EQ. 'ACE' ) THEN IA=IA-1 IR=IR-1 WRITE(6,*)'GETCOOR: ACE ignored at res ',ir GOTO 10 ENDIF c set atom pointer IPATMRES(1,IR)=IA IF (IR .NE. 1) IPATMRES(2,IR-1)=IA-1 c is it a new residue ? IF (IR .NE. 1) THEN IF ( CIDRES(IR-1) .EQ. CIDRES(IR) ) IR=IR-1 ENDIF c now valid ir and ia - stash away IPRESATM(IA)=IR IF (CIDATM(IA) .EQ. ' CA ') THEN IPATMRES(3,IR)=IA DO K=1,3 RCA(K,IR)=R(K,IA) ENDDO ENDIF ELSE IF (LINE(1:4) .NE. 'ATOM' ) THEN IF (LINE(1:4) .EQ. 'HEAD'.AND.CIDPROT(1).EQ.' ')CIDPROT(1)=LINE IF (LINE(1:4) .EQ. 'COMP'.AND.CIDPROT(2).EQ.' ')CIDPROT(2)=LINE IF (LINE(1:4) .EQ. 'SOUR'.AND.CIDPROT(3).EQ.' ')CIDPROT(3)=LINE IF (LINE(1:4) .EQ. 'AUTH'.AND.CIDPROT(4).EQ.' ')CIDPROT(4)=LINE IF ( INDEX(LINE,'RESOLUTION') .NE. 0 .AND. + CIDPROT(5).EQ.' ') THEN CIDPROT(5)=LINE ENDIF IF (LINE(1:3) .EQ. 'TER') THEN IR=IR+1 SEQ='---' ENDIF ENDIF c next line GOTO 10 c end of file 999 IR=IR-1 NATM=IA NRES=IR IPATMRES(2,NRES)=NATM CLOSE(INUNIT) WRITE(6,*)'CLOSED: ',INFILE(1:40) WRITE(6,'(a,3(i5,a))')' exit getcoor:',nres,' residues', + natm,' atoms',nlin,' lines' RETURN END C END GETCOORFORHSSP C...................................................................... C...................................................................... C SUB GETDSSPFORHSSP SUBROUTINE GETDSSPFORHSSP(IN,FILE,MAXSQ,CHAINREMARK,PROT, + HEAD,COMP,SOURCE,AUTHOR,NRES,LRES,NCHAIN,KCHAIN,PDBNO, + PDBCHAINID,PDBSEQ,SECSTR,COLS,BP1,BP2,SHEETLABEL,ACC) c reads header etc from files of type dssp. modified getdssp rs dez 88. c reads dssp-data as line of length 38 (no h-bond-data) INTEGER IN,MAXSQ CHARACTER*(*) FILE,PROT,COMP,HEAD,SOURCE,AUTHOR,CHAINREMARK CHARACTER PDBSEQ(*) CHARACTER*(*) PDBCHAINID(*),SECSTR(*) CHARACTER*1 SHEETLABEL(*) C LENGHT*7 CHARACTER*7 COLS(*) INTEGER PDBNO(*),BP1(*),BP2(*),ACC(*) C INTERNAL PARAMETER (MAXCHAIN= 100) CHARACTER CHAINMODE*20,CHAINID(MAXCHAIN) CHARACTER LINE*200,TEMPNAME*124 LOGICAL ERRFLAG,LKEEP,LCHAIN(MAXCHAIN) *----------------------------------------------------------------------* C INIT NSELECT=1 TEMPNAME=' ' I=INDEX(CHAINREMARK,'_!_') IF (I.NE.0) THEN TEMPNAME(1:)=FILE(1:I-1) ELSE TEMPNAME(1:)=FILE(1:) ENDIF CALL OPEN_FILE(IN,TEMPNAME,'READONLY,OLD',ERRFLAG) IF (ERRFLAG)GOTO 999 C GET PROTEIN IDENTIFIER, HEADER AND COMPOUND etc DO LL=1,3 READ(IN,'(A200)',END=777,ERR=999) LINE ENDDO PROT=LINE(63:66) PROT=LINE(63:66) HEAD=LINE(11:50) READ(IN,'(A200)',END=777,ERR=999)LINE COMP=LINE(11:) READ(IN,'(A200)',END=777,ERR=999)LINE SOURCE=LINE(11:) READ(IN,'(A200)',END=777,ERR=999)LINE AUTHOR=LINE(11:) C...........FIND SEQUENCE......... 70 READ(IN,'(A200)',END=777,ERR=999)LINE IF (INDEX(LINE(1:5),'#').EQ.0) GOTO 70 CD WRITE(6,*)' # found sequence ' C............READ STRUCTURE......... C...:....1....:....2....:....3....:... C # RESIDUE AA STRUCTURE BP1 BP2 ACC C 22 36 A S E > -I 24 0C 60 C DO I=1,MAXCHAIN LCHAIN(I)=.TRUE. ENDDO I=INDEX(CHAINREMARK,'!') C RS 90 C extract selected chains C fx: $pdb:4hhb.dssp_!_1,2 C or: $pdb:4hhb.dssp_!_A IF (I.NE.0) THEN DO J=1,MAXCHAIN LCHAIN(J)=.FALSE. ENDDO NSELECT=1 CALL STRPOS(CHAINREMARK,ISTART,ISTOP) DO J=ISTOP,I+1,-1 IF (CHAINREMARK(J:J).EQ.',')NSELECT=NSELECT+1 ENDDO CHAINMODE='CHARACTER' c WRITE(6,*)' WILL READ CHAINS ACCORDING TO CHARACTER' ISTART=INDEX(CHAINREMARK,'!')+2 DO J=1,NSELECT READ(CHAINREMARK(ISTART:),'(A1)')CHAINID(J) CALL LOWTOUP(CHAINID(J),1) ISTART=ISTART+2 ENDDO c WRITE(6,*)' GETDSSPFORHSSP: extract the chain(s)' c DO J=1,NSELECT c WRITE(6,*)' CHAIN: ',CHAINID(J) c ENDDO ELSE CHAINMODE='NONE' IF (KCHAIN.NE.0) THEN WRITE(6,*)' will extract chain number: ',KCHAIN ENDIF DO J=1,MAXCHAIN LCHAIN(J)=.TRUE. ENDDO ENDIF I=1 NCHAIN=1 NPICK=0 80 READ(IN,'(A38)',END=777,ERR=999)LINE LKEEP=.FALSE. IF (LINE(14:14).EQ.'!') THEN NCHAIN=NCHAIN+1 ELSE IF (KCHAIN.EQ.NCHAIN)LKEEP=.TRUE. ENDIF C KCHAIN=0 => all chains IF (KCHAIN.EQ.0)LKEEP=.TRUE. C if chains are identified by filename IF (CHAINMODE.EQ.'NUMBER') THEN IF (LCHAIN(NCHAIN)) THEN C if the first chain wanted is not the first chain in DSSP-file, skip C the first position ('!') IF (NPICK.EQ.0) THEN IF (LINE(14:14).EQ.'!') THEN LKEEP=.FALSE. ENDIF ELSE LKEEP=.TRUE. ENDIF NPICK=1 ELSE LKEEP=.FALSE. ENDIF ELSE IF (CHAINMODE.EQ.'CHARACTER') THEN LKEEP=.FALSE. IF (LINE(14:14).EQ.'!') THEN IF (NPICK.EQ.0) THEN LKEEP=.FALSE. ELSE LKEEP=.TRUE. ENDIF ELSE CALL LOWTOUP(LINE(12:12),1) DO JCHAIN=1,NSELECT IF (CHAINID(JCHAIN).EQ. LINE(12:12)) THEN LKEEP=.TRUE. NPICK=1 ENDIF ENDDO IF (.NOT. LKEEP .AND. I.GT.1) THEN IF (pdbseq(i-1).EQ.'!')I=I-1 ENDIF ENDIF ENDIF c pdbno,chainid,dsspseq,secstr,cols,bp1,bp2,sheetlabel,acc IF (LKEEP) THEN READ(LINE,'(6x,I4,1X,A1,1X,A1,2X,A1,1X,A7,I4,I4,A1,I4)', + END=777,ERR=999)pdbno(i),pdbchainid(i),pdbseq(i), + secstr(i),cols(i)(1:7),bp1(i),bp2(i),sheetlabel(i), + acc(i) I=I+1 CALL CHECKRANGE (I,1,MAXSQ,'MAXSQ','GETDSSP ') ENDIF GOTO 80 C...............done.................. 777 NRES=I-1 c WRITE(6,*) NRES,' RESIDUES READ IN GETDSSPFORHSSP ' IF (NRES.LE.0) THEN PROT=' ' HEAD=' ' COMP=' ' SOURCE=' ' AUTHOR=' ' ENDIF C.......DO NOT COUNT CHAIN BREAKS... LRES=NRES KCHAIN=1 DO I=1,NRES IF (pdbseq(i).EQ.'!') THEN LRES=LRES-1 KCHAIN=KCHAIN+1 ENDIF ENDDO c WRITE(6,*) LRES,' RESIDUES ',NRES,' POSITIONS ' CLOSE(IN) RETURN 999 WRITE(6,*)' *** READ ERROR ***' NRES=0 PROT=' ' HEAD=' ' COMP=' ' SOURCE=' ' AUTHOR=' ' RETURN END C END GETDSSPFORHSSP C...................................................................... C...................................................................... C SUB GETHSSPCUT SUBROUTINE GETHSSPCUT(KIN,MAXSTEP,INFILE,ISOLEN,ISOIDE,NSTEP) C RS 89 C read in isosignificance data from file C C............................................................. C* isosignificance data / 70% secondary structure identity C* a "*" indicates a comment line C* alignments longer than the length specified in the last line C* have the same cutoff C* format=(2X,I4,7X,F7.2) C*.1234..... 1234567 C* length minimum % sequence identity <===== start-line C 10 67.41 C 20 50.22 C .. .. C> 200 24.53 C............................................................. IMPLICIT NONE INTEGER MAXSTEP,KIN,I CHARACTER*(*) INFILE INTEGER ISOLEN(MAXSTEP),NSTEP REAL ISOIDE(MAXSTEP) LOGICAL LERROR CHARACTER LINE*200 CALL OPEN_FILE(KIN,INFILE,'READONLY,OLD',LERROR) 10 READ(KIN,'(A)',ERR=999)LINE WRITE(6,*)LINE CALL LOWTOUP(LINE,200) IF (INDEX(LINE,'* LENGTH') .EQ. 0) GOTO 10 I=1 20 READ(KIN,'(2X,I4,7X,F7.2)',END=888)ISOLEN(I),ISOIDE(I) I=I+1 IF (I .GT. MAXSTEP) THEN WRITE(6,*)' GETHSSPCUT: maxstep overflow: ',maxstep ENDIF GOTO 20 888 NSTEP=I-1 WRITE(6,*)' GETHSSPCUT: ',nstep,' steps ' cd do i=1,nstep cd WRITE(6,*)isolen(i),isoide(i) cd enddo CLOSE(KIN) RETURN 999 WRITE(6,*)' GETHSSPCUT: ERROR READING ',INFILE CLOSE(KIN) STOP END C END GETHSSPCUT C...................................................................... C...................................................................... C SUB GETINT SUBROUTINE GETINT(KINT,INTARR,CTEXT) C by Chris Sander, June 1985, Feb 1986, June 1987, RS89 C For interactive use via terminal. C Prompts for KINT integers from input unit *. C Returns new values in INTARR(1..KINT) C Offers previous values as default. CUG INTEGER LINELEN PARAMETER (LINELEN= 200) CHARACTER*(LINELEN) LINE CHARACTER*(*) CTEXT INTEGER INTARR(*) LOGICAL EMPTYSTRING CUG INTEGER NUMSTART CHARACTER*20 CTEMP *----------------------------------------------------------------------* WRITE(6,*) WRITE(6,*)'===================================================='// + '===========================' CALL WRITELINES(CTEXT) IF (KINT.LT.1.OR.KINT.GT.100) THEN WRITE(6,*)'*** INTPROMPT: KINT no good',KINT RETURN ENDIF 10 WRITE(6,*) WRITE(6,'(2X,''Default: '',10I4)') (INTARR(K),K=1,KINT) IF (KINT.GT.1) THEN WRITE(6,'(2X,''Enter'',I3,'' integers [CR=default]: '')')KINT ELSE WRITE(6,'(2X,''Enter one integer [CR=default]: '')') ENDIF LINE=' ' READ(*,'(A200)',ERR=10,END=11) LINE IF (.NOT.EMPTYSTRING(LINE)) THEN C remove comments ( 34535345 !$ comment ) KCOMMENT=INDEX(LINE,'!$') IF (KCOMMENT.NE.0) LINE(KCOMMENT:linelen)=' ' C check for legal string DO I=1,linelen IF (INDEX(' ,+-0123456789',LINE(I:I)).EQ.0) THEN WRITE(6,'(2X,''*** not an integer: '',A40)') LINE(1:40) GOTO 10 ENDIF ENDDO CALL STRPOS(LINE,ISTART,ISTOP) DO INUM = 1,KINT CALL GETTOKEN(LINE,LINELEN,INUM,NUMSTART,CTEMP) CALL RIGHTADJUST(CTEMP,1,20) READ(CTEMP,'(I20)') INTARR(INUM) ENDDO ENDIF 11 WRITE(6,'(2X,'' echo:'',10I4)') (INTARR(K),K=1,KINT) RETURN END C END GETINT C...................................................................... C...................................................................... C SUB GETINDEX SUBROUTINE GETINDEX(CTEST,STRINGPOS,IPOS) C get index of ctest in cstring INTEGER STRINGPOS(*),IPOS CHARACTER CTEST I=ICHAR(CTEST) IPOS=STRINGPOS(I) c if (ipos .eq. 0) then c WRITE(6,*)' WARNING: UNKNOWN character: ',ctest c endif RETURN END C END GETINDEX C...................................................................... C...................................................................... C SUB GETPIDCODE SUBROUTINE GETPIDCODE(FILENAME,PID) C extract protein ID from file name CHARACTER*(*) FILENAME, PID CHARACTER NAME*500,TEMPNAME*500 C PID=' ' TEMPNAME=' ' CALL STRPOS(FILENAME,ISTART,IEND) IF (IEND .GT. LEN(TEMPNAME)) THEN WRITE(6,*)' ERROR in GETPIDCODE' WRITE(6,*)' tempname variable too short' STOP ENDIF TEMPNAME(1:IEND)=FILENAME(1:IEND) CALL LOWTOUP(TEMPNAME,IEND) NAME=FILENAME(ISTART:IEND) C DO IR=IEND,1,-1 IF (TEMPNAME(IR:IR) .EQ. '.') then IEND=IR-1 GOTO 111 ENDIF ENDDO 111 TEMPNAME=' ' DO IL=IEND,ISTART,-1 IF ((NAME(IL:IL) .EQ. '/') .OR. (NAME(IL:IL) .EQ. ':') + .OR. (NAME(IL:IL) .EQ. ']') ) THEN ISTART=IL+1 GOTO 222 ENDIF ENDDO 222 PID(1:)=FILENAME(ISTART:IEND) c444 il=index(name(:ir),'.') c if (il .gt. 0) then c name(il:il)='|' c goto 444 c else c goto 555 c endif c 555 if (iend .gt. len(pid)) then c WRITE(6,*)' ERROR in GETPIDCODE' c WRITE(6,*)' pid variable too short' c STOP c endif c PID=NAME(:IR) RETURN END C END GETPIDCODE C...................................................................... C...................................................................... C SUB GETPOS SUBROUTINE GETPOS(CSTRING,STRINGPOS,N) C RS JAN 90 C store ASCII code of cstring in array stringpos INTEGER STRINGPOS(*),N CHARACTER*(*) CSTRING DO I=1,N STRINGPOS(I)=0 ENDDO ILEN=LEN(CSTRING) DO I=1,ILEN J=ICHAR(CSTRING(I:I)) STRINGPOS(J)=I ENDDO RETURN END C END GETPOS C...................................................................... C...................................................................... C SUB GETSEQ SUBROUTINE GETSEQ(IN,NDIM,NRES,CRESID,CSQ,STRUC,KACC, + LDSSP,FILENAME,COMPND,ACCESSION,CDUMMY,IOP,TRANS,NTRANS, + KCHAIN,NCHAIN,CCHAIN) C RS 89 changed to read from PDB-file (used in MAXHOM) C by Chris Sander, 1982 and later C and Brigitte Altenberg, 1987 and later C GET SEQUENCE FROM DSSP-FILE, HSSP SWISSPROT....OR FREE FORMAT FILE. CAUTION: used by MAXHOM, PUZZLE, WINDOW-DNA (?), SEG-PRED (?) etc. C C NDIM - MAX SPACE IN SEQUENCE ARRAY C NREAD - NUMBER OF RESIDUES READ C NRES - NUMBER OF RESIDUES PASSED ON C IN - LOGICAL UNIT NUMBER OF SEQ FILE C IOP - LOGICAL UNIT NUMBER OF OUTPUT FILE C KCHAIN - KCHAINTH CHAIN WANTED (K=0 => ALL CHAINS,K<>0 => KTH C CHAIN) BUT ONLY IF "_!_A,B" IS NOT SPECIFIED !! C NCHAIN - NUMBER OF CHAINS IN *.DSSP DATA-SET C CCHAIN - NAME OF CHAIN C LCHAIN() - true if 'x' chain wanted PARAMETER (MAXCHAIN= 40) PARAMETER (MAXRECLEN= 200) CHARACTER LOWER*26,PUNCTUATION*10,FORMATNAME*4 CHARACTER TRANS*26,CS*1,CC*1 CHARACTER LINE*(MAXRECLEN) cx character*200 FILENAME CHARACTER*(*) FILENAME C compound for DSSP CHARACTER*(*) COMPND C accession number and dummy string (fx. pdb-pointer from swissprot) CHARACTER*(*) ACCESSION,CDUMMY CHARACTER*1 CSQ(*),STRUC(*),CH,CCHAIN CHARACTER*6 CRESID(*),CR LOGICAL TRUNCATED,ERRFLAG,LKEEP,LCHAIN(MAXCHAIN) LOGICAL LDSSP,LACCZERO,LHSSP INTEGER KACC(*),KCHAIN INTEGER IOP C INTERNAL CHARACTER CTEST*1,CHAINMODE*20,CHAINID(MAXCHAIN)*1 LOGICAL LCHAINBREAK,LEGALRES CHARACTER*100 CTEMP C dont use INDEX command (CPU time) INTEGER NASCII PARAMETER (NASCII= 256) INTEGER TRANSPOS(NASCII) C read from BRK C CHARACTER SEQ(10000)*3,CIDRES(10000)*6 CHARACTER SEQ(9999)*3,CIDRES(9999)*6 C====================================================================== IEND=0 ISEQLEN=0 ISTART=0 ISTOP=0 LOWER='abcdefghijklmnopqrstuvwxyz' LDSSP=.FALSE. LHSSP=.FALSE. IF (IOP.NE.0)WRITE(IOP,*)FILENAME CDUMMY=' ' ACCESSION=' ' LINE=' ' CAUTION.. recommendation: C calling program should allow "!" as legal residue for DSSP format C *BA* IF (NTRANS.EQ.0) THEN WRITE(6,*)'GETSEQ: NTRANS was 0 !!!!' NTRANS=26 TRANS='GAVLISTDENQKHRFYWCMPBZX!-.' WRITE(6,*)'GETSEQ: TRANS set to:', TRANS ENDIF IF (NTRANS.GT.26) THEN WRITE(6,*)'trans:#',TRANS,'# ntrans:',NTRANS STOP'GETSEQ ERROR *** NTRANS.GT.26' ENDIF L=INDEX(TRANS(1:NTRANS),'-') IF (L.EQ.0) THEN WRITE (*,*)'GETSEQ: WARNING: Trans must include"-" ' ENDIF CALL GETPOS(TRANS,TRANSPOS,NASCII) C *BA*BEGIN NRES=0 C......................defaults........ C in general, only blanks are allowed PUNCTUATION=' ' DO I=1,NDIM KACC(I)=0 C implies that unknown residues are named - CSQ(I)='-' C undefined STRUC(I)='U' C *BA*END ENDDO COMPND=FILENAME C read only the kth chain *BA* C NAME OF CHAIN CCHAIN=' ' C CHAIN COUNTER NCHAIN=1 C RES LINE COUNTER NRESLINE=0 NSELECT=0 CALL strpos(FILENAME,i,LENFILNAM) WRITE(6,*) 'GETSEQ: ', FILENAME(1:LENFILNAM) IF (LENFILNAM .LE. 1) THEN WRITE(6,*)'GETSEQ: *** empty file name, return with NRES=0' RETURN ENDIF I=INDEX(FILENAME,'_!_') C RS 90 C extract selected chains C fx: $pdb:4hhb.dssp_!_1,2 IF (I.NE.0) THEN DO J=1,MAXCHAIN LCHAIN(J)=.FALSE. ENDDO NSELECT=1 IEND=LEN(FILENAME) DO J=IEND,I+1,-1 IF (FILENAME(J:J).EQ.',')NSELECT=NSELECT+1 ENDDO ISTART=INDEX(FILENAME,'!_')+2 READ(FILENAME(ISTART:ISTART),'(A1)')CTEST IF (INDEX('1234567890',CTEST).NE.0) THEN CHAINMODE='NUMBER' WRITE(6,*)' WILL READ CHAINS ACCORDING TO NUMBER' ELSE CHAINMODE='CHARACTER' WRITE(6,*)' WILL READ CHAINS ACCORDING TO CHARACTER' ENDIF DO J=1,NSELECT IF (CHAINMODE.EQ.'NUMBER') THEN CALL READ_INT_FROM_STRING(FILENAME(ISTART:),K) IF (K.GT.0 .AND. K.LE.MAXCHAIN) THEN LCHAIN(K)=.TRUE. ELSE WRITE(6,*)'*** ERROR: K<1 OR K>MAXCHAIN IN GETSEQ' STOP ENDIF ELSE READ(FILENAME(ISTART:ISTART),'(A1)')CHAINID(J) CALL LOWTOUP(CHAINID(J),1) ENDIF ISTART=ISTART+2 ENDDO WRITE(6,*)' **** GETSEQ: extract the chain(s)' IF (CHAINMODE.EQ.'NUMBER') THEN DO J=1,MAXCHAIN IF (LCHAIN(J))WRITE(6,*)' CHAIN: ',J ENDDO ELSE DO J=1,NSELECT WRITE(6,*)' CHAIN: ',CHAINID(J) ENDDO ENDIF ISTOP=INDEX(FILENAME,'_!')-1 FILENAME=FILENAME(1:ISTOP) ELSE CHAINMODE='NONE' IF (KCHAIN.NE.0) THEN WRITE(6,*)' will extract chain number: ',KCHAIN ENDIF DO J=1,MAXCHAIN LCHAIN(J)=.TRUE. ENDDO ENDIF C *BA*BEGIN CALL CHECKFORMAT(IN,FILENAME,FORMATNAME,ERRFLAG) c WRITE(6,*) ' GETSEQ: format is ',FORMATNAME IF (INDEX(FORMATNAME,'DSSP').NE.0) THEN LDSSP=.TRUE. ENDIF IF (INDEX(FORMATNAME,'HSSP').NE.0) THEN LHSSP=.TRUE. ENDIF IF (ERRFLAG) THEN WRITE(6,*)'GETSEQ: file open error, set NRES=0 and return' WRITE(6,*)'filename: ', FILENAME RETURN ENDIF CTEMP=' ' write(ctemp,'(a,i5)')'READONLY,OLD,RECL=',maxreclen CALL OPEN_FILE(IN,FILENAME,ctemp,ERRFLAG) C *BA*END IF (FORMATNAME.EQ.'DSSP') GOTO 100 IF (FORMATNAME.EQ.'BRK ') GOTO 200 IF (FORMATNAME.EQ.'PIR ') GOTO 300 IF (FORMATNAME.EQ.'EMBL') GOTO 400 IF (FORMATNAME.EQ.'GCG ') GOTO 500 IF (FORMATNAME.EQ.'UWGC') GOTO 600 IF (FORMATNAME.EQ.'HSSP') GOTO 700 C--------------NOT DSSP----NOT PIR----NOT EMBL--NOT GCG---------------- C--------------simple STAR FORMAT, probably DO WHILE(.TRUE.) READ(IN,'(A)',END=900) LINE IF (LINE(1:1).EQ.'*') THEN IF (IOP.NE.0)WRITE(IOP,*) LINE C NOT A COMMENT LINE ELSE CALL STRPOS(LINE,IBEG,IEND) DO J=1,IEND CS=LINE(J:J) CALL GETINDEX(CS,TRANSPOS,I) C star format allows chainbreak IF ( .NOT. LCHAINBREAK(CS,NRES+1) .AND. I.NE.0) THEN NRES=NRES+1 IF (NRES.LE.NDIM) THEN CSQ(NRES)=CS ELSE WRITE(IOP,'(A)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***' WRITE(6,'(A)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***' ENDIF ENDIF C J, CHARACTERS IN LINE ENDDO C COMMENT OR SEQUENCE LINE ENDIF C NEXT LINE ENDDO C-------------------------------READ FROM :DSSP----------------- C ** SECONDARY STRUCTURE DEFINITION BY THE PROGRAM DSSP, \\ C VERSION OCT. 1985 C FERENCE W. KABSCH AND C.SANDER, BIOPOLYMERS 22 (1983) 2577-2637 C ADER PANCREATIC HORMONE 16-JAN-81 1PPT C MPND AVIAN PANCREATIC POLYPEPTIDE 100 READ(IN,'(A124)',END=199)LINE IF (INDEX(LINE,'SECONDARY').EQ.0) THEN WRITE(6,*)'***GETAASEQ ERROR: DSSP file assumed, but...' WRITE(6,*)' the word /SECONDARY/ is missing in first line' RETURN ENDIF C reference - ignore READ(IN,'(A)',END=199)LINE C header READ(IN,'(A)',END=199)LINE C* LINE='*'//LINE IF (IOP.NE.0)WRITE(IOP,*)LINE C compnd READ(IN,'(A)',END=199)LINE C* LINE='*'//LINE IF (IOP.NE.0)WRITE(IOP,*)LINE COMPND=LINE(11:200) C C C repeat until # 105 READ(IN,'(A)',END=199)LINE IF (INDEX(LINE(1:5),'#').EQ.0) GOTO 105 C C23456123451c1cc1 Ccccccaaaaaacaccacccccccccccccccccciii C 9 9 A S E -aB 35 15A 0 24,-2.3 27,-2.9 -2,-0.4 28,-0.5 -0.939 14.7-175.8-120.8 141.0 -5.5 9.8 13.0 C 21 21 Y E -AB 32 45A 68 24,-3.1 24,-2.9 -2,-0.3 C DSSP: seqstr acc hbonds C NPICK=0 DO WHILE (.TRUE.) READ(IN,'(6X,A5,A1,1X,A1,2X,A1,18X,I3)',END=900) + CR(1:5),CH,CS,CC,IACC C Res line counter. Note: NRES = # of res passed NRESLINE=NRESLINE+1 LKEEP=.FALSE. C ......CONVERT SS-BRIDGES TO 'C'.... IF (INDEX(LOWER,CS).NE.0) CS='C' IF (NRES.LT.NDIM) THEN C incr.chains *BA* IF (LCHAINBREAK(CS,NRESLINE)) THEN NCHAIN=NCHAIN+1 ELSE IF (KCHAIN.EQ.NCHAIN)LKEEP=.TRUE. ENDIF C KCHAIN=0 => all chains IF (KCHAIN.EQ.0)LKEEP=.TRUE. C if chains are identified by filename IF (CHAINMODE.EQ.'NUMBER') THEN IF (LCHAIN(NCHAIN)) THEN C if the first chain wanted is not the first chain in DSSP-file, skip C the first position ('!') IF (NPICK.EQ.0) THEN IF (LCHAINBREAK(CS,NRESLINE)) THEN LKEEP=.FALSE. ENDIF ELSE LKEEP=.TRUE. ENDIF NPICK=1 ELSE LKEEP=.FALSE. ENDIF ELSE IF (CHAINMODE.EQ.'CHARACTER') THEN LKEEP=.FALSE. IF (LCHAINBREAK(CS,NRESLINE)) THEN IF (NPICK.EQ.0) THEN LKEEP=.FALSE. ELSE LKEEP=.TRUE. ENDIF ELSE CALL LOWTOUP(CH,1) DO JCHAIN=1,NSELECT IF (CHAINID(JCHAIN).EQ.CH) THEN LKEEP=.TRUE. NPICK=1 ENDIF ENDDO IF (.NOT.LKEEP) THEN IF (CSQ(NRES).EQ.'!')NRES=NRES-1 ENDIF ENDIF ENDIF C keep only the kth chain IF (LKEEP) THEN CALL GETINDEX(CS,TRANSPOS,I) IF (I .NE. 0) THEN CAUTION: change here (or in SEQTOINT) to L=0 implying chain break C CAUTION: INCREMENT ONLY OF LEGAL AA OR - OR ! NRES=NRES+1 CRESID(NRES)=CR(1:5)//CH CSQ(NRES)=CS KACC(NRES)=IACC CCHAIN=CH STRUC(NRES)=CC c WRITE(6,*)cchain C ### ILLegal RESIDUES ENDIF C CHAINS WANTED ENDIF C DIMENSION OVERFLOW ELSE WRITE(IOP,'(A)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***' WRITE(6,'(A,I10)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***', + MAXSQ GOTO 900 ENDIF C NEXT LINE ENDDO C--------------DSSP read error ----------------------------------- 199 WRITE(6,*)'***GETAASEQ: incomplete DSSP file (EOF) ' NRES=0 NCHAIN=0 CALL STRPOS(FILENAME,I,LENFILNAM) WRITE(6,*) 'FILE: ',FILENAME(1:LENFILNAM) CLOSE(IN) RETURN C----------------READ FROM BROOKHAVEN-------------------------------- 200 READ(IN,'(A)',END=900,ERR=999)LINE IF (INDEX(LINE,'HEADER').EQ.0) THEN WRITE(6,*)'***GETAASEQ ERROR: BRK file assumed, but...' WRITE(6,*)' the word /HEADER/ is missing in first line' RETURN ENDIF IF (IOP.NE.0)WRITE(IOP,*)LINE(1:200) C compnd READ(IN,'(A)',END=900,ERR=999)LINE IF (IOP.NE.0)WRITE(IOP,*)LINE(1:200) COMPND=LINE(1:200) C read only the kth chain C NAME OF CHAIN CCHAIN=' ' C CHAIN COUNTER NCHAIN=1 C RES LINE COUNTER NRESLINE=0 NRES=0 210 READ(IN,'(A)',END=280,ERR=999)LINE NRESLINE=NRESLINE+1 IF (LINE(1:4).EQ.'ATOM') THEN C if chains are identified by filename IF (CHAINMODE.EQ.'CHARACTER') THEN LKEEP=.FALSE. DO J=1,NSELECT IF (CHAINID(J).EQ.LINE(22:22))LKEEP=.TRUE. ENDDO ELSE LKEEP=.TRUE. ENDIF IF (LKEEP) THEN IF (NRES.LE.NDIM) THEN NRES=NRES+1 SEQ(NRES)=LINE(18:20) CIDRES(NRES)=LINE(22:27) IF (SEQ(NRES).EQ.'ACE') THEN NRES=NRES-1 WRITE(6,*)' GETAASEQ: ACE ignored at res ',NRES GOTO 210 ENDIF IF (NRES.NE.1) THEN IF (CIDRES(NRES-1).EQ.CIDRES(NRES))NRES=NRES-1 ENDIF ELSE WRITE(IOP,'(A,I10)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***', + MAXSQ WRITE(6,'(A,I10)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***', + MAXSQ GOTO 900 ENDIF ENDIF ELSE IF (LINE(1:3).EQ.'TER') THEN IF (NRES.NE.0) THEN NRES=NRES+1 SEQ(NRES)='!!!' ENDIF ENDIF GOTO 210 280 CALL S3TOS1(SEQ,CSQ,NRES) 290 IF (SEQ(NRES).NE.'!!!') THEN GOTO 900 ELSE SEQ(NRES)=' ' CIDRES(NRES)=' ' NRES=NRES-1 GOTO 290 ENDIF C====== C---------------------------READ FROM :PIR--------------------*BA*BEGIN C C 300 CONTINUE PUNCTUATION=',.:;()+ ' C Header line 1, ignore READ (IN,'(A)',END=999)LINE C Header line 2, ignore READ (IN,'(A)',END=999)LINE c IF (INDEX(LINE,'>').NE.0) THEN C THERE are TWO SPECIAL LINES c READ (IN,'(A)',END=999)LINE c ENDIF CALL STRPOS(LINE,IBEG,IEND) LINE(1:)='*'//LINE(IBEG:IEND) C WRITE HEADER IF (IOP.NE.0) then WRITE(IOP,*)LINE(ibeg:iend) ENDIF DO WHILE(.TRUE.) C IN THE NEXT LINES ARE RESIDUES READ (IN,'(A)',END=900)LINE CALL STRPOS(LINE,IBEG,IEND) DO J=1,IEND CS=LINE(J:J) IF (CS.EQ.'*') GOTO 900 IF (LEGALRES(CS,NRES,TRANS,NTRANS,PUNCTUATION)) THEN NRES=NRES+1 C CHECK FOR OVERFLOW IF (NRES.LE.NDIM) THEN CSQ(NRES)=CS C OVERFLOW ELSE WRITE(IOP,'(A)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***' WRITE(6,'(A)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***' GOTO 900 ENDIF C LEGAL RESIDUES ENDIF C NEXT RESIDUE ENDDO C NEXT LINE ENDDO C---------------------------READ FROM :EMBL ---------------------------- 400 CONTINUE ID=0 DO WHILE (.TRUE.) READ (IN,'(A)',END=999)LINE C LOOK FOR ACCESSION NUMBER AND TAKE THE FIRST ONE IF (INDEX(LINE(:2),'AC').NE.0) THEN I=INDEX(LINE,';')-1 ACCESSION(1:)=LINE(6:I) ENDIF C LOOK FOR DEFINITION IF (INDEX(LINE(:2),'DE').NE.0) THEN COMPND=' ' COMPND(1:74)=LINE(6:79) C WRITE ONLY DEFINITION IF (IOP.NE.0)WRITE (IOP,*)LINE GOTO 410 ENDIF ENDDO 410 DO WHILE (.TRUE.) C LOOK FOR LINE BEGINNING READ (IN,'(A)',END=999)LINE C WITH "SQ" IF (INDEX(LINE(:2),'SQ').NE.0) THEN GOTO 420 C*RS 89 C look for PDB-database pointer and store them in CDUMMY ELSE IF (INDEX(LINE(:2),'DR').NE.0 .AND. + INDEX(LINE,'PDB;').NE.0) THEN CALL STRPOS(CDUMMY,ISTART,ISTOP) CALL STRPOS(LINE,JSTART,JSTOP) IF (ISTOP+JSTOP+10 .LE. LEN(CDUMMY) ) THEN IF (ID .LE. 0) THEN CDUMMY(ISTOP+1:)=LINE(10:JSTOP) ELSE CDUMMY(ISTOP+1:)='|'//LINE(10:JSTOP) ENDIF ID=ID+1 ELSE WRITE(6,*)'**** PDBREF-LINE DIMENSION OVERFLOW ***' ENDIF ENDIF ENDDO 420 CALL STRPOS(CDUMMY,ISTART,ISTOP) IF (ID .GT. 0) THEN IF ( (ISTOP+7) .LE. LEN(CDUMMY) ) THEN WRITE(CDUMMY(ISTOP+1:),'(A,I4)')'||',ID ELSE WRITE(6,*)'**** PDBREF-LINE DIMENSION OVERFLOW ***' ENDIF ENDIF DO WHILE (.TRUE.) C SEQUENCES NEXT LINE READ (IN,'(A)',ERR=999,END=900) LINE C END OF ROUTINE IF (INDEX(LINE(:2),'//').NE.0) GOTO 900 C NO MORE TEXT ALLOWED IF (INDEX(LINE(:2),' ').NE.0) THEN CALL STRPOS(LINE,IBEG,IEND) DO J=1,iend CS=LINE (J:J) CALL GETINDEX(CS,TRANSPOS,I) IF (I .NE. 0) THEN NRES=NRES+1 C CHECK FOR OVERFLOW IF (NRES.LE.NDIM) THEN CSQ(NRES)=CS C OVERFLOW ELSE WRITE(IOP,'(A)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***' WRITE(6,'(A,I10)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ=', + MAXSQ GOTO 900 ENDIF C LEGAL RESIDUES ENDIF C NEXT RESIDUE ENDDO C NO TEXT BETWEEN THE LINES ENDIF C NEXT LINE ENDDO C------------------------END EMBL-READING------------------------------ C------------------------READ FROM:GCG-FORMAT-------------------------- Cold500 DO WHILE (.TRUE.) Cold READ (IN,'(A124)',END=999)LINE Cold IF (INDEX(LINE(:2),'ID').NE.0) THEN Cold IF (IOP.NE.0)WRITE (IOP,*)LINE Cold GOTO 510 Cold ENDIF Cold ENDDO 500 DO WHILE (.TRUE.) C GET SEQUENCE WHILE READ (IN,'(A)',END=999)LINE IF (IOP.NE.0)WRITE (IOP,*)LINE IF (INDEX(LINE,'..').NE.0)GOTO 520 ENDDO 520 DO WHILE (.TRUE.) C GET THE SEQUENCES READ (IN,'(A)',ERR=999,END=900)LINE CALL STRPOS(LINE,IBEG,IEND) DO J=1,IEND CS=LINE (J:J) C CHECK FOR LEGAL RESIDUES CALL GETINDEX(CS,TRANSPOS,I) IF (I .NE.0) THEN NRES=NRES+1 C CHECK FOR OVERFLOW IF (NRES.LE.NDIM) THEN CSQ(NRES)=CS ELSE WRITE(IOP,'(A)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ ***' WRITE(6,'(A,I10)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ=', + MAXSQ GOTO 900 ENDIF C LEGAL RESIDUE ENDIF C NEXT RESIDUE ENDDO C NEXT LINE ENDDO C---------------------------READ FROM :UWGCG------------------*BA*BEGIN C HEADER 600 READ (IN,'(A)',END=999)LINE IF (INDEX(LINE,'Check').EQ.0) THEN C THERE IS AN EMPTY LINE READ (IN,'(A)',END=999)LINE ENDIF LINE='*'//LINE(1:len(line)-1) C WRITE HEADER IF (IOP.NE.0)WRITE(IOP,*)LINE DO WHILE (.TRUE.) C GET SEQUENCE WHILE READ (IN,'(A)',END=999)LINE C LOOKING FOR A LINE IF (INDEX(LINE(3:50),'Length').NE.0) THEN C WITH 'LENGHT'AND'CHECK' IF (INDEX(LINE(50:124),'Check').NE.0)GOTO 610 ENDIF ENDDO 610 DO WHILE(.TRUE.) C EMPTY LINE READ (IN,'(A)',END=900,ERR=999)LINE READ (IN,'(A)',END=900,ERR=999)LINE CALL STRPOS(LINE,IBEG,IEND) DO J=9,iend CS=LINE(J:J) IF (CS.EQ.'*') GOTO 900 C CHECK FOR LEGAL RESIDUES CALL GETINDEX(CS,TRANSPOS,I) IF (I .NE. 0) THEN NRES=NRES+1 C CHECK FOR OVERFLOW IF (NRES.LE.NDIM) THEN CSQ(NRES)=CS C OVERFLOW ELSE WRITE(IOP,'(A,I10)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ=', + MAXSQ WRITE(6,'(A,I10)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ=', + MAXSQ GOTO 900 ENDIF C LEGAL RESIDUES ENDIF C NEXT RESIDUE ENDDO C NEXT LINE ENDDO C C---------------------------READ FROM :HSSP---------------------------- 700 READ(IN,'(A)',END=199)LINE IF (INDEX(LINE,'HOMOLOGY').EQ.0) THEN WRITE(6,*)'***GETAASEQ ERROR: HSSP file assumed, but...' WRITE(6,*)' the word /HOMOLOGY/ is missing in first line' RETURN ENDIF DO WHILE(INDEX(LINE,'NOTATION ').EQ.0) READ(IN,'(A)',END=199)LINE IF (INDEX(LINE,'HEADER').NE.0) THEN IF (IOP.NE.0)WRITE(IOP,*)LINE(12:) ELSE IF (INDEX(LINE,'COMPND').NE.0) THEN IF (IOP.NE.0)WRITE(IOP,*)LINE COMPND=LINE(12:200) ELSE IF (INDEX(LINE,'SEQLENGTH ').NE.0) THEN call read_int_from_string(LINE(12:),iseqlen) ENDIF ENDDO DO WHILE(INDEX(LINE,'## ALIGNMENTS').EQ.0) READ(IN,'(A)',END=199)LINE ENDDO READ(IN,'(A)',END=199)LINE NPICK=0 DO IRES=1,ISEQLEN READ(IN,'(7X,A5,A1,1X,A1,2X,A1,18X,I3)',END=799) + CR(1:5),CH,CS,CC,IACC C Res line counter. Note: NRES = # of res passed NRESLINE=NRESLINE+1 LKEEP=.FALSE. C CONVERT SS-BRIDGES TO 'C'.... IF (INDEX(LOWER,CS).NE.0) CS='C' IF (NRES.LT.NDIM) THEN C incr.chains IF (LCHAINBREAK(CS,NRESLINE)) THEN NCHAIN=NCHAIN+1 ELSE IF (KCHAIN.EQ.NCHAIN)LKEEP=.TRUE. ENDIF C KCHAIN=0 => all chains IF (KCHAIN.EQ.0)LKEEP=.TRUE. C if chains are identified by filename IF (CHAINMODE.EQ.'NUMBER') THEN IF (LCHAIN(NCHAIN)) THEN C if the first chain wanted is not the first chain in DSSP-file, skip C the first position ('!') IF (NPICK.EQ.0) THEN IF (LCHAINBREAK(CS,NRESLINE)) THEN LKEEP=.FALSE. ENDIF ELSE LKEEP=.TRUE. ENDIF NPICK=1 ELSE LKEEP=.FALSE. ENDIF ELSE IF (CHAINMODE.EQ.'CHARACTER') THEN LKEEP=.FALSE. IF (LCHAINBREAK(CS,NRESLINE)) THEN IF (NPICK.EQ.0) THEN LKEEP=.FALSE. ELSE LKEEP=.TRUE. ENDIF ELSE CALL LOWTOUP(CH,1) DO JCHAIN=1,NSELECT IF (CHAINID(JCHAIN).EQ.CH) THEN LKEEP=.TRUE. NPICK=1 ENDIF ENDDO IF (.NOT.LKEEP) THEN IF (CSQ(NRES).EQ.'!')NRES=NRES-1 ENDIF ENDIF ENDIF C keep only the kth chain IF (LKEEP) THEN C CHECK FOR LEGAL RESIDUES CALL GETINDEX(CS,TRANSPOS,I) IF (I .NE. 0) THEN CAUTION: INCREMENT ONLY OF LEGAL AA OR - OR ! NRES=NRES+1 CRESID(NRES)=CR(1:5)//CH CSQ(NRES)=CS KACC(NRES)=IACC CCHAIN=CH STRUC(NRES)=CC C ILLegal RESIDUES OR LEGAL PUNCTATION ENDIF C CHAINS WANTED ENDIF C DIMENSION OVERFLOW ELSE WRITE(IOP,'(A)')'*** ERROR: DIMENSION OVERFLOW MAXSQ ***' WRITE(6,'(A,I10)') + '*** ERROR: DIMENSION OVERFLOW MAXSQ=',MAXSQ GOTO 900 ENDIF C NEXT LINE ENDDO IF (NRESLINE .EQ. ISEQLEN)GOTO 900 C--------------HSSP read error ----------------------------------- 799 WRITE(6,*)'***GETSEQ: incomplete HSSP file ' NRES=0 NCHAIN=0 CALL STRPOS(FILENAME,I,LENFILNAM) WRITE(6,*) 'FILE: ',FILENAME(1:LENFILNAM) CLOSE(IN) RETURN C----------------------READ FILE ERROR--------------------------------- 999 WRITE (*,*)'****GETSEQ:INCOMPLETE FILE ',FILENAME(1:LENFILNAM) NRES=0 CLOSE (IN) RETURN C--------------------------------------------------------------*BA*END C---all formats: -----------FINISHED READING----------------------- 900 CLOSE(IN) IF (LDSSP .OR. LHSSP) THEN LACCZERO=.TRUE. DO I=1,NRES IF (KACC(I).NE.0) THEN LACCZERO=.FALSE. GOTO 910 ENDIF ENDDO 910 IF (LACCZERO) THEN WRITE(6,*)'*******************************************' WRITE(6,*)'* WARNING: all accessibility values are 0 *' WRITE(6,*)'*******************************************' IF (IOP.NE.0) THEN WRITE(IOP,'(A)')'***************************************' WRITE(IOP,'(A)')'* WARNING: accessibility values are 0 *' WRITE(IOP,'(A)')'***************************************' ENDIF ENDIF ENDIF C TRUNCATE IF NEEDED TRUNCATED=(NRES.GE.NDIM) NREAD=NRES NRES=MIN(NDIM,NRES) IF (TRUNCATED) THEN WRITE(6,*)'TRUNCATED TO ',NDIM,' RESIDUES' WRITE(6,*)'**** INCREASE DIMENSION ****' ENDIF C PRINT SEQ AND STRUC IF (IOP.NE.0) then WRITE(IOP,*)'LENGTH ',NRES IF (TRUNCATED)WRITE(IOP,*)'**** TRUNCATED FROM ',NREAD C some machines have problems with list directed I/O !! RS 94 c DO N=0,NRES/100 c N1=1+N*100 c N2=min(nres,100+N*100) c WRITE(IOP,*)(CSQ(I),I=N1,N2) c IF (LDSSP)WRITE(IOP,*)(STRUC(I),I=N1,N2) c ENDDO c WRITE(IOP,*)' ' ENDIF RETURN END C END GETSEQ C...................................................................... C...................................................................... C SUB GETSEQPROF SUBROUTINE GETSEQPROF(CSEQ,TRANS,IRES,NOCC,SEQPROF,MAXRES,MAXAA) C RS 89 C counts frequencies of amino acids C 'B' and 'Z' are assigned as well to the acid as to the amide form C with respect to their occurence in EMBL/SWISSPROT 13.0 IMPLICIT NONE INTEGER IRES,MAXRES,MAXAA CHARACTER*(*) TRANS,CSEQ INTEGER NOCC(*) INTEGER SEQPROF(MAXRES,MAXAA) REAL BTOD,BTON,ZTOE,ZTOQ C INTEGER I,J C================ BTOD=0.521 BTON=0.439 ZTOE=0.623 ZTOQ=0.41 C lower case character CALL LOWTOUP(CSEQ,1) IF (INDEX('BZ',CSEQ).EQ.0) THEN I=INDEX(TRANS(1:MAXAA),CSEQ) IF (I.EQ.0 .OR. I .GT. MAXAA) THEN WRITE(6,*)' GETSEQPROF: unknown residue symbol: ',cseq RETURN ELSE SEQPROF(IRES,I)=SEQPROF(IRES,I)+1 NOCC(IRES)=NOCC(IRES)+1 ENDIF ELSE IF (CSEQ.EQ.'B') THEN CD WRITE(6,*)' GETSEQPROF: convert B' I=INDEX(TRANS,'D') J=INDEX(TRANS,'N') SEQPROF(IRES,I)=NINT( SEQPROF(IRES,I)+BTOD) SEQPROF(IRES,J)=NINT( SEQPROF(IRES,J)+BTON) NOCC(IRES)=NOCC(IRES)+1 ELSE IF (CSEQ.EQ.'Z') THEN CD WRITE(6,*)' GETSEQPROF: convert Z' I=INDEX(TRANS,'E') J=INDEX(TRANS,'Q') SEQPROF(IRES,I)=NINT(SEQPROF(IRES,I)+ZTOE) SEQPROF(IRES,J)=NINT(SEQPROF(IRES,J)+ZTOQ) NOCC(IRES)=NOCC(IRES)+1 ENDIF RETURN END C END GETSEQPROF C...................................................................... C...................................................................... C SUB GETSIMMETRIC SUBROUTINE GETSIMMETRIC(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2,NIOSTATES_2, + CSTRSTATES,CIOSTATES, + IORANGE,KSIM,SIMFILE,SIMMETRIC) IMPLICIT NONE C import INTEGER NTRANS CHARACTER*(*) TRANS INTEGER MAXSTRSTATES,MAXIOSTATES INTEGER KSIM CHARACTER*(*) SIMFILE c export INTEGER NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2,NIOSTATES_2 REAL IORANGE(MAXSTRSTATES,MAXIOSTATES) CHARACTER*(*) CSTRSTATES,CIOSTATES REAL SIMMETRIC(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) c internal INTEGER I,J,K,L,I1,I2,J1,J2,ITRANS,IBEG,IEND INTEGER NSTR,NIO,ISTR1,IO1,ISTR2,IO2 INTEGER MATRIXPOS CHARACTER CSTR,CIO,LINE*250 CHARACTER*250 TESTSTRING CHARACTER*30 CTRANS LOGICAL LERROR C====================================================================== C init C====================================================================== MATRIXPOS=22 I= (NTRANS * NTRANS) * (MAXSTRSTATES * MAXSTRSTATES) * + (MAXIOSTATES * MAXIOSTATES) CALL INIT_REAL_ARRAY(1,I,SIMMETRIC,0.0) C accessibility cut to 200% = take all I= MAXSTRSTATES * MAXIOSTATES CALL INIT_REAL_ARRAY(1,I,IORANGE,200.0) TESTSTRING=' ' LINE=' ' CSTRSTATES=' ' CIOSTATES=' ' ITRANS=0 NSTRSTATES_1=1 NIOSTATES_1=1 NSTRSTATES_2=1 NIOSTATES_2=1 NSTR=0 NIO=0 c----------------------------------------------------------------------- TESTSTRING='AA STR I/O V L I M '// + 'F W Y G A P S T C '// + 'H R K Q E N D B Z' WRITE(6,'(a,a)')' GETSIMMATRIX open metric: ',simfile(1:50) CALL OPEN_FILE(KSIM,SIMFILE,'READONLY,OLD',LERROR) IF (LERROR)GOTO 99 C---------------------------------------------------------------------- DO WHILE(INDEX(LINE,TESTSTRING).EQ.0) READ(KSIM,'(A)',END=99)LINE IF (INDEX(LINE,'STRUCTURE-STATES_1:') .NE. 0) THEN I=INDEX(LINE,':')+1 CALL STRPOS(LINE,IBEG,IEND) CALL READ_INT_FROM_STRING(LINE(I:IEND),NSTRSTATES_1) ELSE IF (INDEX(LINE,'STRUCTURE-STATES_2:') .NE. 0) THEN I=INDEX(LINE,':')+1 CALL STRPOS(LINE,IBEG,IEND) CALL READ_INT_FROM_STRING(LINE(I:IEND),NSTRSTATES_2) ELSE IF (INDEX(LINE,'I/O-STATES_1:') .NE. 0) THEN I=INDEX(LINE,':')+1 CALL STRPOS(LINE,IBEG,IEND) CALL READ_INT_FROM_STRING(LINE(I:IEND),NIOSTATES_1) ELSE IF (INDEX(LINE,'I/O-STATES_2:') .NE. 0) THEN I=INDEX(LINE,':')+1 CALL STRPOS(LINE,IBEG,IEND) CALL READ_INT_FROM_STRING(LINE(I:IEND),NIOSTATES_2) ELSE IF (INDEX(LINE,'DSSP-STRUCTURE') .NE. 0) THEN DO I=1,NSTRSTATES_1 DO J=1,NIOSTATES_1 READ(KSIM,'(A)')LINE READ(LINE,'(4X,A1,13X,A1)')CSTR,CIO K=INDEX(CSTRSTATES,CSTR) IF (K.EQ.0) THEN NSTR=NSTR+1 K=NSTR IF (NSTR .GT. MAXSTRSTATES) THEN WRITE(6,*)'*** ERROR: struct-states overflow' STOP ENDIF CALL STRPOS(CSTRSTATES,IBEG,IEND) IF (IEND+1 .GT. LEN(CSTRSTATES)) THEN WRITE(6,*) + '*** ERROR: CSTRSTATES string too short' STOP ENDIF WRITE(CSTRSTATES(IEND+1:IEND+1),'(A1)')CSTR ENDIF L=INDEX(CIOSTATES,CIO) IF (L.EQ.0) THEN NIO=NIO+1 L=NIO IF (NIO .GT. MAXIOSTATES) THEN WRITE(6,*)'*** ERROR: I/O-states overflow' STOP ENDIF CALL STRPOS(CIOSTATES,IBEG,IEND) IF (IEND+1 .GT. LEN(CSTRSTATES)) THEN WRITE(6,*) + '*** ERROR: CIOSTATES string too short' STOP ENDIF WRITE(CIOSTATES(IEND+1:IEND+1),'(A1)')CIO ENDIF READ(LINE,'(26X,F3.0)')IORANGE(K,L) ENDDO ENDDO ENDIF ENDDO C---------------------------------------------------------------------- WRITE(6,*)' STRUCTURE-STATES_1: ',cstrstates,nstrstates_1 WRITE(6,*)' I/O-STATES_1 : ',ciostates,niostates_1 WRITE(6,*)' STRUCTURE-STATES_2: ',cstrstates,nstrstates_2 WRITE(6,*)' I/O-STATES_2 : ',ciostates,niostates_2 IF (NSTRSTATES_1 .EQ. 1)NSTR=1 IF (NIOSTATES_1 .EQ. 1)NIO=1 IF (NSTR .NE. NSTRSTATES_1 .OR. NIO .NE. NIOSTATES_1 ) THEN WRITE(6,*)'*** ERROR: number of structure-states .ne. NSTR' WRITE(6,*)' OR number of I/O-states .ne. NIO' STOP ENDIF C---------------------------------------------------------------------- DO WHILE(.TRUE.) ITRANS=ITRANS+1 DO ISTR1=1,NSTRSTATES_1 DO IO1=1,NIOSTATES_1 DO ISTR2=1,NSTRSTATES_2 DO IO2=1,NIOSTATES_2 READ(KSIM,'(A)',END=11)LINE I1=INDEX(CSTRSTATES,LINE(5:5)) J1=INDEX(CIOSTATES,LINE(8:8)) I2=INDEX(CSTRSTATES,LINE(6:6)) J2=INDEX(CIOSTATES,LINE(9:9)) IF (I1.EQ.0.OR.I2.EQ.0.OR.J1.EQ.0.OR.J2.EQ.0) THEN IF (I1.EQ.0)I1=1 IF (J1.EQ.0)J1=1 IF (I2.EQ.0)I2=1 IF (J2.EQ.0)J2=1 ENDIF READ(LINE,'(1X,A1,7X,22(1X,F5.2))') + CTRANS(ITRANS:ITRANS), + (SIMMETRIC(ITRANS,K,I1,J1,I2,J2), + K=1,MATRIXPOS) ENDDO ENDDO ENDDO ENDDO ENDDO 11 CLOSE(KSIM) ITRANS=ITRANS-1 C======================================================================= C reset value for chain breaks etc... C add 'X' ITRANS=ITRANS+1 CTRANS(ITRANS:ITRANS)='X' I=INDEX(TRANS,'X') DO J=1,NTRANS DO ISTR1=1,NSTRSTATES_1 DO IO1=1,NIOSTATES_1 DO ISTR2=1,NSTRSTATES_2 DO IO2=1,NIOSTATES_2 SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO c add '!' ITRANS=ITRANS+1 CTRANS(ITRANS:ITRANS)='!' I=INDEX(TRANS,'!') DO J=1,NTRANS DO ISTR1=1,NSTRSTATES_1 DO IO1=1,NIOSTATES_1 DO ISTR2=1,NSTRSTATES_2 DO IO2=1,NIOSTATES_2 SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO c add '-' ITRANS=ITRANS+1 CTRANS(ITRANS:ITRANS)='-' I=INDEX(TRANS,'-') DO J=1,NTRANS DO ISTR1=1,NSTRSTATES_1 DO IO1=1,NIOSTATES_1 DO ISTR2=1,NSTRSTATES_2 DO IO2=1,NIOSTATES_2 SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO c add '.' ITRANS=ITRANS+1 CTRANS(ITRANS:ITRANS)='.' I=INDEX(TRANS,'.') DO J=1,NTRANS DO ISTR1=1,NSTRSTATES_1 DO IO1=1,NIOSTATES_1 DO ISTR2=1,NSTRSTATES_2 DO IO2=1,NIOSTATES_2 SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO C---------------------------------------------------------------------- C check input order of amino acids C======================================================================= IF (TRANS(1:NTRANS) .NE. CTRANS(1:ITRANS)) THEN WRITE(6,*)' *** ERROR: CTRANS from metric-file and TRANS'// + ' are not the same' WRITE(6,*)'GETSIMMATRIX: ',ctrans,itrans WRITE(6,*)'GETSIMMATRIX: ',trans,ntrans STOP ENDIF C======================================================================= C debug C======================================================================= c do istr1=1,nstrstates_1 c do io1=1,niostates_1 c do istr2=1,nstrstates_2 c do io2=1,niostates_2 c WRITE(6,*)(simmetric(1,j,istr1,io1,istr2,io2),j=1,26) c enddo c enddo c enddo c enddo C======================================================================= RETURN C======================================================================= C unknown metric or read error C======================================================================= 99 CLOSE(KSIM) WRITE(6,'(a)') + '** ERROR reading metric in GETSIMMATRIX **' STOP END C END GETSIMMETRIC C...................................................................... C...................................................................... C SUB GETSWISSBASE SUBROUTINE GETSWISSBASE (KUNIT,MAXRES,KLOG,NRES,CSEQ,NAME, + COMPND,ACCESSION,CPDBREF,LENDFILE) c implicit none INTEGER KUNIT,MAXRES,KLOG,NRES CHARACTER*(*) CSEQ,NAME,COMPND,ACCESSION,CPDBREF LOGICAL LENDFILE CHARACTER*500 LOGSTRING c internal INTEGER LINELEN PARAMETER (LINELEN= 200) CHARACTER LINE*(LINELEN) INTEGER NID,ISTART,ISTOP,JSTART,JSTOP,I,J C====================================================================== LENDFILE=.FALSE. NID=0 NRES=0 NAME=' ' COMPND=' ' ACCESSION=' ' CPDBREF=' ' CSEQ=' ' LINE=' ' ISTOP=0 JSTOP=0 C===================================================================== DO WHILE (.TRUE.) READ(KUNIT,'(A)',END=900,ERR=900)LINE C identifier IF ( LINE(1:2) .EQ. 'ID' ) THEN NAME(1:)=LINE(6:17) c accession number ELSE IF ( LINE(1:2) .EQ. 'AC' ) THEN I=INDEX(LINE,';')-1 ACCESSION(1:)=LINE(6:I) c name ELSE IF ( LINE(1:2) .EQ. 'DE' ) THEN COMPND=' ' COMPND(1:200)=LINE(6:) GOTO 410 ENDIF ENDDO c search for sequence 410 READ(KUNIT,'(A)',END=999)LINE IF ( LINE(1:2) .EQ. 'SQ' ) THEN GOTO 420 C STORE LATEST BROOKHAVEN-POINTER IN CPDBREF ELSE IF ( LINE(1:2) .EQ. 'DR' .AND. + INDEX(LINE,'PDB;') .NE. 0) THEN CALL STRPOS(CPDBREF,ISTART,ISTOP) CALL STRPOS(LINE,JSTART,JSTOP) IF (LINE(JSTOP:JSTOP) .EQ. '.')JSTOP=JSTOP-1 IF (ISTOP+JSTOP .LE. LEN(CPDBREF) ) THEN IF (NID .LE. 0) THEN CPDBREF(ISTOP+1:)=LINE(11:JSTOP) ELSE CPDBREF(ISTOP+1:)='|'//LINE(11:JSTOP) ENDIF NID=NID+1 c else c WRITE(6,*)'**** PDBREF-LINE DIMENSION OVERFLOW ***' ENDIF ENDIF GOTO 410 420 IF (NID .GT. 0) THEN CALL STRPOS(CPDBREF,ISTART,ISTOP) IF ( (ISTOP+6) .LE. LEN(CPDBREF) ) THEN WRITE(CPDBREF(ISTOP+1:),'(A,I4)')'||',NID ELSE WRITE(6,*)'**** PDBREF-LINE DIMENSION OVERFLOW ***' ENDIF ENDIF c sequences starts in next line 430 READ(KUNIT,'(A)',ERR=999,END=999) LINE c end of database file reached ? IF ( LINE(1:2) .EQ. '//' ) RETURN c call strpos(line,istart,istop) DO ISTART=LINELEN,1,-1 IF (LINE(ISTART:ISTART).NE.' ') THEN ISTOP=ISTART GOTO 440 ENDIF ENDDO 440 DO J=1,ISTOP IF ( LINE(J:J) .NE. ' ' .AND. NRES+1 .LE. MAXRES) THEN NRES=NRES+1 CSEQ(NRES:NRES)=LINE(J:J) ELSE IF (NRES+1 .GT. MAXRES ) THEN WRITE(6,'(A)')'** DIMENSION OVERFLOW MAXSQ ***' GOTO 910 ENDIF ENDDO GOTO 430 C===================================================================== C END of SWISSPROT reached C===================================================================== 900 LENDFILE=.TRUE. NRES=0 RETURN C===================================================================== C TRUNCATE IF NEEDED C===================================================================== 910 WRITE(LOGSTRING,'(A,I8,A)')'TRUNCATED TO ',MAXRES, + ' RESIDUES: INCREASE DIMENSION ' c call log_file(klog,logstring,1) RETURN C====================================================================== 999 WRITE(LOGSTRING,'(A)') + '*** ERROR READING SWISSPROT, SET NRES=0 AND RETURN' c call log_file(klog,logstring,1) NRES=0 RETURN END C END GETSWISSBASE C...................................................................... C...................................................................... C SUB GET_SWISS_ENTRY SUBROUTINE GET_SWISS_ENTRY(MAXSQ,KUNIT,LBINARY,NRES,NAME, + COMPOUND,ACCESSION,PDBREF,SEQ,LEND) IMPLICIT NONE C IMPORT INTEGER MAXSQ,KUNIT LOGICAL LBINARY C EXPORT CHARACTER*(*) SEQ,NAME,COMPOUND,ACCESSION,PDBREF INTEGER NRES LOGICAL LEND C INTERNAL INTEGER NSIZE,NSIZE2 PARAMETER (NSIZE= 12) PARAMETER (NSIZE2= 200) C====================================================================== LEND=.FALSE. C===================================================================== IF (LBINARY) THEN READ(KUNIT,END=900,ERR=900)NRES,NAME(1:NSIZE), + ACCESSION(1:NSIZE),PDBREF(1:NSIZE), + COMPOUND(1:NSIZE2) READ(KUNIT,END=900,ERR=999)SEQ(1:NRES) ELSE READ(KUNIT,'(I6,A,A,A,A,A)',END=900,ERR=999)NRES, + NAME(1:NSIZE),ACCESSION(1:NSIZE),PDBREF(1:NSIZE), + COMPOUND(1:NSIZE2),SEQ c read(kunit,'(i6,a,a,a,a,a)',end=900,err=999)nres,name, c + ACCESSION,pdbref, c + compound,seq ENDIF c truncate if needed IF (NRES .GT. MAXSQ) THEN c WRITE(6,*)' SEQ CUT TO MAXSQ: ',nres,MAXSQ NRES=MAXSQ CALL FLUSH_UNIT(6) ENDIF RETURN C====================================================================== 900 LEND=.TRUE. NRES= 0 SEQ= ' ' NAME= ' ' ACCESSION=' ' PDBREF= ' ' COMPOUND= ' ' RETURN 999 WRITE(6,*)' ERROR in get_swiss_entry ',NAME,NRES CALL FLUSH_UNIT(6) STOP END C END GET_SWISS_ENTRY C...................................................................... C...................................................................... ***** ------------------------------------------------------------------ ***** SUB GETTOKEN ***** ------------------------------------------------------------------ C---- C---- NAME : GETTOKEN C---- ARG : 1 CSTRING(1:LEN) = string of length LEN C---- ARG : 2 LEN = length of string C---- ARG : 3 ITOKEN = number of string to matcho C---- ARG : 4 FIRSTPOS = position where CSTRING matches C---- ARG : the ITOKEN nth STRING (non blank) C---- ARG : 5 CTOKEN = returns ITOKEN nth string that matched C---- ARG : that matched in CSTRING C---- DES : Builds up the ITOKEN nth string in the string CSTRING C---- DES : that is not having any blank. The first position of C---- DES : this string (FIRSTPOS) and the string (CTOKEN) are returned C---- DES : if no match: returns 0 (i.e. never matched) C---- *----------------------------------------------------------------------* C SUB GETTOKEN SUBROUTINE GETTOKEN(CSTRING,LEN,ITOKEN,FIRSTPOS,CTOKEN) IMPLICIT NONE C Import INTEGER LEN,ITOKEN CHARACTER*(*) CSTRING C Export INTEGER FIRSTPOS CHARACTER*(*) CTOKEN C Internal INTEGER IPOS,THISTOKEN,TPOS LOGICAL FINISHED,INSIDE ******------------------------------*-----------------------------****** C---- C---- initialise C---- CTOKEN= ' ' TPOS= 0 FINISHED= .FALSE. IF ( CSTRING(1:1) .EQ. ' ' ) THEN THISTOKEN= 0 INSIDE= .FALSE. ELSE THISTOKEN= 1 INSIDE= .TRUE. FIRSTPOS= 1 IF ( THISTOKEN .EQ. ITOKEN ) THEN TPOS= TPOS + 1 CTOKEN(TPOS:TPOS)= CSTRING(1:1) ENDIF ENDIF C---- C---- loop over string C---- IPOS = 2 DO WHILE ((IPOS .LE. LEN) .AND. (.NOT. FINISHED) ) IF ( CSTRING(IPOS:IPOS) .EQ. ' ' .OR. 1 IPOS .EQ. LEN ) THEN IF ( INSIDE ) THEN INSIDE = .FALSE. IF ( THISTOKEN .EQ. ITOKEN ) FINISHED = .TRUE. ENDIF ELSE IF ( .NOT. INSIDE ) THEN INSIDE = .TRUE. FIRSTPOS = IPOS THISTOKEN = THISTOKEN + 1 ENDIF IF ( THISTOKEN .EQ. ITOKEN ) THEN TPOS = TPOS + 1 CTOKEN(TPOS:TPOS) = CSTRING(IPOS:IPOS) ENDIF ENDIF IPOS = IPOS + 1 ENDDO IF ( .NOT. FINISHED ) FIRSTPOS = 0 RETURN END C END GETTOKEN C...................................................................... C...................................................................... C SUB HSSPHEADER SUBROUTINE HSSPHEADER(KHSSP,HSSPFILE,HSSPLINE,BRKID,CDATE, + DATABASE,CPARAMETER,NPARALINE,ISOSIGFILE,ISAFE,LFORMULA, + HEAD,COMP,SOURCE,AUTHOR,LRES,NCHAIN,KCHAIN,CHAINREMARK, + NALIGN) IMPLICIT NONE C import CHARACTER*(*) HSSPLINE,DATABASE,CPARAMETER(*), + HSSPFILE,ISOSIGFILE,CDATE INTEGER KHSSP,NPARALINE,ISAFE,LRES,NCHAIN,KCHAIN, + NALIGN LOGICAL LFORMULA,LERROR C Attributes of DSSP-file CHARACTER*(*) HEAD,COMP,SOURCE,AUTHOR,BRKID,CHAINREMARK C internal INTEGER I,LEN,ISTART,ISTOP C---- -------------------------------------------------- C---- open HSSP-file and write header C---- -------------------------------------------------- CALL OPEN_FILE(KHSSP,HSSPFILE,'NEW',LERROR) CALL STRPOS(HSSPLINE,ISTART,ISTOP) WRITE(KHSSP,'(A)')HSSPLINE(1:ISTOP) CALL STRPOS(BRKID,ISTART,ISTOP) WRITE(KHSSP,'(A,A)')'PDBID ',BRKID(ISTART:ISTOP) WRITE(KHSSP,'(A,A)')'DATE file generated on ',CDATE CALL STRPOS(DATABASE,ISTART,ISTOP) WRITE(KHSSP,'(A)')DATABASE(1:ISTOP) DO I=1,NPARALINE CALL STRPOS(CPARAMETER(I),ISTART,ISTOP) WRITE(KHSSP,'(A,A)')'PARAMETER ',CPARAMETER(I)(ISTART:ISTOP) ENDDO C---- which formula used for filtering? IF (LFORMULA) THEN IF (ISAFE.EQ.0) THEN WRITE(KHSSP,'(A,A)')'THRESHOLD ', + ' according to: t(L)=290.15 * L ** -0.562' ELSE IF (ISAFE.GT.0) THEN WRITE(KHSSP,'(A,A,I3)')'THRESHOLD ', + ' according to: t(L)=(290.15 * L ** -0.562) +',isafe ELSE IF (ISAFE.LT.0) THEN WRITE(KHSSP,'(A,A,I3)')'THRESHOLD ', + ' according to: t(L)=(290.15 * L ** -0.562) ',isafe ENDIF C---- no FORMULA filtering ELSE CALL STRPOS(ISOSIGFILE,ISTART,ISTOP) WRITE(KHSSP,'(A,A)')'THRESHOLD according to: ', + ISOSIGFILE(ISTART:ISTOP) ENDIF WRITE(KHSSP,'(A,A)')'REFERENCE ',' Sander C., Schneider R.'// + ' : Database of homology-derived protein structures.'// + ' Proteins, 9:56-68 (1991).' CALL STRPOS(HEAD,ISTART,ISTOP) WRITE(KHSSP,'(A,A)') 'HEADER ',HEAD(1:ISTOP) CALL STRPOS(COMP,ISTART,ISTOP) WRITE(KHSSP,'(A,A)') 'COMPND ',COMP(1:ISTOP) CALL STRPOS(SOURCE,ISTART,ISTOP) WRITE(KHSSP,'(A,A)') 'SOURCE ',SOURCE(1:ISTOP) CALL STRPOS(AUTHOR,ISTART,ISTOP) WRITE(KHSSP,'(A,A)') 'AUTHOR ',AUTHOR(1:ISTOP) WRITE(KHSSP,'(A,I4)') 'SEQLENGTH ',LRES CALL STRPOS(BRKID,ISTART,ISTOP) WRITE(KHSSP,'(A,I4,A,A,A)')'NCHAIN ',NCHAIN, + ' chain(s) in ',brkid(istart:istop),' data set' c WRITE(6,*)'chainremark: ',chainremark IF (CHAINREMARK .NE. ' ') THEN CALL STRPOS(CHAINREMARK,ISTART,ISTOP) WRITE(KHSSP,'(A,I4,A,A)')'KCHAIN ',KCHAIN, + ' chain(s) used here ; chain(s) : ',chainremark(1:istop) ENDIF WRITE(KHSSP,'(A,I4)') 'NALIGN ',NALIGN C---- C---- NOTATION part C---- WRITE(KHSSP,'(A)')'NOTATION : ID: EMBL/SWISSPROT identifier'// + ' of the aligned (homologous) protein' WRITE(KHSSP,'(A)')'NOTATION : STRID: if the 3-D structure of'// + ' the aligned protein is known, then STRID is the Protein'// + ' Data Bank identifier as taken' WRITE(KHSSP,'(A)')'NOTATION : from the database'// + ' reference or DR-line of the EMBL/SWISSPROT entry' WRITE(KHSSP,'(A)')'NOTATION : %IDE: percentage of residue'// + ' identity of the alignment' WRITE(KHSSP,'(A)')'NOTATION : %SIM (%WSIM): '// + ' (weighted) similarity of the alignment' WRITE(KHSSP,'(A)')'NOTATION : IFIR/ILAS: first and last resid'// + 'ue of the alignment in the test sequence' WRITE(KHSSP,'(A)')'NOTATION : JFIR/JLAS: first and last resid'// + 'ue of the alignment in the alignend protein' WRITE(KHSSP,'(A)')'NOTATION : LALI: length of the alignment'// + ' excluding insertions and deletions' WRITE(KHSSP,'(A)')'NOTATION : NGAP: number of insertions'// + ' and deletions in the alignment' WRITE(KHSSP,'(A)')'NOTATION : LGAP: total length of all'// + ' insertions and deletions' WRITE(KHSSP,'(A)')'NOTATION : LSEQ2: length of the entire'// + ' sequence of the aligned protein' WRITE(KHSSP,'(A)')'NOTATION : ACCESSION: SwissProt accession'// + ' number' WRITE(KHSSP,'(A)')'NOTATION : PROTEIN: one-line description'// + ' of aligned protein' WRITE(KHSSP,'(A)')'NOTATION : SeqNo,PDBNo,AA,STRUCTURE,BP1,'// + 'BP2,ACC: sequential and PDB residue numbers, amino acid '// + '(lower case = Cys), secondary' WRITE(KHSSP,'(A)')'NOTATION : structure, bridge '// + 'partners, solvent exposure as in DSSP (Kabsch and Sander,'// + ' Biopolymers 22, 2577-2637(1983)' WRITE(KHSSP,'(A)')'NOTATION : VAR: sequence variability on'// + ' a scale of 0-100 as derived from the NALIGN alignments' WRITE(KHSSP,'(A)')'NOTATION : pair of lower case characters'// + ' (AvaK) in the alignend sequence bracket a point of'// + ' INSERTION IN THIS sequence' WRITE(KHSSP,'(A)')'NOTATION : dots (....) in the alignend'// + ' SEQUENCE INDICATE POINTS of deletion in this sequence' WRITE(KHSSP,'(A)')'NOTATION : SEQUENCE PROFILE: relative '// + 'frequency of an amino acid type at each position. Asx'// + ' and Glx are in their' WRITE(KHSSP,'(A)')'NOTATION : acid/amide'// + ' form in proportion to their database frequencies' WRITE(KHSSP,'(A)')'NOTATION : NOCC: number of aligned sequenc'// + 'es spanning this position (including the test sequence)' WRITE(KHSSP,'(A)')'NOTATION : NDEL: number of sequences with'// + ' a deletion in the test protein at this position' WRITE(KHSSP,'(A)')'NOTATION : NINS: number of sequences with'// + ' an insertion in the test protein at this position' WRITE(KHSSP,'(A)')'NOTATION : ENTROPY: entropy measure of'// + ' sequence variability at this position' WRITE(KHSSP,'(A)')'NOTATION : RELENT: relative entropy, i.e. '// + ' entropy normalized to the range 0-100' WRITE(KHSSP,'(a)')'NOTATION : WEIGHT: conservation weight' WRITE(KHSSP,*) RETURN END C END HSSPHEADER C...................................................................... C...................................................................... C SUB INIT_CHAR_ARRAY SUBROUTINE INIT_CHAR_ARRAY(IBEG,IEND,CARRAY,SYMBOL) IMPLICIT NONE INTEGER I,IBEG,IEND CHARACTER*(*) CARRAY DIMENSION CARRAY(IBEG:IEND) CHARACTER*(*) SYMBOL DO I=IBEG,IEND CARRAY(I)=SYMBOL ENDDO RETURN END C END INIT_CHAR_ARRAY C...................................................................... C...................................................................... C SUB INIT_REAL_ARRAY SUBROUTINE INIT_REAL_ARRAY(IBEG,IEND,ARRAY,VALUE) IMPLICIT NONE INTEGER I,IBEG,IEND REAL ARRAY DIMENSION ARRAY(IBEG:IEND) REAL VALUE DO I=IBEG,IEND ARRAY(I)=VALUE ENDDO RETURN END C END INIT_REAL_ARRAY C...................................................................... C...................................................................... C SUB INIT_INT_ARRAY SUBROUTINE INIT_INT_ARRAY(IBEG,IEND,ARRAY,VALUE) IMPLICIT NONE INTEGER I,IBEG,IEND INTEGER ARRAY DIMENSION ARRAY(IBEG:IEND) INTEGER VALUE DO I=IBEG,IEND ARRAY(I)=VALUE ENDDO RETURN END C END INIT_INT_ARRAY C...................................................................... C...................................................................... C SUB INIT_INT2_ARRAY SUBROUTINE INIT_INT2_ARRAY(IBEG,IEND,ARRAY,VALUE) IMPLICIT NONE INTEGER I,IBEG,IEND INTEGER*2 ARRAY DIMENSION ARRAY(IBEG:IEND) INTEGER VALUE DO I=IBEG,IEND ARRAY(I)=VALUE ENDDO RETURN END C END INIT_INT2_ARRAY C...................................................................... C...................................................................... C SUB INT_TO_SEQ SUBROUTINE INT_TO_SEQ(LSEQ,SEQ,NRES,CTRANS,INDELMARK,ENDMARK) C reverse SEQ_TO_INTEGER C DSSP SS bridges (lower case) are lost (converted to 'C' from seqtoint) C converts amino acid integers to string of amino acid characters C uses translation table CHARACTER CTRANS IMPLICIT NONE C import INTEGER NRES,LSEQ(*) INTEGER INDELMARK,ENDMARK CHARACTER*(*) CTRANS c export CHARACTER*1 SEQ(*) C internal INTEGER I C DO I=1,NRES IF (LSEQ(I) .EQ. 0) THEN WRITE(6,*)'** unknown res or chain separator in INT_TO_SEQ' ENDIF IF (LSEQ(I) .EQ. INDELMARK) THEN SEQ(I)='.' ELSE IF (LSEQ(I) .EQ. ENDMARK) THEN SEQ(I)='<' ELSE SEQ(I)=CTRANS(LSEQ(I):LSEQ(I)) ENDIF ENDDO RETURN END C END INT_TO_SEQ C...................................................................... C...................................................................... C INT_TO_STRCLASS SUBROUTINE INT_TO_STRCLASS(MAXSTRSTATES,MAXALSQ,NRES,LSTRUC, + STR_CLASSES,INDELMARK,ENDMARK,STRUC) IMPLICIT NONE INTEGER MAXSTRSTATES,MAXALSQ INTEGER NRES,LSTRUC(*),INDELMARK,ENDMARK C---- br 99.03: watch hard_coded here, see maxhom.param CHARACTER*10 STR_CLASSES(MAXSTRSTATES) C---- --> REASON: the following produces warnings on SGI C CHARACTER*(*) STR_CLASSES(MAXSTRSTATES) CHARACTER STRUC(MAXALSQ) c internal INTEGER I c======================================================================= DO I=1,NRES IF (LSTRUC(I) .EQ. INDELMARK) THEN STRUC(I)='.' ELSE IF (LSTRUC(I) .EQ. ENDMARK) THEN STRUC(I)='<' ELSE STRUC(I)=STR_CLASSES(LSTRUC(I))(1:1) ENDIF ENDDO RETURN END C END INT_TO_STRCLASS C...................................................................... C...................................................................... C SUB INTERPRET_LINE SUBROUTINE INTERPRET_LINE(LINE,MAXFIELD, + MACROLINE, CFIELD, CSTRING, CALFANUMERIC, + CALFAMIXED,CWORD,NFIELD,NSTRING,NALFANUMERIC, + NNUMBER, NREAL, NINTEGER,NPOSITIVE, NNEGATIVE, + NWORD, NALFAMIXED,IINTEGER,IPOSITIVE, + INEGATIVE,XNUMBER, XREAL,IFIELD_POS) IMPLICIT NONE c INCLUDE 'interpret_line' C input CHARACTER*(*) LINE INTEGER MAXFIELD CHARACTER*(*) MACROLINE, CFIELD(*), + CSTRING(*), CALFANUMERIC(*), + CALFAMIXED(*), CWORD(*) INTEGER NFIELD,NSTRING,NALFANUMERIC,NNUMBER,NREAL,NINTEGER INTEGER NPOSITIVE, NNEGATIVE, NWORD, NALFAMIXED INTEGER IINTEGER(*),IPOSITIVE(*), + INEGATIVE(*) REAL XNUMBER(*), XREAL(*) C POINTERS TO BEG AND END OF EACH FIELD INTEGER IFIELD_POS(2,*) C LOCAL LOGICAL LALFANUMERIC,LNUMBER,LREAL,LWORD INTEGER ID,I,ISTARTLINE,IENDLINE,IBEG,IEND LOGICAL LCONTINUE C interprets an input line C-------example---------------------------------------------------- C input: CA Q 110 CB W -203 5.5 C output: MACROLINE='LLCLLNR' C NFIELD=7 ; NSTRING=0 ; NALFANUMERIC=7 ; NNUMBER=3 ; NWORD=4 C NALFAMIXED=0 ; NREAL=1 ; NINTEGER=2 ; NPOSITIVE=1 ; NNEGATIVE=1 C CFIELD(2)='Q' ; XNUMBER(2)=-203. ; CSTRING(3)='CB' C IINTEGER(2)=-203 etc. C-----------hierarchy of field types-------- = macroline symbol---- C like 4PTI.COOR String = contains non-alfanumeric, like @#$%^&* C like CA5 Alfamixed = mixed letters and numbers C like TRP Letters only = word C like -.5E+5 Real number C

like 16 positive integer C like -16 Negative integer or 0 C C field = (alfanumeric,other ASCII) (filterted by ASCII-filter) C alfanumeric = (number, word, alfa-mixed) C number = (integer,real) C integer = (positive, negative) C C STRING C ALFANUMERIC C NUMBER C REAL C INTEGER C POSITIVE C NEGATIVE C WORD C ALFAMIXED C C macrosymbol is designed such that the whole world partitions into C S A L R P N, i.e. macrosymbol of a field is the lowest valid type C C---------------------------------------------------------------------- C step0: preliminaries NFIELD=0 MACROLINE=' ' NSTRING=0 NALFANUMERIC=0 NNUMBER=0 NREAL=0 NINTEGER=0 NPOSITIVE=0 NNEGATIVE=0 NWORD=0 NALFAMIXED=0 DO ID=1,MAXFIELD CFIELD(ID)=' ' CSTRING(ID)=' ' CALFANUMERIC(ID)=' ' XNUMBER(ID)=0.0 XREAL(ID)=0.0 IINTEGER(ID)=0 IPOSITIVE(ID)=0 INEGATIVE(ID)=0 CWORD(ID)=' ' CALFAMIXED(ID)=' ' DO I=1,2 IFIELD_POS(I,ID)=0 ENDDO ENDDO CALL ASCIIFILTER(LINE) C step1: find beg and end of each field CALL STRPOS(LINE,ISTARTLINE,IENDLINE) NFIELD=1 IFIELD_POS(1,NFIELD)=ISTARTLINE DO I=ISTARTLINE,IENDLINE-1 C " x" starts field IF (LINE(I:I) .EQ. ' ' .AND. LINE(I+1:I+1) .NE. ' ') THEN NFIELD=NFIELD+1 IF ( NFIELD .GT. MAXFIELD) THEN WRITE(6,*)'*** ERROR IN INTERPRETLINE: MAXFIELD OVERFLOW' NFIELD=MAXFIELD ENDIF IFIELD_POS(1,NFIELD)=I+1 ENDIF C "x " ends field IF (LINE(I:I) .NE. ' ' .AND. LINE(I+1:I+1) .EQ. ' ') THEN IFIELD_POS(2,NFIELD)=I ENDIF ENDDO IFIELD_POS(2,NFIELD)=IENDLINE C step3: process each field C----------------------------------------------------------------------- C NSTRING C NALFANUMERIC C NNUMBER C NREAL C NINTEGER C NPOSITIVE C NNEGATIVE C NWORD C NALFAMIXED C----------------------------------------------------------------------- DO ID=1,NFIELD C BR 2007/08/22 avoid GOTO IF (LCONTINUE) THEN C step 3.1: extract string i CFIELD(ID)=LINE(IFIELD_POS(1,ID):IFIELD_POS(2,ID)) CALL STRPOS(CFIELD(ID),IBEG,IEND) C step 3.2: determine type of field, store field, store macrosymbol C .not. lalfanumeric CALL IS_ALFANUMERIC(CFIELD(ID),LALFANUMERIC) IF (.NOT. LALFANUMERIC) THEN NSTRING=NSTRING+1 CSTRING(NSTRING)=CFIELD(ID) MACROLINE(ID:ID)='S' ELSE C lnumber / lword / mixed NALFANUMERIC=NALFANUMERIC+1 CALFANUMERIC(NALFANUMERIC)=CFIELD(ID) CALL IS_NUMBER(CFIELD(ID),LNUMBER) IF (LNUMBER) THEN NNUMBER=NNUMBER+1 CALL IS_REAL(CFIELD(ID),LREAL) C real / integer IF (LREAL) THEN NREAL=NREAL+1 CALL READ_REAL(CFIELD(ID),XREAL(NREAL)) XNUMBER(NNUMBER)=XREAL(NREAL) MACROLINE(ID:ID)='R' ELSE CALL READ_REAL_FROM_STRING(CFIELD(ID)(IBEG:IEND), + XNUMBER(NNUMBER) ) NINTEGER=NINTEGER+1 CALL READ_INT_FROM_STRING(CFIELD(ID)(IBEG:IEND), + IINTEGER(NINTEGER) ) IF (IINTEGER(NINTEGER).GT.0) THEN NPOSITIVE=NPOSITIVE+1 IPOSITIVE(NPOSITIVE)=IINTEGER(NINTEGER) MACROLINE(ID:ID)='P' ELSE NNEGATIVE=NNEGATIVE +1 INEGATIVE(NNEGATIVE )=IINTEGER(NINTEGER) MACROLINE(ID:ID)='N' ENDIF ENDIF ELSE CALL IS_WORD(CFIELD(ID),LWORD) IF (LWORD) THEN NWORD=NWORD+1 CWORD(NWORD)=CFIELD(ID) MACROLINE(ID:ID)='L' ELSE NALFAMIXED=NALFAMIXED+1 CALFAMIXED(NALFAMIXED)=CFIELD(ID) MACROLINE(ID:ID)='A' ENDIF ENDIF ENDIF LCONTINUE=.FALSE. ENDIF 100 ENDDO C RETURN END C END INTERPRET_LINE C...................................................................... C...................................................................... C SUB INTTOSTR SUBROUTINE INTTOSTR(NRES,LSTR,CSTR,LDSSP) IMPLICIT NONE INTEGER NRES,LSTR(*) CHARACTER CSTR(*) LOGICAL LDSSP C internal INTEGER I CHARACTER*25 STRSTATES STRSTATES=' LTCSltcsEBAPMebapmHGIhgi' IF (LDSSP) THEN DO I=1,NRES IF (LSTR(I) .EQ. 99) THEN CSTR(I)='.' ELSE IF (LSTR(I) .EQ. 999) THEN CSTR(I)='<' ELSE CSTR(I)=STRSTATES(LSTR(I):LSTR(I)) ENDIF ENDDO ELSE DO I=1,NRES CSTR(I)='U' ENDDO ENDIF RETURN END C END INTTOSTR C...................................................................... C...................................................................... C SUB LEFTADJUST(STRING,NDIM,NLEN) SUBROUTINE LEFTADJUST(STRING,NDIM,NLEN) C...left-adjust of a string IMPLICIT NONE CHARACTER*(*) STRING INTEGER NDIM, NLEN, l,il C...find position of first non-blank IF (NDIM .LT. 1 .OR. NLEN .LT. 1) RETURN IF (NDIM .gt. 1)STOP' update routine leftadjust' L=1 DO WHILE(STRING(L:L) .EQ. ' ' .AND. L .LT. NLEN) L=L+1 ENDDO IF (L .GT. 1) THEN C..L is position of first non-blank STRING(1:NLEN-L+1)=STRING(L:NLEN) C.C..fill rest with blanks up to NLEN DO IL=NLEN-L+2,NLEN STRING(IL:IL)=' ' ENDDO ENDIF c DO I=1,NDIM c L=1 c DO WHILE(STRINGS(I)(L:L).EQ.' '.AND.L.LT.NLEN) c L=L+1 c ENDDO c IF (L.GT.1) THEN C..L is position of first non-blank c STRINGS(I)(1:NLEN-L+1)=STRINGS(I)(L:NLEN) C.C..fill rest with blanks up to NLEN c DO IL=NLEN-L+2,NLEN c STRINGS(I)(IL:IL)=' ' c ENDDO c ENDIF c ENDDO RETURN END C END LEFTADJUST C...................................................................... C...................................................................... C SUB IS_INTEGER SUBROUTINE IS_INTEGER(STRING,LINTEGER) C LINTEGER = .TRUE. if first field of STRING is an INTEGER C LINTEGER = first non-blank byte is + or - or a digit, .AND. C all subsequent byte are digits, until blank. IMPLICIT NONE C import CHARACTER*(*) STRING C export LOGICAL LINTEGER C local CHARACTER DIGITS*10, SIGNED*12 INTEGER IBEG,IEND,K SIGNED='+-0123456789' DIGITS='0123456789' LINTEGER=.TRUE. CALL STRPOS(STRING,IBEG,IEND) K=IBEG IF (INDEX(SIGNED,STRING(K:K)).EQ.0) THEN LINTEGER=.FALSE. RETURN ENDIF K=K+1 DO WHILE( K .LE. IEND) IF (INDEX(DIGITS,STRING(K:K)).EQ.0) THEN LINTEGER=.FALSE. RETURN ENDIF K=K+1 ENDDO RETURN END C END IS_INTEGER C...................................................................... C...................................................................... C SUB IS_REAL SUBROUTINE IS_REAL(STRING,LREAL) C LREAL = .TRUE. if STRING is a real number C LREAL = integer / . / integer / E or e / integer C import IMPLICIT NONE CHARACTER*(*) STRING C export LOGICAL LREAL C local CHARACTER*15 REALSYMBOL LOGICAL LINTEGER INTEGER IBEG,IEND,K,IEXP,JEXP,IPOS,IDOT C REALSYMBOL='0123456789.-+Ee' LREAL=.TRUE. C not just an integer CALL IS_INTEGER(STRING,LINTEGER) IF (LINTEGER) THEN LREAL=.FALSE. RETURN ENDIF CALL STRPOS(STRING,IBEG,IEND) DO K=IBEG,IEND IF (INDEX(REALSYMBOL,STRING(K:K)).EQ. 0) THEN LREAL=.FALSE. RETURN ENDIF ENDDO C LREAL = integer / . / integer / E or e / integer IDOT=INDEX(STRING,'.') C we want one '.' IF (IDOT .EQ. 0) THEN LREAL=.FALSE. RETURN ENDIF C the part before the '.' must be an integer IF (IDOT .NE. 1) THEN CALL IS_INTEGER(STRING(1:IDOT-1),LINTEGER) ELSE C means: .345 LINTEGER=.TRUE. ENDIF IF (LINTEGER) THEN IEXP=INDEX(STRING(IDOT+1:),'E') JEXP=INDEX(STRING(IDOT+1:),'e') C if no exponent is specified only an integer after the '.' is allowed IF (IEXP .EQ.0 .AND. JEXP .EQ. 0) THEN CALL IS_INTEGER(STRING(IDOT+1:),LINTEGER) IF (.NOT. LINTEGER) THEN LREAL=.FALSE. RETURN ENDIF C exponent must be an integer ELSE IPOS=MAX(IEXP,JEXP) CALL IS_INTEGER(STRING(IDOT+1+IPOS:),LINTEGER) IF (.NOT. LINTEGER) THEN LREAL=.FALSE. RETURN ENDIF ENDIF ENDIF RETURN END C END IS_REAL C...................................................................... C...................................................................... C SUB IS_NUMBER SUBROUTINE IS_NUMBER(STRING,LNUMBER) C LNUMBER=.TRUE. if STRING is a real or integer CHARACTER*(*) STRING LOGICAL LNUMBER, LINTEGER, LREAL CALL IS_REAL(STRING,LREAL) CALL IS_INTEGER(STRING,LINTEGER) LNUMBER= LREAL .OR. LINTEGER RETURN END C END IS_NUMBER C...................................................................... C...................................................................... C SUB IS_ALFANUMERIC SUBROUTINE IS_ALFANUMERIC(STRING,LALFANUMERIC) C LALFANUMERIC=.TRUE. if STRING is alfanumeric C LALFANUMERIC = contains only letters and digits and number C punctuation (.+-) IMPLICIT NONE C import CHARACTER*(*) STRING C export LOGICAL LALFANUMERIC C local CHARACTER ALFANUMERIC*65 INTEGER IBEG,IEND,K C init ALFANUMERIC='ABCDEFGHIJKLMNOPQRSTUVWXYZ'// + 'abcdefghijklmnopqrstuvwxyz0123456789+-.' LALFANUMERIC=.TRUE. CALL STRPOS(STRING,IBEG,IEND) K=IBEG DO WHILE(K .LT. IEND) IF ( INDEX(ALFANUMERIC,STRING(K:K)) .EQ.0 ) THEN LALFANUMERIC=.FALSE. RETURN ENDIF K=K+1 ENDDO RETURN END C END IS_ALFANUMERIC C...................................................................... C...................................................................... C SUB IS_WORD SUBROUTINE IS_WORD(STRING,LWORD) C LWORD=.TRUE. if STRING is pure alfa. C LWORD = contains only letters IMPLICIT NONE C import CHARACTER*(*) STRING C export LOGICAL LWORD C local CHARACTER ALFA*52 INTEGER IBEG,IEND,K C init ALFA='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz' LWORD=.TRUE. C CALL STRPOS(STRING,IBEG,IEND) K=IBEG DO WHILE(K .LT. IEND) IF ( INDEX(ALFA,STRING(K:K)) .EQ.0 ) THEN LWORD=.FALSE. RETURN ENDIF K=K+1 ENDDO RETURN END C END IS_WORD C...................................................................... C...................................................................... C SUB LOG_FILE SUBROUTINE LOG_FILE(KLOG,STRING,IFLAG) C iflag =0 ===> only in file C iflag =1 ===> only std-out C iflag =2 ===> both (file and std-out) C IMPLICIT NONE INTEGER KLOG,IFLAG,IBEG,IEND,ILINE,I INTEGER ICUTBEGIN(20),ICUTEND(20) CHARACTER*(*) STRING CALL STRPOS(STRING,IBEG,IEND) ILINE=1 ICUTBEGIN(ILINE)=1 ICUTEND(ILINE)=IEND DO I=1,IEND-1 IF (STRING(I:I+1).EQ.'/n') THEN ILINE=ILINE+1 ICUTBEGIN(ILINE)=I+2 ICUTEND(ILINE-1)=I-1 ICUTEND(ILINE)=IEND ENDIF ENDDO DO I=1,ILINE IF (IFLAG .EQ. 0) THEN WRITE(KLOG,'(A)')STRING(ICUTBEGIN(I):ICUTEND(I)) ELSE IF (IFLAG .EQ. 1) THEN WRITE(6,*)STRING(ICUTBEGIN(I):ICUTEND(I)) CALL FLUSH_UNIT(6) ELSE IF (IFLAG .EQ. 2) THEN WRITE(KLOG,'(A)')STRING(ICUTBEGIN(I):ICUTEND(I)) WRITE(6,*)STRING(ICUTBEGIN(I):ICUTEND(I)) CALL FLUSH_UNIT(6) ENDIF ENDDO RETURN END C END LOG_FILE C...................................................................... C...................................................................... C SUB LOWER_TO_CYS SUBROUTINE LOWER_TO_CYS(SEQ,NRES) C import CHARACTER*(*) SEQ INTEGER NRES DO I=1,NRES IF ( (SEQ(I:I) .GE. 'a') .AND. (SEQ(I:I) .LE. 'z') ) THEN SEQ(I:I)='C' ENDIF ENDDO END C END LOWER_TO_CYS C...................................................................... C...................................................................... C SUB LOWTOUP SUBROUTINE LOWTOUP(STRING,LENGTH) C LOWTOUP.......CONVERTS STRING......CHRIS SANDER JULY 1983 C changed by RS (speed up) CHARACTER*(*) STRING INTEGER LENGTH CX CHARACTER UPPER*26, LOWER*26, STRING*(*) CX DATA UPPER/'ABCDEFGHIJKLMNOPQRSTUVWXYZ'/ CX DATA LOWER/'abcdefghijklmnopqrstuvwxyz'/ C DO I=1,LENGTH IF (STRING(I:I) .GE. 'a' .AND. STRING(I:I) .LE. 'z') THEN STRING(I:I)=CHAR( ICHAR(STRING(I:I))-32 ) CX K=INDEX(LOWER,STRING(I:I)) CX IF (K.NE.0) STRING(I:I)=UPPER(K:K) ENDIF ENDDO RETURN END C END LOWTOUP C...................................................................... C...................................................................... C SUB MAKE_FORMAT_INT SUBROUTINE MAKE_FORMAT_INT(ILEN,CFORMAT) INTEGER ILEN CHARACTER*(*) CFORMAT CHARACTER*20 CTEMP,CINT CFORMAT=' ' CINT=' ' WRITE(CINT,'(I20)')ILEN CALL CONCAT_STRINGS('(I',CINT,CTEMP) CALL CONCAT_STRINGS(CTEMP,')',CFORMAT) RETURN END C END MAKE_FORMAT_INT C...................................................................... C...................................................................... C SUB MARKALI SUBROUTINE MARKALI(S1,S2,N,AGREE,C) C marks equalitites between S1 and S2 with C in string AGREE c implicit none CHARACTER*(*) S1,S2(*),AGREE(*),C CHARACTER CTEST INTEGER N,I,IAGR IF (N .EQ. 0) THEN WRITE(6,*)'*** N=0 IN MARKALI' RETURN ENDIF IAGR=0 DO I=1,N CTEST=S2(I) c convert lower case letter of sequence 2 CALL LOWTOUP(CTEST,1) IF (S1(I:I) .EQ. CTEST) THEN AGREE(I)=C IAGR=IAGR+1 ELSE AGREE(I)=' ' ENDIF ENDDO RETURN END C END MARKALI C...................................................................... C...................................................................... C SUB MSFCHECKSEQ SUBROUTINE MSFCHECKSEQ(SEQCHECK,NSEQ,MSFCHECK) C IMPORT INTEGER NSEQ INTEGER SEQCHECK(NSEQ) C INTERNAL INTEGER CHECKTMP, I C EXPORT INTEGER MSFCHECK CHECKTMP = 0 DO I = 1, NSEQ CHECKTMP = CHECKTMP + SEQCHECK(I) ENDDO MSFCHECK = MOD(CHECKTMP, 10 000) RETURN END C END MSFCHECKSEQ C...................................................................... C...................................................................... C SUB OPEN_SW_DATA_FILE SUBROUTINE OPEN_SW_DATA_FILE(KUNIT,LBINARY,IFILE,DATA,PATH,HOST) C import CHARACTER*(*) DATA,PATH,HOST INTEGER KUNIT,IFILE LOGICAL LBINARY C internal CHARACTER*100 TEMPNAME,LINE LOGICAL LERROR CALL CONCAT_INT_STRING(IFILE,DATA,LINE) CALL CONCAT_STRINGS(PATH,LINE,TEMPNAME) IF ( HOST .NE. ' ' ) THEN CALL STRPOS(HOST,IBEG,IEND) IF ( INDEX(HOST(IBEG:IEND),'unknownHost') .LE. 0 ) THEN c WRITE(6,*)'host:',host,":",tempname c host(iend+1:iend+1)=':' CALL CONCAT_STRINGS(HOST,TEMPNAME,LINE) TEMPNAME(1:)=LINE(1:) ENDIF ENDIF c WRITE(6,*)'file: ',tempname(1:60) CALL FLUSH_UNIT(6) IF (LBINARY) THEN CAUTION RECL !!!! CALL OPEN_FILE(KUNIT,TEMPNAME, + 'OLD,READONLY,UNFORMATTED,RECL=500000',lerror) ELSE CALL OPEN_FILE(KUNIT,TEMPNAME,'OLD,READONLY',LERROR) ENDIF IF (LERROR) THEN WRITE(6,*)'ERROR: open file : ',tempname CALL FLUSH_UNIT(6) STOP ENDIF RETURN END C END OPEN_SW_DATA_FILE C...................................................................... C...................................................................... C SUB PREPARE_INSERTIONS SUBROUTINE PREPARE_INSERTIONS(MAXRES,MAXALIGNS, 1 NRES,NALIGN,IFIR,ILAS,INSNUMBER,INSALI,INSLEN, 2 INSAP,MAXLEN,INSLIST_POINTER,TOTALINSLEN,ERROR) C 21.6.93 C 18.11. : AliseqEnvironment -> prepare_insertions; C ........ return pointers to sublists of single alignments in ReadHSSP C ........ arrays ( 0 if there is no sublist ) ; C ........ + the maximal length of an insertion starting at any position IMPLICIT NONE C Import INTEGER MAXRES,MAXALIGNS INTEGER NRES,NALIGN INTEGER IFIR(*), ILAS(*) INTEGER INSNUMBER,INSALI(*),INSLEN(*) INTEGER INSAP(*) C Export INTEGER*2 TOTALINSLEN(MAXRES) INTEGER*2 MAXLEN(MAXRES), INSLIST_POINTER(MAXALIGNS) LOGICAL ERROR C Internal INTEGER*2 INT2_TEMP INTEGER ALINO INTEGER IAP,IINS,TIL IF ( NALIGN .GT. MAXALIGNS ) THEN WRITE(6,'(1X,A)') 1 'MAXALIGNS overflow in prepare_insertions!' ERROR = .TRUE. RETURN ENDIF IF ( NRES .GT. MAXRES ) THEN WRITE(6,'(1X,A)') 'MAXRES overflow in prepare_insertions !' ERROR = .TRUE. RETURN ENDIF CALL INIT_INT2_ARRAY(1,NRES,MAXLEN,0) CALL INIT_INT2_ARRAY(1,NALIGN,INSLIST_POINTER,0) ALINO = INSALI(1) INSLIST_POINTER(ALINO) = 1 MAXLEN(INSAP(1)) = INSLEN(1) DO IINS = 2,INSNUMBER IF ( ALINO .NE. INSALI(IINS) ) THEN ALINO = INSALI(IINS) INSLIST_POINTER(ALINO) = IINS ENDIF C NOTE: CONVERSION FROM INT4 IN INT2 INT2_TEMP = INSLEN(IINS) MAXLEN(INSAP(IINS))= + MAX(MAXLEN(INSAP(IINS)),INT2_TEMP) ENDDO TIL = 0 DO IAP = 1,NRES IF ( MAXLEN(IAP) .GT. 0 ) TIL = TIL + MAXLEN(IAP) TOTALINSLEN(IAP) = TIL ENDDO RETURN END C END PREPARE_INSERTIONS C...................................................................... C...................................................................... C SUB PUNISHGAP SUBROUTINE PUNISHGAP(NRES,LDSSP,STRUC,GAPOPEN,PUNISH) C====================================================================== C INDELs in secondary structure segments C---------------------------------------------------------------------- C if INDELS in secondary structure are NOT allowed (if DSSP-file(s)) C set gap-open(IPOS , SEQuence 1/SEQuence 2) in secondary structure segments C to a high value. C BUT NOT for the first and last residue in a segment C LLLLLHHHHHHHHHHLLLLLLLLL C ______^^^^^^^^__________ C punish C C definition of struture class is: unknown 'U' = 0 C ' TCLStclss' = 1 C 'EBAPMebapm' = 2 C 'HGIhgiiiii' = 3 C CAUTION: IF THE ASSIGNMENT OF STRUC CLASS IS CHANGED IN STRUCCLASS C ======== CHANGE IT ALSO HERE c======================================================================= IMPLICIT NONE c import INTEGER NRES CHARACTER STRUC(*) LOGICAL LDSSP REAL PUNISH C CHANGED REAL GAPOPEN(*) C INTERNAL INTEGER I,ICLASS1,ICLASS2,ICLASS3 CHARACTER C C IF (LDSSP) THEN DO I=2,NRES-1 CALL SECSTRUC_TO_3_STATE(STRUC(I-1),C,ICLASS1) CALL SECSTRUC_TO_3_STATE(STRUC(I ),C,ICLASS2) CALL SECSTRUC_TO_3_STATE(STRUC(I+1),C,ICLASS3) IF (ICLASS1.GT.1 .AND. ICLASS2.GT.1 .AND. ICLASS3.GT.1) THEN GAPOPEN(I)=PUNISH ENDIF ENDDO ENDIF RETURN END C END PUNISHGAP C...................................................................... C...................................................................... C SUB PUNISH_GAP SUBROUTINE PUNISH_GAP(NRES,STRUC,STRSTATES,PUNISH,GAPOPEN) C====================================================================== C INDELs in secondary structure segments C---------------------------------------------------------------------- C if INDELS in secondary structure are NOT allowed (passed in strstates) C set gap-open(IPOS , SEQuence 1/SEQuence 2) in secondary str segments C to a high value. C BUT NOT for the first and last residue in a segment C LLLLLHHHHHHHHHHLLLLLLLLL C ______^^^^^^^^__________ C punish C c======================================================================= IMPLICIT NONE C IMPORT INTEGER NRES CHARACTER*(*) STRUC(*),STRSTATES REAL PUNISH C CHANGED REAL GAPOPEN(*) C INTERNAL INTEGER I,IBEG,IEND C CALL STRPOS(STRSTATES,IBEG,IEND) DO I=2,NRES-1 IF (INDEX(STRSTATES(IBEG:IEND),STRUC(I)) .NE. 0) THEN IF (STRUC(I).EQ.STRUC(I-1).AND.STRUC(I).EQ.STRUC(I+1)) THEN GAPOPEN(I)=PUNISH ENDIF ENDIF ENDDO RETURN END C END PUNISH_GAP C...................................................................... C...................................................................... C SUB PUTHEADER SUBROUTINE PUTHEADER(KPLOT,CSQ_1,CSQ_2,STRUC_1,STRUC_2, + N1,N2,NAME_1,NAME_2) IMPLICIT NONE INTEGER KPLOT,N1,N2 CHARACTER*(*) NAME_1,NAME_2 CHARACTER*(*) CSQ_1,CSQ_2 CHARACTER*1 STRUC_1(*),STRUC_2(*) C internal INTEGER LINELEN,I,J,ISTOP,M CHARACTER*200 CTEMP C init CTEMP=' ' LINELEN=LEN(CTEMP)-1 write(kplot,*) '/number of residues in protein 1:' write(kplot,'(i10)')n1 write(kplot,*) '/number of residues in protein 2:' write(kplot,'(i10)')n2 write(kplot,*) ' ' write(kplot,*) '/file name for protein 1:' write(kplot,*)name_1 write(kplot,*)'/file name for protein 2:' write(kplot,*)name_2 write(kplot,*)' ' write(kplot,*)'/SEQUENCE 1:' DO I=1,N1,LINELEN J=1 ISTOP=I+LINELEN IF (ISTOP.GT.N1)ISTOP=N1 DO M=I,ISTOP WRITE(CTEMP(J:J),'(A)')CSQ_1(M:M) J=J+1 ENDDO WRITE(KPLOT,'(A)')CTEMP(:J-1) ENDDO WRITE(KPLOT,*)' ' WRITE(KPLOT,*)'/SEQUENCE 2:' DO I=1,N2,LINELEN J=1 ISTOP=I+LINELEN IF (ISTOP.GT.N2)ISTOP=N2 DO M=I,ISTOP WRITE(CTEMP(J:J),'(A)')CSQ_2(M:M) J=J+1 ENDDO WRITE(KPLOT,'(A)')CTEMP(:J-1) ENDDO WRITE(KPLOT,*)' ' WRITE(KPLOT,*) '/SECSTRUC 1:' DO I=1,N1,LINELEN J=1 ISTOP=I+LINELEN IF (ISTOP.GT.N1)ISTOP=N1 DO M=I,ISTOP WRITE(CTEMP(J:J),'(A)')STRUC_1(M) J=J+1 ENDDO WRITE(KPLOT,'(A)')CTEMP(:J-1) ENDDO WRITE(KPLOT,*)' ' WRITE(KPLOT,*) '/SECSTRUC 2:' DO I=1,N2,LINELEN J=1 ISTOP=I+LINELEN IF (ISTOP.GT.N2)ISTOP=N2 DO M=I,ISTOP WRITE(CTEMP(J:J),'(A)')STRUC_2(M) J=J+1 ENDDO WRITE(KPLOT,'(A)')CTEMP(:J-1) ENDDO WRITE(KPLOT,*) ' ' RETURN END C END PUTHEADER C...................................................................... C...................................................................... C SUB READ_BRK SUBROUTINE READ_BRK(KIN,INFILE,CHAINS,CTRANS,RLEN,NRES, 1 COMPND,SEQ,PDBNO,TRUNCATED,ERROR) CAUTION ctrans2 and seq3 are character strings here but C arrays in s1tos3 and s3tos1 ======>> BUG c RS dec. 94 C 14.5.93 CHEADER OXIDOREDUCTASE (SUPEROXIDE ACCEPTOR) 25-MAR-80 2SOD 2SOD CCOMPND CU,ZN SUPEROXIDE DISMUTASE (E.C.1.15.1.1) 2SOD C .. CATOM 4 N ALA O 1 -20.479 24.715 -21.334 1.00 16.16 2SOD CATOM 5 CA ALA O 1 -19.117 24.539 -21.395 1.00 15.65 2SOD C1..4......11.14..1820....26....32....3840....4648....54 IMPLICIT NONE C IMPORT INTEGER KIN,RLEN INTEGER PDBNO(*) CHARACTER*(*) CHAINS, CTRANS, INFILE C EXPORT CHARACTER*(*) COMPND,SEQ LOGICAL TRUNCATED,ERROR C INTERNAL INTEGER MAXRES_LOC,NTRANS_LOC,LINELEN PARAMETER (MAXRES_LOC= 9999) C PARAMETER (MAXRES_LOC= 10000) C PARAMETER (MAXRES_LOC= 19876) PARAMETER (NTRANS_LOC= 25) PARAMETER (LINELEN= 1000) INTEGER ICHAIN, JCHAIN, ISTART, ISTOP, IPOS, + NRES, N, NREAD, NTRANS CHARACTER*1 C, CHAIN CHARACTER*(3*MAXRES_LOC) SEQ3 CHARACTER*10 NUMBERS CHARACTER*(LINELEN) LINE CHARACTER*(3*NTRANS_LOC) CTRANS3 *----------------------------------------------------------------------* WRITE(6,*)' STOP UPDATE READ_BRK' C$$$ ERROR = .FALSE. c$$$ c$$$C try to open outfile; return if unsuccessful c$$$ call open_file(kin,infile,'old,readonly',error) c$$$C error messages are alredy issued by OPEN_FILE c$$$ if ( error ) return c$$$ c$$$ if ( linelen .lt. rlen ) then c$$$ WRITE(6,'(1x,a)') c$$$ 1 ' *** record length of input file too big ***' c$$$ goto 1 c$$$ endif c$$$ c$$$ error = .false. c$$$ numbers = '0123456789' c$$$ call strpos(ctrans,istart,istop) c$$$ ntrans = istop-istart+1 c$$$ call s1tos3(ctrans3,ctrans,ntrans) c$$$ read(kin,'(a)',err=1,end=2) line c$$$ compnd = line(7:) c$$$ c$$$ nres = 0 c$$$ ichain = 1 c$$$ jchain = 1 c$$$ seq3 = ' ' c$$$ call strpos(chains,istart,istop) c$$$ call gettoken(chains,len(chains),1,ipos,chain) c$$$ do while ( ipos .le. istop ) c$$$ do while ( line(1:4) .ne. 'ATOM' ) c$$$ read(kin,'(a)',err=1,end=2) line c$$$ enddo c$$$ c = line(22:22) c$$$ if ( index(numbers,chain ) .ne. 0 ) then c$$$ read(chain,'(i1)') n c$$$ if ( n .eq. ichain ) then c$$$ call read_brkchain(kin,nres,ctrans3,rlen,line,seq3, c$$$ 1 pdbno,nread,truncated,error) c$$$ nres = nres + nread c$$$ ichain = ichain + 1 c$$$ jchain = jchain + 1 c$$$ read(kin,'(a)',err=1,end=2) line c$$$ else c$$$ call skip_brkchain(kin,rlen,line,error) c$$$ ichain = ichain + 1 c$$$ read(kin,'(a)',err=1,end=2) line c$$$ endif c$$$ else c$$$ if ( c .eq. chain ) then c$$$ call read_brkchain(kin,nres,ctrans3,rlen,line,seq3, c$$$ 1 pdbno,nread,truncated,error) c$$$ nres = nres + nread c$$$ ichain = ichain + 1 c$$$ jchain = jchain + 1 c$$$ read(kin,'(a)',err=1,end=2) line c$$$ else c$$$ call skip_brkchain(kin,rlen,line,error) c$$$ ichain = ichain + 1 c$$$ read(kin,'(a)',err=1,end=2) line c$$$ endif c$$$ endif c$$$ call strpos(chains,istart,istop) c$$$ call gettoken(chains,len(chains),jchain,ipos,chain) c$$$ enddo c$$$ c$$$ goto 2 c$$$ 1 error = .true. c$$$ WRITE(6,'(a)') ' ** error reading BRK file **' c$$$ 2 continue c$$$ call s3tos1(seq3,seq,nres) c$$$ c$$$ close(kin) RETURN END C END READ_BRK C...................................................................... C...................................................................... C SUB READ_BRKCHAIN SUBROUTINE READ_BRKCHAIN(KIN,SEQPOS,CTRANS,RLEN,FIRSTLINE,SEQ, 1 PDBNO,NREAD,TRUNCATED,ERROR) C 15.5.93 CATOM 4 N ALA O 1 -20.479 24.715 -21.334 1.00 16.16 2SOD 232 CATOM 5 CA ALA O 1 -19.117 24.539 -21.395 1.00 15.65 2SOD 233 C SPECIAL CASES : CATOM 404 CA AASP 50 7.731 6.227 13.395 0.67 10.85 6PTI C1..4......11.14..1820....26....32....3840....4648....54 IMPLICIT NONE C Import INTEGER KIN,RLEN C .. the read pointer of kin is expected to point to the next line C .. TO BE INTERPRETED INTEGER SEQPOS C .. may be alread partially filled. last occupied position is "seqpos" INTEGER PDBNO(*) CHARACTER*(*) SEQ CHARACTER*(*) CTRANS, FIRSTLINE C EXPORT INTEGER NREAD LOGICAL TRUNCATED,ERROR C .. and "seq", with "nread" more symbols; "pdbno" C with "nread" more entries C Internal INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER IPOS CHARACTER*3 C3 CHARACTER*4 CNUMBER CHARACTER*(LINELEN) LINE IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. NREAD = 0 CNUMBER = ' ' IPOS = SEQPOS LINE = FIRSTLINE DO WHILE ( LINE(1:3) .NE. 'TER' .AND. 1 .NOT. TRUNCATED ) IF ( LINE(23:26) .NE. CNUMBER ) THEN CNUMBER = LINE(23:26) C3 = LINE(18:20) IF ( INDEX(CTRANS,C3) .NE. 0 ) THEN TRUNCATED = ( SEQPOS+NREAD+1 .GT. LEN(SEQ)/3 ) IF ( .NOT. TRUNCATED ) THEN IPOS = 3*(SEQPOS+NREAD) SEQ(IPOS+1:IPOS+3) = C3 NREAD = NREAD + 1 READ (CNUMBER,'(I4)') PDBNO(SEQPOS+NREAD) ENDIF ENDIF ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') ' ** error reading BRK file **' 2 CONTINUE RETURN END C END READ_BRKCHAIN C...................................................................... C...................................................................... C SUB READ_DSSPCHAIN SUBROUTINE READ_DSSPCHAIN(KIN,SEQPOS,CTRANS,RLEN,FIRSTLINE,SEQ, 1 STRUC,ACC,PDBNO,NREAD,LACCZERO,TRUNCATED,ERROR) C 18.5.93 C 1 1 O A 0 0 81 0, 0.0 149,-0.2 0, 0.0 104,-0.1 0.000 360.0 360.0 360.0 164.6 -19.1 24.5 -21.4 IMPLICIT NONE C IMPORT INTEGER KIN, SEQPOS, RLEN CHARACTER*(*) CTRANS, FIRSTLINE C EXPORT INTEGER NREAD,PDBNO(*), ACC(*) CHARACTER*(*) SEQ, STRUC LOGICAL LACCZERO,TRUNCATED,ERROR C INTERNAL INTEGER NASCII,LINELEN PARAMETER (NASCII= 256) PARAMETER (LINELEN= 1000) INTEGER LOWERPOS(NASCII),I CHARACTER*1 C CHARACTER*26 LOWER CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. C USED TO CONVERT LOWER CASE CHARACTERS FROM THE DSSP-SEQ TO 'C' (CYS) LOWER='abcdefghijklmnopqrstuvwxyz' CALL GETPOS(LOWER,LOWERPOS,NASCII) NREAD = 0 LINE = FIRSTLINE DO WHILE ( LINE(14:14) .NE. '!' .AND. 1 .NOT. TRUNCATED ) C = LINE(14:14) CALL GETINDEX(C,LOWERPOS,I) IF ( I.NE.0 ) C = 'C' IF ( INDEX(CTRANS,C) .NE. 0 ) THEN TRUNCATED = ( SEQPOS+NREAD+1 .GT. LEN(SEQ) ) IF ( .NOT. TRUNCATED ) THEN NREAD = NREAD + 1 SEQ(SEQPOS+NREAD:SEQPOS+NREAD) = C STRUC(SEQPOS+NREAD:SEQPOS+NREAD) = LINE(17:17) READ(LINE(6:10),'(I5)') PDBNO(SEQPOS+NREAD) READ(LINE(35:38),'(I4)') ACC(SEQPOS+NREAD) LACCZERO = LACCZERO .AND. (ACC(SEQPOS+NREAD) .EQ. 0) ENDIF ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO NREAD = NREAD + 1 SEQ(SEQPOS+NREAD:SEQPOS+NREAD) = '!' STRUC(SEQPOS+NREAD:SEQPOS+NREAD) = ' ' GOTO 2 1 ERROR = .TRUE. WRITE(6,'(A)') ' ** ERROR READING DSSP FILE **' 2 CONTINUE RETURN END C END READ_DSSPCHAIN C...................................................................... C...................................................................... C SUB READ_EMBL SUBROUTINE READ_EMBL(KIN,INFILE,CTRANS,RLEN,NRES,COMPND, 1 ACCESSION,PDB,SEQ,TRUNCATED,ERROR) C 14.5.93 CDE test.pep from: 1 to: 13 CDE test.pep CSQ SEQUEN C AAAAAAAAAA AAA C// IMPLICIT NONE C IMPORT INTEGER KIN,RLEN CHARACTER*(*) CTRANS,INFILE C EXPORT INTEGER NRES CHARACTER*(*) COMPND,ACCESSION,PDB, SEQ LOGICAL TRUNCATED,ERROR C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER I,ID, ISTART, ISTOP, JSTART, JSTOP, IPOS CHARACTER*1 C CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* ERROR = .FALSE. JSTOP=0 C try to open outfile; return if unsuccessful CALL OPEN_FILE(KIN,INFILE,'old,readonly',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. ID = 0 PDB = ' ' READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( LINE(1:2) .NE. 'SQ' ) CALL STRPOS(LINE,ISTART,ISTOP) IF ( INDEX(LINE(1:2), 'AC') .NE. 0 ) THEN I = INDEX(LINE,';') - 1 ACCESSION = LINE(6:I) ELSE IF ( INDEX(LINE(1:2), 'DE') .NE. 0 ) THEN COMPND = LINE(6:200) ELSE IF ( INDEX(LINE(1:9), 'DR PDB;') .NE. 0 ) THEN C .PDB-DATABASE POINTER CALL STRPOS(PDB,ISTART,ISTOP) CALL STRPOS(LINE,JSTART,JSTOP) IF (LINE(JSTOP:JSTOP) .EQ. '.')JSTOP=JSTOP-1 C I = LEN(PDB) IF ( ISTOP+JSTOP-10 .LE. LEN(PDB)) THEN IF ( ID .LE. 0 ) THEN PDB(ISTOP+1:) = LINE(11:JSTOP) ELSE PDB(ISTOP+1:) = '|' // LINE(11:JSTOP) ENDIF ID = ID + 1 ELSE WRITE(6,*)'**** PDBREF-LINE DIMENSION OVERFLOW ***' ENDIF ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO CALL STRPOS(PDB,ISTART,ISTOP) IF ( ID .GT. 0 ) THEN IF ( ISTOP+7 .LE. LEN(PDB) ) THEN WRITE(PDB(ISTOP+1:),'(A,I4)') '||', ID ELSE WRITE(6,*)'**** PDBREF-LINE DIMENSION OVERFLOW ***' ENDIF ENDIF NRES = 0 READ(KIN,'(A)',ERR=1,END=2) LINE C SEQUENCE DO WHILE ( INDEX(LINE(1:2),'//') .EQ. 0 .AND. 1 .NOT. TRUNCATED ) CALL STRPOS(LINE,ISTART,ISTOP) DO IPOS = ISTART,ISTOP C = LINE(IPOS:IPOS) CALL LOWTOUP(C,1) IF ( INDEX(CTRANS,C) .NE. 0 ) THEN TRUNCATED = ( NRES+1 .GT. LEN(SEQ) ) IF ( .NOT. TRUNCATED ) THEN NRES = NRES + 1 SEQ(NRES:NRES) = C ENDIF ENDIF ENDDO READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') ' ** error reading EMBL/SWISSPROT file **' 2 CONTINUE CLOSE(KIN) RETURN END C END READ_EMBL C...................................................................... C...................................................................... C SUB READ_FASTA SUBROUTINE READ_FASTA(KIN,INFILE,CTRANS,RLEN,NRES, 1 ACCESSION,COMPND,SEQ,TRUNCATED,ERROR) C 11.4.96 C>test blablabla C A A A A A A A A A A A A A IMPLICIT NONE C IMPORT INTEGER KIN, RLEN CHARACTER*(*) CTRANS,INFILE C EXPORT INTEGER NRES CHARACTER*(*) ACCESSION,COMPND, SEQ LOGICAL TRUNCATED,ERROR C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER IPOS, ISTART, ISTOP CHARACTER*1 C CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* ERROR = .FALSE. ISTOP=0 C TRY TO OPEN OUTFILE; RETURN IF UNSUCCESSFUL CALL OPEN_FILE(KIN,INFILE,'old,readonly',error) C error messages are already issued by OPEN_FILE IF ( ERROR ) RETURN IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( LINE(1:1) .NE. '>' ) READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO CALL STRPOS(LINE,ISTART,ISTOP) ISTART=INDEX(LINE,' ') IF (ISTART .GT. 2 .AND. ISTART .LT. ISTOP) THEN ACCESSION(1:LEN(ACCESSION))=LINE(2:ISTART-1) COMPND = LINE(ISTART+1:ISTOP) ELSE ACCESSION(1:LEN(ACCESSION))=LINE(2:) COMPND=ACCESSION ENDIF NRES = 0 READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( .NOT. TRUNCATED ) CALL STRPOS(LINE,ISTART,ISTOP) IF ( ISTOP .NE. 0 ) THEN DO IPOS = ISTART,ISTOP C = LINE(IPOS:IPOS) CALL LOWTOUP(C,1) IF ( INDEX(CTRANS,C) .NE. 0 ) THEN TRUNCATED = ( NRES+1 .GT. LEN(SEQ) ) IF ( .NOT. TRUNCATED ) THEN NRES = NRES + 1 SEQ(NRES:NRES) = C ENDIF ELSE IF (C .EQ. '*') THEN GOTO 2 ENDIF ENDDO ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') ' ** error reading FASTA file **' 2 CONTINUE CLOSE(KIN) RETURN END C END READ_FASTA C...................................................................... C...................................................................... C SUB READ_GCG SUBROUTINE READ_GCG(KIN,INFILE,CTRANS,RLEN,NRES,COMPND, 1 SEQ,TRUNCATED,ERROR) C 14.5.93 C C Test.Pep Length: 13 May 10, 1993 10:48 Type: N Check: 5915 .. C C 1 AAAAAAAAAA AAA C IMPLICIT NONE C IMPORT INTEGER KIN,RLEN CHARACTER*(*) CTRANS,INFILE C EXPORT INTEGER NRES CHARACTER*(*) COMPND, SEQ LOGICAL TRUNCATED,ERROR C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER IPOS,JPOS, ISTART,JSTART,JSTOP, ISTOP CHARACTER*1 C CHARACTER*10 CTOKEN CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KIN,INFILE,'old,readonly',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. C data start after a line ending with '..' READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( INDEX(LINE,'..') .EQ. 0 ) READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO C first word of '..' line CALL GETTOKEN(LINE,LINELEN,1,IPOS,COMPND) C sequence part NRES = 0 READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( .TRUE. ) CALL STRPOS(LINE,ISTART,ISTOP) IF ( ISTOP .GT. 0 ) THEN C .. FIRST WORD IS A NUMBER CALL GETTOKEN(LINE,LINELEN,1,JPOS,CTOKEN) CALL STRPOS(CTOKEN,JSTART,JSTOP) DO IPOS = JPOS+JSTOP-JSTART+1, ISTOP C = LINE(IPOS:IPOS) CALL LOWTOUP(C,1) IF ( INDEX(CTRANS,C) .NE. 0 ) THEN TRUNCATED = ( NRES+1 .GT. LEN(SEQ) ) IF ( .NOT. TRUNCATED ) THEN NRES = NRES + 1 SEQ(NRES:NRES) = C ENDIF ENDIF ENDDO ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO 1 ERROR = .TRUE. WRITE(6,'(a)') ' ** error reading GCG file **' 2 CONTINUE CLOSE(KIN) RETURN END C END READ_GCG C...................................................................... C...................................................................... C SUB READ_HSSPCHAIN SUBROUTINE READ_HSSPCHAIN(KIN,SEQPOS,CTRANS,RLEN,FIRSTLINE,SEQ, 1 STRUC,ACC,PDBNO,NREAD,LACCZERO,TRUNCATED,ERROR) C 18.5.93 C 1 1 O A 0 0 81 11 13 AAAAAAAA S A IMPLICIT NONE C IMPORT INTEGER KIN,RLEN CHARACTER*(*) CTRANS, FIRSTLINE C EXPORT INTEGER NREAD, SEQPOS INTEGER PDBNO(*), ACC(*) CHARACTER*(*) SEQ, STRUC LOGICAL LACCZERO,TRUNCATED,ERROR C INTERNAL INTEGER NASCII,LINELEN PARAMETER (NASCII= 256) PARAMETER (LINELEN= 1000) INTEGER LOWERPOS(NASCII) INTEGER I CHARACTER*1 C CHARACTER*26 LOWER CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. C used to convert lower case characters from the DSSP-seq to 'C' (Cys) LOWER='abcdefghijklmnopqrstuvwxyz' CALL GETPOS(LOWER,LOWERPOS,NASCII) NREAD = 0 LINE = FIRSTLINE DO WHILE ( LINE(15:15) .NE. '!' .AND. 1 LINE(1:2) .NE. '##' .AND. 2 .NOT. TRUNCATED ) C = LINE(15:15) CALL GETINDEX(C,LOWERPOS,I) IF ( I.NE.0 ) C = 'C' IF ( INDEX(CTRANS,C) .NE. 0 ) THEN TRUNCATED = ( SEQPOS+NREAD+1 .GT. LEN(SEQ) ) IF ( .NOT. TRUNCATED ) THEN NREAD = NREAD + 1 SEQ(SEQPOS+NREAD:SEQPOS+NREAD) = C STRUC(SEQPOS+NREAD:SEQPOS+NREAD) = LINE(18:18) READ(LINE(7:11),'(I5)') PDBNO(SEQPOS+NREAD) READ(LINE(36:39),'(I4)') ACC(SEQPOS+NREAD) LACCZERO = LACCZERO .AND. (ACC(SEQPOS+NREAD) .EQ. 0 ) ENDIF ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO NREAD = NREAD + 1 SEQ(SEQPOS+NREAD:SEQPOS+NREAD) = '!' STRUC(SEQPOS+NREAD:SEQPOS+NREAD) = ' ' GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') '*** ERROR READ_HSSPCHAIN reading HSSP file' 2 CONTINUE RETURN END C END READ_HSSPCHAIN C...................................................................... C...................................................................... C SUB READ_INT_FROM_STRING SUBROUTINE READ_INT_FROM_STRING(CSTRING,INUMBER) C import CHARACTER*(*) CSTRING C export INTEGER INUMBER C internal CHARACTER*100 CFORMAT,CTEMP CHARACTER*12 CNUMBER CNUMBER='-=0123456789' CFORMAT=' ' INUMBER=0 CALL STRPOS(CSTRING,ISTART,ISTOP) ITOTAL=ISTOP-ISTART+1 C PRINT *, "ITOTAL=",ITOTAL," ISTOP=",ISTOP," ISTART=",ISTART DO I=ISTART,ISTOP J=INDEX(CNUMBER,CSTRING(I:I)) IF ( J .LE. 0) THEN ITOTAL=I-ISTART WRITE(6,*)' *** NOT AN INTEGER:',CSTRING(ISTART:ISTOP) ENDIF ENDDO CALL CONCAT_STRING_INT('(I',ITOTAL,CTEMP) CALL CONCAT_STRINGS(CTEMP,')',CFORMAT) READ(CSTRING(ISTART:ISTOP),CFORMAT)INUMBER RETURN END C END READ_INT_FROM_STRING C...................................................................... C...................................................................... C SUB READ_MSF SUBROUTINE READ_MSF(KUNIT,FILENAME,MAXALIGNS,MAXCORE, 1 ALISEQ,ALIPOINTER,IFIR,ILAS,JFIR,JLAS,TYPE, 2 SEQNAMES,WEIGHT,SEQCHECK,MSFCHECK,NRES_ALI,NALIGN, 3 ERROR) C Implicit None C IMPORT INTEGER MAXALIGNS, MAXCORE INTEGER KUNIT CHARACTER*(*) FILENAME C EXPORT INTEGER NALIGN INTEGER ALIPOINTER(MAXALIGNS) INTEGER NRES_ALI INTEGER MSFCHECK INTEGER IFIR(MAXALIGNS), ILAS(MAXALIGNS) INTEGER JFIR(MAXALIGNS), JLAS(MAXALIGNS) C 'P' = PROTEIN SEQUENCES, 'N' = NUCLEOTIDE SEQ CHARACTER*1 TYPE CHARACTER*(*) SEQNAMES(MAXALIGNS) CHARACTER ALISEQ(MAXCORE) REAL WEIGHT(MAXALIGNS) INTEGER SEQCHECK(MAXALIGNS) LOGICAL ERROR C INTERNAL INTEGER CODELEN_LOC,MAXALIGNS_LOC, MAXRES_LOC,LINELEN PARAMETER (CODELEN_LOC= 14) PARAMETER (MAXALIGNS_LOC= 8765) C PARAMETER (MAXALIGNS_LOC= 12345) C PARAMETER (MAXALIGNS_LOC= 21987) PARAMETER (MAXRES_LOC= 9999) C PARAMETER (MAXRES_LOC= 10000) C PARAMETER (MAXRES_LOC= 19876) PARAMETER (LINELEN= 200) INTEGER TESTCHECK,I,IPOS,ISEQ,NPROT_THIS,ISTART,ISTOP, + IBEG,ITMP,ILEN,DIFF,CFREE,FPOS INTEGER LASTOCCUPIED(MAXALIGNS_LOC),NRES(MAXALIGNS_LOC), + NSEQLINES(MAXALIGNS_LOC) CHARACTER CGAPCHAR CHARACTER*200 ERRORMESSAGE,CTOKEN,CTOKEN_ORIGINAL C---- br 99.03: watch when changing this: hard_coded in GETARRAYINDEX CHARACTER*200 SEQNAMES_UPPER(MAXALIGNS_LOC) + CHARACTER*(CODELEN_LOC) CNAME CHARACTER*(LINELEN) LINE, TMPSTRING,TMPSEQ CHARACTER*(MAXRES_LOC) STRAND CHARACTER*20 CFORMAT LOGICAL INSIDE(MAXALIGNS_LOC) LOGICAL INGAP(MAXALIGNS_LOC) LOGICAL NO_ENDGAPS LOGICAL LCHECK, LTYPE, LNRES_ALI LOGICAL NEXT_IS_NRES_ALI, NEXT_IS_CHECK, NEXT_IS_TYPE LOGICAL NEXT_IS_NAME, NEXT_IS_LEN, NEXT_IS_SEQCHECK LOGICAL NEXT_IS_WEIGHT *----------------------------------------------------------------------* C REFORMAT of: *.Frag C C Nfi.Msf MSF: 594 Type: P February 17, 1992 14:37 Check: 1709 .. C C Name: Cnfi02 Len: 594 Check: 7754 Weight: 1.00 C Name: Cnfi03 Len: 594 Check: 4932 Weight: 1.00 C C// C C 1 50 CCnfi02 MMYSPICLTQ DEFHPFIEAL LPHVRAIAYT WFNLQARKRK YFKKHEKRMS CCnfi03 MMYSPICLTQ DEFHPFIEAL LPHVRAIAYT WFNLQARKRK YFKKHEKRMS CGAPCHAR = '.' ERROR = .FALSE. CALL STRPOS(FILENAME,ISTART,ISTOP) ERRORMESSAGE = ' open error for file: ' // 1 FILENAME(MAX(ISTART,1):MAX(1,ISTOP)) CALL OPEN_FILE(KUNIT,FILENAME,'old,readonly',error) IF ( ERROR ) GOTO 99 C READ MSF IDENTIFICATION LINE C Nfi.Msf MSF: 594 Type: P February 17, 1992 14:37 Check: 1709 .. ERROR = .TRUE. ERRORMESSAGE = ' MSF identification line missing !! ' READ(KUNIT,'(A)',END = 99) LINE DO WHILE ( INDEX(LINE,'MSF: ') .EQ. 0 ) READ(KUNIT,'(A)',END = 99) LINE ENDDO LNRES_ALI = .FALSE. LCHECK = .FALSE. LTYPE = .FALSE. NEXT_IS_NRES_ALI = .FALSE. NEXT_IS_CHECK = .FALSE. NEXT_IS_TYPE = .FALSE. C DUMMY VALUE FOR "POSITION OF START OF NEXT WORD" FPOS = -1 C ITH WORD I = 1 CALL GETTOKEN(LINE,LINELEN,I,FPOS,CTOKEN) DO WHILE ( FPOS .NE. 0 ) CALL STRPOS(CTOKEN,ISTART,ISTOP) CALL LOWTOUP(CTOKEN, LEN(CTOKEN)) IF ( NEXT_IS_NRES_ALI ) THEN NEXT_IS_NRES_ALI = .FALSE. CALL MAKE_FORMAT_INT(ISTOP-ISTART+1,CFORMAT) READ(CTOKEN(ISTART:ISTOP),CFORMAT) NRES_ALI ELSE IF ( NEXT_IS_TYPE ) THEN TYPE = CTOKEN(ISTART:ISTOP) NEXT_IS_TYPE = .FALSE. ELSE IF ( NEXT_IS_CHECK ) THEN CALL MAKE_FORMAT_INT(ISTOP-ISTART+1,CFORMAT) READ(CTOKEN(ISTART:ISTOP),CFORMAT) MSFCHECK NEXT_IS_CHECK = .FALSE. ENDIF IF ( CTOKEN(ISTART:ISTOP) .EQ. 'MSF:' ) THEN LNRES_ALI = .TRUE. NEXT_IS_NRES_ALI = .TRUE. ELSE IF ( CTOKEN(ISTART:ISTOP) .EQ. 'TYPE:' ) THEN LTYPE = .TRUE. NEXT_IS_TYPE = .TRUE. ELSE IF ( CTOKEN(ISTART:ISTOP) .EQ. 'CHECK:' ) THEN LCHECK = .TRUE. NEXT_IS_CHECK = .TRUE. ENDIF I = I + 1 CALL GETTOKEN(LINE,LINELEN,I,FPOS,CTOKEN) ENDDO IF ( .NOT. LNRES_ALI ) THEN ERROR = .TRUE. ERRORMESSAGE = ' MSF identification line missing !! ' GOTO 99 ENDIF IF ( .NOT. LTYPE ) THEN ERROR = .TRUE. ERRORMESSAGE = ' Type identifier missing !! ' GOTO 99 ENDIF IF ( .NOT. LCHECK ) THEN ERROR = .TRUE. ERRORMESSAGE = ' CHECKSUM MISSING !! ' GOTO 99 ENDIF C READ SEQUENCE DESCRIPTION SECTION READ(KUNIT,'(A)',END = 99) LINE C Name: Cnfi02 Len: 594 Check: 7754 Weight: 1.00 ERROR = .TRUE. ERRORMESSAGE = ' Sequence description section missing !! ' DO WHILE ( INDEX(LINE,'Name: ') .EQ. 0 ) READ(KUNIT,'(A)',END = 99) LINE ENDDO ERROR = .FALSE. ERRORMESSAGE = ' Alignment missing !! ' C>>> NALIGN = 0 DO WHILE ( INDEX(LINE,'Name: ') .NE. 0 ) NALIGN = NALIGN + 1 IF ( NALIGN .GT. MAXALIGNS .OR. 1 NALIGN .GT. MAXALIGNS_LOC) THEN ERROR = .TRUE. ERRORMESSAGE = ' MAXALIGNS overflow in read_msf !' GOTO 99 ENDIF NEXT_IS_NAME = .FALSE. NEXT_IS_LEN = .FALSE. NEXT_IS_SEQCHECK = .FALSE. NEXT_IS_WEIGHT = .FALSE. C DUMMY VALUE FOR "POSITION OF START OF NEXT WORD" FPOS = -1 C ith word I = 1 CALL GETTOKEN(LINE,LINELEN,I,FPOS,CTOKEN_ORIGINAL) CTOKEN=CTOKEN_ORIGINAL DO WHILE ( FPOS .NE. 0 ) CALL STRPOS(CTOKEN_ORIGINAL,ISTART,ISTOP) CALL LOWTOUP(CTOKEN,LEN(CTOKEN)) IF ( NEXT_IS_NAME ) THEN NEXT_IS_NAME = .FALSE. SEQNAMES_UPPER(NALIGN)= CTOKEN(ISTART:ISTOP) SEQNAMES(NALIGN)= CTOKEN_ORIGINAL(ISTART:ISTOP) C WRITE(6,*)'CTOKEN(ISTART:ISTOP)>', CTOKEN(ISTART:ISTOP), C + '.',ISTART,':',ISTOP,'', C + CTOKEN_ORIGINAL(ISTART:ISTOP), '<' ELSE IF ( NEXT_IS_LEN ) THEN NEXT_IS_LEN = .FALSE. CALL MAKE_FORMAT_INT(ISTOP-ISTART+1,CFORMAT) READ(CTOKEN(ISTART:ISTOP),CFORMAT) ILEN NRES_ALI = MAX(NRES_ALI,ILEN) ELSE IF ( NEXT_IS_SEQCHECK ) THEN NEXT_IS_SEQCHECK = .FALSE. CALL MAKE_FORMAT_INT(ISTOP-ISTART+1,CFORMAT) READ(CTOKEN(ISTART:ISTOP),CFORMAT) SEQCHECK(NALIGN) ELSE IF ( NEXT_IS_WEIGHT ) THEN NEXT_IS_WEIGHT = .FALSE. READ(CTOKEN(ISTART:ISTOP),*) WEIGHT(NALIGN) ENDIF IF ( CTOKEN(ISTART:ISTOP) .EQ. 'NAME:' ) THEN NEXT_IS_NAME = .TRUE. ELSE IF ( CTOKEN(ISTART:ISTOP) .EQ. 'LEN:' ) THEN NEXT_IS_LEN = .TRUE. ELSE IF ( CTOKEN(ISTART:ISTOP) .EQ. 'CHECK:' ) THEN NEXT_IS_SEQCHECK = .TRUE. ELSE IF ( CTOKEN(ISTART:ISTOP) .EQ. 'WEIGHT:' ) THEN NEXT_IS_WEIGHT = .TRUE. ENDIF I = I + 1 CALL GETTOKEN(LINE,LINELEN,I,FPOS,CTOKEN_ORIGINAL) CTOKEN=CTOKEN_ORIGINAL ENDDO READ(KUNIT,'(A)',END = 99) LINE ENDDO ERROR = .FALSE. CALL MSFCHECKSEQ(SEQCHECK,NALIGN,TESTCHECK) IF ( TESTCHECK .NE. MSFCHECK ) THEN C ERROR = .TRUE. ERRORMESSAGE = 1 ' Total checksum incompatible with single checksums !!' WRITE(6,'(A)') ERRORMESSAGE c goto 99 ENDIF C SEARCH FOR "//" DIVIDER ERROR = .TRUE. ERRORMESSAGE = ' No proper MSFfile: divider missing !! ' DO WHILE ( INDEX(LINE,'//' ) .EQ. 0 ) READ(KUNIT,'(A)',END=99) LINE ENDDO ERROR = .FALSE. C READ MULTIPLE ALIGNMENT C 1 50 CCnfi02 MMYSPICLTQ DEFHPFIEAL LPHVRAIAYT WFNLQARKRK YFKKHEKRMS C initialize DO ISEQ = 1, NALIGN NSEQLINES(ISEQ)= 0 NRES(ISEQ)= 0 LASTOCCUPIED(ISEQ)= 0 INSIDE(ISEQ)= .FALSE. C TEMPORARY assignment! IF ( ISEQ .EQ. 1 ) THEN ALIPOINTER(ISEQ) = 1 ELSE ALIPOINTER(ISEQ) = ALIPOINTER(ISEQ-1)+NRES_ALI+1 ENDIF JFIR(ISEQ) = 1 JLAS(ISEQ) = 0 ENDDO ERROR = .TRUE. ERRORMESSAGE = ' ALIGNMENT MISSING !! ' C---- first line of alignment READ(KUNIT,'(A)',END=99) LINE C---- -------------------------------------------------- C---- now loop over all blocks C---- end if overflow of some array, or file read C---- LINELEN= maximal length of a line read C---- C---- C---- C---- -------------------------------------------------- ERROR = .FALSE. DO WHILE ( .TRUE. ) C------- get the first non-blank string in the line read (CNAME) C------- note: this is the protein name CALL GETTOKEN(LINE,LINELEN,1,FPOS,CNAME) C------- lkajan: The below LOWTOUP is not a lucky choice: C 3h4p_A .........I TVFSPEGRLY QVEYAREAVR RGTTAIGIAC KDGVVLAVDR C cf. C 3h4p_a .......... .......... .......... ..TTTVGLIC DDAVILATDK C not /quite/ the same protein... we need to keep the case so we can C distinguish between these. C lkajan, see above CALL LOWTOUP(CNAME, LEN(CNAME) ) C------- get the number of the protein with that name (CNAME) C------- out: NPROT_THIS=number of protein with name CNAME C------- NPROT_THIS=0 if none matched! CALL GETARRAYINDEX(SEQNAMES,CNAME,NALIGN,NPROT_THIS) C WRITE(6,*)'SEQNAME ', CNAME, ' NPROT_THIS ',NPROT_THIS C------ one of the names found IF ( NPROT_THIS .GT. 0 ) THEN NSEQLINES(NPROT_THIS)=NSEQLINES(NPROT_THIS)+1 C---------- get the second non-blank string in the line read (TMPSEQ) C---------- note: this is the sequence CALL GETTOKEN(LINE,LINELEN,2,IBEG,TMPSEQ) CALL LOWTOUP(LINE,LEN(LINE)) C---- C---- loop over all characters of line read C---- DO IPOS=IBEG,LINELEN C------------- if current residue neither ' ' nor TAB IF ( LINE(IPOS:IPOS) .NE. ' ' .AND. 1 LINE(IPOS:IPOS) .NE. CHAR(0) ) THEN C---------------- count up protein length NRES(NPROT_THIS)=NRES(NPROT_THIS) + 1 IF ( NRES(NPROT_THIS) .GT. NRES_ALI ) THEN WRITE(6,'(A,I10,A,A,I10,A)') 1 '*** ERROR in READ_MSF: SEQUENCE LENGTH (', 2 NRES(NPROT_THIS),') EXCEEDS ', 3 'ALIGNMENT LENGTH (', 4 NRES_ALI,') GIVEN IN HEADER !!! ***' WRITE(6,*)'*** line=',LINE(1:LEN(LINE)) WRITE(6,*)'*** this=',NRES(NPROT_THIS), + ' > ',NRES_ALI,' (NRES_ALI)' STOP ENDIF C---------------- is gap IF ( LINE(IPOS:IPOS) .EQ. CGAPCHAR ) THEN INGAP(NPROT_THIS)= .TRUE. IF ( INSIDE(NPROT_THIS) ) THEN ITMP=ALIPOINTER(NPROT_THIS)+NRES(NPROT_THIS)-1 ALISEQ(ITMP)=CGAPCHAR ENDIF C---------------- is NOT gap ELSE INGAP(NPROT_THIS) = .FALSE. LASTOCCUPIED(NPROT_THIS) = NRES(NPROT_THIS) IF ( .NOT. INSIDE(NPROT_THIS) ) THEN INSIDE(NPROT_THIS) = .TRUE. IFIR(NPROT_THIS) = NRES(NPROT_THIS) ENDIF JLAS(NPROT_THIS) = JLAS(NPROT_THIS) + 1 ALISEQ(ALIPOINTER(NPROT_THIS)+NRES(NPROT_THIS)-1)= + LINE(IPOS:IPOS) ENDIF ENDIF ENDDO ENDIF C else do nothing - blank or scale line READ(KUNIT,'(A)',END=99) LINE ENDDO 99 CONTINUE IF ( .NOT. ERROR ) THEN DO ISEQ=2,NALIGN IF (NSEQLINES(ISEQ) .NE. NSEQLINES(1)) THEN ERROR= .TRUE. ERRORMESSAGE = 1 ' Inconsistent sequence names !!' STOP ENDIF ENDDO ENDIF IF ( ERROR ) THEN WRITE(6,'(A)') ERRORMESSAGE RETURN ENDIF NO_ENDGAPS = .TRUE. DO ISEQ = 1,NALIGN NO_ENDGAPS = NO_ENDGAPS .AND. ( .NOT. INGAP(ISEQ)) ILAS(ISEQ) = LASTOCCUPIED(ISEQ) ENDDO C delete n- and c-terminal gaps from aliseq; C set ifir and ilas accordingly; C set pointers to alignments C 1.6.94 : C truncate NRES_ALI to be the last position occupied in at least one C ........ one of the sequences ! DIFF = 0 CFREE = NRES_ALI + 1 IPOS = 1 DO ISEQ = 1,NALIGN ALIPOINTER(ISEQ) = IPOS DIFF = DIFF + IFIR(ISEQ) - 1 I = IFIR(ISEQ) DO WHILE ( I .LE. ILAS(ISEQ) ) ALISEQ(IPOS) = ALISEQ(IPOS+DIFF) I = I + 1 IPOS = IPOS + 1 ENDDO ALISEQ(IPOS) = '/' IPOS = IPOS + 1 DIFF = DIFF + ( NRES_ALI - ILAS(ISEQ) ) C SMALLEST DISTANCE OF LAST OCCUPIED POSITION TO LAST ALIGNMENT POSITION C .. SHOULD BE ZERO, IF AT LEAST ONE SEQUENCE EXTENDS TO THE VERY END CFREE = MIN(CFREE,(NRES_ALI - ILAS(ISEQ)) ) ENDDO IF ( CFREE .GT. 0 ) THEN WRITE(6,'(1X,A)') 1 ' *** WARNING : empty c-terminal positions truncated ***' NRES_ALI = NRES_ALI - CFREE ENDIF ERROR = .FALSE. DO ISEQ = 1, NALIGN STRAND = ' ' CALL GET_SEQ_FROM_ALISEQ(ALISEQ,IFIR,ILAS,ALIPOINTER, 1 NRES_ALI,ISEQ,STRAND,NREAD, 2 ERROR ) IF ( NO_ENDGAPS ) THEN CALL CHECKSEQ(STRAND,1,ILAS(ISEQ),TESTCHECK) ELSE CALL CHECKSEQ(STRAND,1,NRES_ALI,TESTCHECK) ENDIF IF ( TESTCHECK .NE. SEQCHECK(ISEQ) ) THEN C ERROR = .TRUE. CALL STRPOS(SEQNAMES_UPPER(ISEQ),ISTART,ISTOP) ERRORMESSAGE = 1 ' checksum of sequence '//seqnames(iseq)(istart:istop)// 2 ' is not the same as checksum given in the header !' C goto 99 ENDIF ENDDO CLOSE(KUNIT) RETURN END C END READ_MSF C...................................................................... C...................................................................... C SUB READ_PIR SUBROUTINE READ_PIR(KIN,INFILE,CTRANS,RLEN,NRES,ACCESSION, 1 COMPND,SEQ,TRUNCATED,ERROR) C 14.5.93 C>P1; test Ctest.pir ( test.pep from: 1 to: 13 ) C A A A A A A A A A A A A A * IMPLICIT NONE C IMPORT INTEGER KIN, RLEN CHARACTER*(*) CTRANS,INFILE C EXPORT INTEGER NRES CHARACTER*(*) ACCESSION,COMPND, SEQ LOGICAL TRUNCATED,ERROR C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER IPOS, ISTART, ISTOP C INTEGER JSTART, JSTOP CHARACTER*1 C CHARACTER*(LINELEN) LINE c logical empty *----------------------------------------------------------------------* ERROR = .FALSE. ISTOP=0 C try to open outfile; return if unsuccessful CALL OPEN_FILE(KIN,INFILE,'old,readonly',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( LINE(1:1) .NE. '>' ) READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO ISTOP=INDEX(LINE,' ')-1 ACCESSION(1:LEN(ACCESSION))=LINE(2:ISTOP) c istart=index(line,'|')+1 c if ( istart .gt. 1) then c istop=index(line(istart:),'|')-1 c if ( istop .gt. 0) then c ACCESSION(1:len(ACCESSION))=line(istart:istart+istop-1) c else c ACCESSION(1:len(ACCESSION))=line(istart:) c endif c else c ACCESSION(1:len(ACCESSION))=line(2:) c ENDIF C ?? one comment line ?? always ?? READ(KIN,'(A)',ERR=1,END=2) LINE CALL STRPOS(LINE,ISTART,ISTOP) IF ( ISTOP .GT. 0 ) THEN COMPND = LINE(ISTART:ISTOP) ELSE COMPND = ' ' ENDIF c if ( empty ) then c call strpos(line,istart,istop) c if ( istop .gt. 0 ) then c empty = .false. c compnd = line(istart:istop) c endif c endif c if ( empty ) then c call strpos(infile,istart,istop) c compnd = infile(istart:istop) c endif NRES = 0 READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( .NOT. TRUNCATED ) CALL STRPOS(LINE,ISTART,ISTOP) IF ( ISTOP .NE. 0 ) THEN DO IPOS = ISTART,ISTOP C = LINE(IPOS:IPOS) CALL LOWTOUP(C,1) IF ( INDEX(CTRANS,C) .NE. 0 ) THEN TRUNCATED = ( NRES+1 .GT. LEN(SEQ) ) IF ( .NOT. TRUNCATED ) THEN NRES = NRES + 1 SEQ(NRES:NRES) = C ENDIF ELSE IF (C .EQ. '*') THEN GOTO 2 ENDIF ENDDO ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') ' ** error reading PIR file **' 2 CONTINUE CLOSE(KIN) RETURN END C END READ_PIR C...................................................................... C...................................................................... C SUB READ_REAL SUBROUTINE READ_REAL(STRING,XREAL) CHARACTER*(*) STRING REAL XREAL INTEGER EXPONENT IEXP=INDEX(STRING,'E') JEXP=INDEX(STRING,'E') CALL STRPOS(STRING,IBEG,IEND) IF (IEXP .EQ.0 .AND. JEXP .EQ. 0) THEN CALL READ_REAL_FROM_STRING(STRING(IBEG:IEND),XREAL) ELSE IPOS=MAX(IEXP,JEXP) CALL READ_INT_FROM_STRING(STRING(IPOS+1:IEND),EXPONENT) CALL READ_REAL_FROM_STRING(STRING(IBEG:IPOS-1),XREAL) XEXPONENT=FLOAT(EXPONENT) XREAL=XREAL * (10.0**XEXPONENT) ENDIF RETURN END C END READ_REAL C...................................................................... C...................................................................... C SUB READ_REAL_FROM_STRING SUBROUTINE READ_REAL_FROM_STRING(CSTRING,XNUMBER) C import CHARACTER*(*) CSTRING C export REAL XNUMBER C internal CHARACTER*100 CFORMAT,CTEMP INTEGER IPOS XNUMBER=0.0 CALL STRPOS(CSTRING,ISTART,ISTOP) ITOTAL=ISTOP-ISTART+1 IAFTER=0 IPOS=INDEX(CSTRING,'.') IF (IPOS .GT. 0) THEN IAFTER=ISTOP-IPOS ENDIF CALL CONCAT_STRING_INT('(F',ITOTAL,CTEMP) CALL CONCAT_STRINGS(CTEMP,'.',CFORMAT) CALL CONCAT_STRING_INT(CFORMAT,IAFTER,CTEMP) CALL CONCAT_STRINGS(CTEMP,')',CFORMAT) READ(CSTRING(ISTART:ISTOP),CFORMAT)XNUMBER RETURN END C END READ_REAL_FROM_STRING C...................................................................... C...................................................................... C SUB READ_SEQ_FROM_DSSP SUBROUTINE READ_SEQ_FROM_DSSP(KIN,INFILE,CHAINS,CTRANS,RLEN, 1 SEQ,STRUC,ACC,PDBNO,COMPND,NRES, 2 LACCZERO,TRUNCATED,ERROR) C 18. Dec 96, hackedihack, fixed a problem with dssp files with more C than 9 chains, C NOTE: this whole routine is bullsh...; RS 96 C 14.5.93 Ulrike Goebel C 1 1 O A 0 0 81 0, 0.0 149,-0.2 0, 0.0 104,-0.1 IMPLICIT NONE C IMPORT INTEGER KIN,RLEN INTEGER PDBNO(*), ACC(*) CHARACTER*(*) INFILE C EXPORT INTEGER NRES CHARACTER*(*) CHAINS CHARACTER*(*) COMPND, CTRANS, SEQ, STRUC LOGICAL LACCZERO,TRUNCATED,ERROR C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER N,ISTART,ISTOP,IPOS,JPOS,ICHAIN,JCHAIN,NREAD CHARACTER*1 C CHARACTER*(LINELEN) LINE CHARACTER*1000 NUMBERS CHARACTER*1000 TCHAINS,T2CHAINS C ebi version 02.98 (and not: tchains, t2chains) C CHARACTER*10 NUMBERS ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KIN,INFILE,'old,readonly',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF C not in ebi version 02.98 TCHAINS=' ' T2CHAINS=' ' C end ebi version 02.98 LACCZERO = .TRUE. NUMBERS = '01234567891011121314151617181920'// 1 '21222324252627282930313233343536'// 2 '37383940414243444546474849505152'// 3 '53545556575859606162636465666768'// 4 '69707172737475767778798081828384'// 5 '858687888990919293949596979899100' C in ebi version 02.98 C NUMBERS = '0123456789' READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( LINE(3:12) .NE. '# RESIDUE' ) IF (LINE(1:6) .EQ. 'COMPND' ) THEN CALL STRPOS(LINE,ISTART,ISTOP) COMPND = LINE(7:MIN(200,ISTOP)) ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO C .. read pointer is now on first data line READ(KIN,'(A)',ERR=1,END=2) LINE NRES = 0 ICHAIN = 1 JCHAIN = 1 CALL STRPOS(CHAINS,ISTART,ISTOP) TCHAINS(1:)=CHAINS(ISTART:ISTOP) IPOS=1 JPOS=INDEX(TCHAINS,' ')-1 C last 3 lines not in ebi version 02.98, instead: C CALL GETTOKEN(CHAINS,LEN(CHAINS),1,IPOS,CHAIN) DO WHILE ( IPOS .LE. ISTOP ) C = LINE(12:12) IF ( INDEX(NUMBERS,TCHAINS(1:JPOS) ) .NE. 0 ) THEN READ(TCHAINS(1:JPOS),'(I2)') N C 2 lines in ebi version 02.98, instead: C IF ( INDEX(NUMBERS,CHAIN ) .NE. 0 ) THEN C READ(CHAIN,'(I1)') N IF ( N .EQ. ICHAIN ) THEN CALL READ_DSSPCHAIN(KIN,NRES,CTRANS,RLEN,LINE,SEQ, 1 STRUC,ACC,PDBNO,NREAD,LACCZERO, 2 TRUNCATED,ERROR) NRES = NRES + NREAD ICHAIN = ICHAIN + 1 JCHAIN = JCHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ELSE CALL SKIP_DSSPCHAIN(KIN,RLEN,LINE,ERROR) ICHAIN = ICHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ENDIF ELSE IF ( C .EQ. CHAINS .OR. CHAINS .EQ. ' ') THEN C line in ebi version 02.98, instead: C IF ( C .EQ. CHAIN .OR. CHAINS .EQ. ' ') THEN CALL READ_DSSPCHAIN(KIN,NRES,CTRANS,RLEN,LINE,SEQ, 1 STRUC,ACC,PDBNO,NREAD,LACCZERO, 2 TRUNCATED,ERROR) NRES = NRES + NREAD ICHAIN = ICHAIN + 1 JCHAIN = JCHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ELSE CALL SKIP_DSSPCHAIN(KIN,RLEN,LINE,ERROR) ICHAIN = ICHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ENDIF ENDIF T2CHAINS(1:)=TCHAINS(JPOS+1:) CALL STRPOS(T2CHAINS,ISTART,ISTOP) TCHAINS(1:)=T2CHAINS(ISTART:ISTOP) JPOS=INDEX(TCHAINS,' ')-1 C last 4 lines in ebi version 02.98, instead: C CALL STRPOS(CHAINS,ISTART,ISTOP) C CALL GETTOKEN(CHAINS,LEN(CHAINS),JCHAIN,IPOS,CHAIN) ENDDO IF ( SEQ(NRES:NRES) .EQ. '!' ) THEN SEQ(NRES:NRES) = ' ' STRUC(NRES:NRES) = ' ' NRES = NRES - 1 ENDIF GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') '*** ERROR reading DSSP file (READ_SEQ_FROM_DSSP)' 2 CONTINUE CLOSE(KIN) RETURN END C END READ_SEQ_FROM_DSSP C...................................................................... C...................................................................... C SUB READ_SEQ_FROM_HSSP SUBROUTINE READ_SEQ_FROM_HSSP(KIN,INFILE,CHAINS,CTRANS,RLEN,SEQ, 1 STRUC,ACC,PDBNO,COMPND,NRES,LACCZERO,TRUNCATED,ERROR ) C 14.5.93 C 1 1 O A 0 0 81 11 13 AAAAAAAA IMPLICIT NONE C IMPORT INTEGER KIN, RLEN CHARACTER*(*) CHAINS CHARACTER*(*) INFILE C EXPORT INTEGER NRES INTEGER PDBNO(*), ACC(*) CHARACTER*(*) COMPND, CTRANS, SEQ, STRUC LOGICAL LACCZERO,TRUNCATED,ERROR C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER N,ISTART,ISTOP,IPOS,ICHAIN,JCHAIN,NREAD CHARACTER*1 C,CHAIN CHARACTER*(LINELEN) LINE CHARACTER*10 NUMBERS *----------------------------------------------------------------------* ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KIN,INFILE,'old,readonly',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. LACCZERO = .TRUE. NUMBERS = '0123456789' READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( LINE(1:13) .NE. '## ALIGNMENTS' ) IF (LINE(1:6) .EQ. 'COMPND' ) THEN CALL STRPOS(LINE,ISTART,ISTOP) COMPND = LINE(7:MIN(200,ISTOP)) ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO C .. skip 1 line READ(KIN,'(A)',ERR=1,END=2) LINE C .. read pointer is now on first data line READ(KIN,'(A)',ERR=1,END=2) LINE NRES = 0 ICHAIN = 1 JCHAIN = 1 CALL STRPOS(CHAINS,ISTART,ISTOP) CALL GETTOKEN(CHAINS,LEN(CHAINS),1,IPOS,CHAIN) DO WHILE ( IPOS .LE. ISTOP ) C = LINE(13:13) IF ( INDEX(NUMBERS,CHAIN ) .NE. 0 ) THEN READ(CHAIN,'(I1)') N IF ( N .EQ. ICHAIN ) THEN CALL READ_HSSPCHAIN(KIN,NRES,CTRANS,RLEN,LINE,SEQ, 1 STRUC,ACC,PDBNO,NREAD,LACCZERO, 2 TRUNCATED,ERROR) NRES = NRES + NREAD ICHAIN = ICHAIN + 1 JCHAIN = JCHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ELSE CALL SKIP_HSSPCHAIN(KIN,RLEN,LINE,ERROR) ICHAIN = ICHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ENDIF ELSE IF ( C .EQ. CHAIN ) THEN CALL READ_HSSPCHAIN(KIN,NRES,CTRANS,RLEN,LINE,SEQ, 1 STRUC,ACC,PDBNO,NREAD,LACCZERO, 2 TRUNCATED,ERROR) NRES = NRES + NREAD ICHAIN = ICHAIN + 1 JCHAIN = JCHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ELSE CALL SKIP_HSSPCHAIN(KIN,RLEN,LINE,ERROR) ICHAIN = ICHAIN + 1 READ(KIN,'(A)',ERR=1,END=2) LINE ENDIF ENDIF CALL STRPOS(CHAINS,ISTART,ISTOP) CALL GETTOKEN(CHAINS,LEN(CHAINS),JCHAIN,IPOS,CHAIN) ENDDO IF ( SEQ(NRES:NRES) .EQ. '!' ) THEN SEQ(NRES:NRES) = ' ' STRUC(NRES:NRES) = ' ' NRES = NRES - 1 ENDIF GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') '*** ERROR reading HSSP file (read_seq_from_hssp)' 2 CONTINUE CLOSE(KIN) RETURN END C END READ_SEQ_FROM_HSSP C...................................................................... C...................................................................... C SUB READ_STAR SUBROUTINE READ_STAR(KIN,INFILE,CTRANS,RLEN,NRES,SEQ, 1 TRUNCATED,ERROR) C 7.12.93 C*test.star ( test.pep from: 1 to: 13 ) C A A A A A A A A A A A A A IMPLICIT NONE C IMPORT INTEGER KIN, RLEN CHARACTER*(*) CTRANS,INFILE C EXPORT INTEGER NRES CHARACTER*(*) SEQ LOGICAL TRUNCATED,ERROR C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) INTEGER IPOS, ISTART, ISTOP C INTEGER JSTART, JSTOP CHARACTER*1 C CHARACTER*(LINELEN) LINE C LOGICAL EMPTY *----------------------------------------------------------------------* ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KIN,INFILE,'old,readonly',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF NRES = 0 READ(KIN,'(A)',ERR=1,END=2) LINE DO WHILE ( .NOT. TRUNCATED ) IF ( LINE(1:1) .NE. '*' ) THEN CALL STRPOS(LINE,ISTART,ISTOP) IF ( ISTOP .NE. 0 ) THEN DO IPOS = ISTART,ISTOP C = LINE(IPOS:IPOS) CALL LOWTOUP(C,1) IF ( INDEX(CTRANS,C) .NE. 0 ) THEN TRUNCATED = ( NRES+1 .GT. LEN(SEQ) ) IF ( .NOT. TRUNCATED ) THEN NRES = NRES + 1 SEQ(NRES:NRES) = C ENDIF ENDIF ENDDO ENDIF ENDIF READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(A)') ' ** ERROR READING STAR FILE **' 2 CONTINUE CLOSE(KIN) RETURN END C END READ_STAR C...................................................................... C...................................................................... C SUB READHSSP SUBROUTINE READHSSP(IUNIT,HSSPFILE,ERROR,MAXRES,MAXALIGNS, + MAXCORE,MAXINS,MAXINSBUFFER,PDBID,HEADER,COMPOUND, + SOURCE,AUTHOR,SEQLENGTH,NCHAIN,KCHAIN,CHAINREMARK, + NALIGN,EXCLUDEFLAG,EMBLID,STRID,IDE,SIM,IFIR,ILAS, + JFIR,JLAS,LALI,NGAP,LGAP,LENSEQ,ACCESSION,PROTNAME, + PDBNO,PDBSEQ,CHAINID,SECSTR,COLS,SHEETLABEL,BP1,BP2, + ACC,NOCC,VAR,ALISEQ,ALIPOINTER,SEQPROF,NDEL,NINS, + ENTROPY,RELENT,CONSWEIGHT,INSNUMBER,INSALI, + INSPOINTER,INSLEN,INSBEG_1,INSBEG_2,INSBUFFER, + LCONSERV,LHSSP_LONG_ID) C C Reinhard Schneider 1989, BIOcomputing EMBL, D-6900 Heidelberg, FRG C please report any bug, e-mail (INTERNET): C schneider@EMBL-Heidelberg.DE C or sander@EMBL-Heidelberg.DE C======================================================================= C INCREASE THE NUMBER OF FOLLOWING THREE PARAMETER IN THE CALLING C PROGRAM IF NECESSARY C======================================================================= C maxaligns = maximal number of alignments in a HSSP-file C maxres= maximal number of residues in a PDB-protein C maxcore= maximal space for storing the alignments C======================================================================= C C maxaa= 20 amino acids C nblocksize= number of alignments in one line C pdbid= Brookhaven Data Bank identifier C header,compound,source,author= informations about the PDB-protein C pdbseq= amino acid sequence of the PDB-protein C chainid= chain identifier (chain A etc.) C secstr= DSSP secondary structure summary C bp1,bp2= beta-bridge partner C cols= DSSP hydrogen bonding patterns for turns and helices, C geometrical bend, chirality, one character name of beta-ladder C and of beta-sheet C sheetlabel= chain identifier of beta bridge partner C seqlength= number of amino acids in the PDB-protein C pdbno= residue number as in PDB file C nchain= number of different chains in pdbid.DSSP data set C kchain= number of chains used in HSSP data set C nalign= number of alignments C acc= solvated residue surface area in A**2 C emblid= EMBL/SWISSPROT identifier of the alignend protein C strid= if the 3-D structure of this protein is known, then strid C (structure ID)is the Protein Data Bank identifier as taken C from the EMBL/SWISSPROT entry C protname= one line description of alignend protein C aliseq= sequential storage for the alignments C alipointer= points to the beginning of alignment X ( 1>= X <=nalign ) C ifir,ilas= first and last position of the alignment in the test C protein C jfir,jlas= first and last position of the alignment in the alignend C protein C lali= length of the alignment excluding insertions and deletions C ngap= number of insertions and deletions in the alignment C lgap= total length of all insertions and deletions C lenseq= length of the entire sequence of the alignend protein C ide= percentage of residue identity of the alignment C var= sequence variability as derived from the nalign alignments C seqprof= relative frequency for each of the 20 amino acids C nocc= number of alignend sequences spanning this position (including C the test sequence C ndel= number of sequences with a deletion in the test protein at this C position C nins= number of sequences with an insertion in the test protein at C this position C entropy= entropy measure of sequence variability at this position C relent= relative entropy (entropy normalized to the range 0-100) C consweight= conservation weight C======================================================================= IMPLICIT NONE INTEGER MAXALIGNS,MAXRES,MAXCORE,MAXINS,MAXAA,NBLOCKSIZE INTEGER MAXINSBUFFER PARAMETER (MAXAA= 20) PARAMETER (NBLOCKSIZE= 70) C============================ import ================================== CHARACTER HSSPFILE*(*) INTEGER IUNIT LOGICAL ERROR C ATTRIBUTES OF SEQUENCE WITH KNOWN STRUCTURE CHARACTER*(*) PDBID,HEADER,COMPOUND,SOURCE,AUTHOR CHARACTER PDBSEQ(MAXRES),CHAINID(MAXRES),SECSTR(MAXRES) C.......LENGHT*7 CHARACTER*(*) COLS(MAXRES),CHAINREMARK CHARACTER SHEETLABEL(MAXRES) INTEGER SEQLENGTH,PDBNO(MAXRES),NCHAIN,KCHAIN,NALIGN INTEGER BP1(MAXRES),BP2(MAXRES),ACC(MAXRES) C ATTRIBUTES OF ALIGNEND SEQUENCES CHARACTER*(*) EMBLID(MAXALIGNS),STRID(MAXALIGNS) CHARACTER*(*) ACCESSION(MAXALIGNS),PROTNAME(MAXALIGNS) CHARACTER ALISEQ(MAXCORE) CHARACTER EXCLUDEFLAG(MAXALIGNS) INTEGER ALIPOINTER(MAXALIGNS), + IFIR(MAXALIGNS),ILAS(MAXALIGNS),JFIR(MAXALIGNS), + JLAS(MAXALIGNS),LALI(MAXALIGNS),NGAP(MAXALIGNS), + LGAP(MAXALIGNS),LENSEQ(MAXALIGNS) REAL IDE(MAXALIGNS),SIM(MAXALIGNS) C ATTRIBUTES OF PROFILE INTEGER VAR(MAXRES),SEQPROF(MAXRES,MAXAA),RELENT(MAXRES), + NOCC(MAXRES),NDEL(MAXRES),NINS(MAXRES), + INSNUMBER,INSALI(MAXINS),INSPOINTER(MAXINS), + INSLEN(MAXINS),INSBEG_1(MAXINS),INSBEG_2(MAXINS) REAL ENTROPY(MAXRES),CONSWEIGHT(MAXRES) CHARACTER INSBUFFER(MAXINSBUFFER) LOGICAL LCONSERV,LHSSP_LONG_ID C======================================================================= C internal INTEGER MAXALIGNS_LOC PARAMETER (MAXALIGNS_LOC= 8765) C PARAMETER (MAXALIGNS_LOC= 12345) C PARAMETER (MAXALIGNS_LOC= 21987) CHARACTER CTEMP*(NBLOCKSIZE),TEMPNAME*200 CHARACTER*200 LINE CHARACTER CHAINSELECT LOGICAL LCHAIN INTEGER ICHAINBEG,ICHAINEND,NALIGNORG, + I,J,K,IPOS,ILEN,NRES,IRES,NBLOCK,IALIGN,IBLOCK, + IALI,IBEG,IEND,IPOINTER(300000),IPOINT,IINS C---- ------------------------------------------------------------ C---- initialise C---- ------------------------------------------------------------ C ORDER OF AMINO ACID SYMBOLS IN THE HSSP SEQUENCE PROFILE BLOCK C PROFILESEQ='VLIMFWYGAPSTCHRKQEND' IINS= 0 ERROR=.FALSE. NALIGN=0 CHAINREMARK=' ' CHAINSELECT=' ' DO I=1,MAXINSBUFFER INSBUFFER(I)=' ' ENDDO DO I=1,MAXALIGNS_LOC IPOINTER(I)=0 ENDDO LCHAIN=.FALSE. LHSSP_LONG_ID = .FALSE. TEMPNAME(1:)=HSSPFILE I=INDEX(TEMPNAME,'_!_') J=INDEX(TEMPNAME,'hssp_') IF (I.NE.0) THEN TEMPNAME(1:)=HSSPFILE(1:I-1) LCHAIN=.TRUE. READ(HSSPFILE(I+3:),'(A1)')CHAINSELECT WRITE(6,*)'*** ReadHSSP: extract the chain: ',chainselect ELSE IF (J.NE.0) THEN TEMPNAME(1:)=HSSPFILE(1:J+3) LCHAIN=.TRUE. READ(HSSPFILE(J+5:),'(A1)')CHAINSELECT WRITE(6,*)'*** ReadHSSP: extract the chain: ',chainselect ENDIF CALL OPEN_FILE(IUNIT,TEMPNAME,'old,readonly',error) IF (ERROR) THEN WRITE(6,'(A,A)')'*** ERROR READHSSP failed opening file:', + TEMPNAME GOTO 99 END IF READ(IUNIT,'(A)',ERR=99)LINE C check if it is a HSSP-file and get the release number for format flags IF (LINE(1:4).NE.'HSSP') THEN WRITE(6,'(A)')' ERROR: is not a HSSP-file' ERROR=.TRUE. RETURN ENDIF C read in PDBID etc. DO WHILE(LINE(1:6).NE.'PDBID') READ(IUNIT,'(A)',ERR=99)LINE ENDDO READ(LINE,'(11X,A)',ERR=99)PDBID DO WHILE(LINE(1:6).NE.'HEADER') READ(IUNIT,'(A)',ERR=99)LINE IF (INDEX(LINE,'LONG-ID').NE.0) THEN IF (INDEX(LINE,'YES').NE.0) THEN LHSSP_LONG_ID = .TRUE. ENDIF ENDIF ENDDO READ(LINE ,'(11X,A)',ERR=99)HEADER READ(IUNIT,'(11X,A)',ERR=99)COMPOUND READ(IUNIT,'(11X,A)',ERR=99)SOURCE READ(IUNIT,'(11X,A)',ERR=99)AUTHOR READ(IUNIT,'(11X,I4)',ERR=99)SEQLENGTH READ(IUNIT,'(11X,I4)',ERR=99)NCHAIN IF (CHAINSELECT .NE. ' ')NCHAIN=1 KCHAIN=NCHAIN READ(IUNIT,'(A)',ERR=99)LINE IF (INDEX(LINE,'KCHAIN').NE.0) THEN READ(LINE,'(11X,I4,A)',ERR=99)KCHAIN,CHAINREMARK READ(IUNIT,'(11X,I4)',ERR=99)NALIGNORG ELSE READ(LINE,'(11X,I4)',ERR=99)NALIGNORG ENDIF C if HSSP-file contains no alignments return IF (NALIGNORG.EQ.0) THEN WRITE(6,'(A)')'*** HSSP-file contains no alignments ***' CLOSE(IUNIT) c error=.true. RETURN ENDIF C parameter overflow handling IF (NALIGNORG.GT.MAXALIGNS) THEN WRITE(6,'(A)')'*** HSSP-file contains too many alignments **' WRITE(6,'(A)')'*** INCREASE MAXALIGNS IN COMMOM BLOCK ***' CLOSE(IUNIT) ERROR=.TRUE. RETURN ENDIF IF (NALIGNORG .GT. MAXALIGNS_LOC) THEN WRITE(6,*)'*** READHSSP: MAXALIGNS overflow, increase to >', + NALIGNORG STOP ENDIF IF (SEQLENGTH+KCHAIN-1.GT.MAXRES) THEN WRITE(6,'(A)')'*** PDB-sequence in HSSP-file too long ***' WRITE(6,'(A)')'*** INCREASE MAXRES ***' WRITE(6,'(A,I6,A,I6)') + 'need: ',seqlength+kchain-1,' limit is: ',maxres CLOSE(IUNIT) ERROR=.TRUE. RETURN ENDIF C number of sequence positions is number of residues + number of chains C chain break is indicated by a '!' NRES=SEQLENGTH+KCHAIN-1 ICHAINBEG=1 ICHAINEND=NRES IF (LCHAIN) THEN C search for ALIGNMENT-block DO WHILE (LINE(1:13).NE.'## ALIGNMENTS') READ(IUNIT,'(A)',ERR=99)LINE ENDDO READ(IUNIT,'(A)',ERR=99)LINE ICHAINBEG=0 ICHAINEND=0 C read till end ; some PDB-chains have DSSP-chain breaks !! DO I=1,NRES READ(IUNIT,'(7X,I4,1X,A1)',ERR=99)PDBNO(I),CHAINID(I) IF (CHAINID(I) .EQ. CHAINSELECT) THEN IF (ICHAINBEG .EQ. 0)ICHAINBEG=I ICHAINEND=I ENDIF ENDDO C increment chain number for artificial chain breaks DO I=ICHAINBEG,ICHAINEND IF (CHAINID(I) .NE. CHAINSELECT)NCHAIN=NCHAIN+1 ENDDO REWIND(IUNIT) SEQLENGTH=ICHAINEND-ICHAINBEG+1 NRES=SEQLENGTH+NCHAIN-1 ENDIF C search for the PROTEINS block LINE=' ' DO WHILE(LINE(1:11).NE.'## PROTEINS') READ(IUNIT,'(A)',ERR=99)LINE ENDDO READ(IUNIT,'(A)',ERR=99)LINE LCONSERV=.FALSE. IF (INDEX(LINE,'%WSIM').NE.0)LCONSERV=.TRUE. C read data about the alignments IALIGN=1 DO I=1,NALIGNORG IF ( LHSSP_LONG_ID) THEN READ(IUNIT,50,ERR=99) + EXCLUDEFLAG(IALIGN),EMBLID(IALIGN)(1:),STRID(IALIGN), + IDE(IALIGN),SIM(IALIGN),IFIR(IALIGN),ILAS(IALIGN), + JFIR(IALIGN),JLAS(IALIGN),LALI(IALIGN),NGAP(IALIGN), + LGAP(IALIGN),LENSEQ(IALIGN),ACCESSION(IALIGN), + PROTNAME(IALIGN) ELSE READ(IUNIT,100,ERR=99) + EXCLUDEFLAG(IALIGN),EMBLID(IALIGN)(1:),STRID(IALIGN), + IDE(IALIGN),SIM(IALIGN),IFIR(IALIGN),ILAS(IALIGN), + JFIR(IALIGN),JLAS(IALIGN),LALI(IALIGN),NGAP(IALIGN), + LGAP(IALIGN),LENSEQ(IALIGN),ACCESSION(IALIGN), + PROTNAME(IALIGN) ENDIF IF ( IFIR(IALIGN) .GE. ICHAINBEG .AND. + ILAS(IALIGN) .LE. ICHAINEND) THEN IPOINTER(I)=IALIGN IALIGN=IALIGN+1 ELSE WRITE(6,*)'READHSSP INFO: skip alignment: ',IALIGN ENDIF ENDDO 50 FORMAT (5X,A1,2X,A40,A6,1X,F5.2,1X,F5.2,8(1X,I4),2X,A10,1X,A) 100 FORMAT (5X,A1,2X,A12,A6,1X,F5.2,1X,F5.2,8(1X,I4),2X,A10,1X,A) NALIGN=IALIGN-1 WRITE(6,*)'--- number of alignments: ',nalign WRITE(6,*)'--- PROTEINS block done' C init pointer ; aliseq contains the alignments (amino acid symbols) C stored in the following way ; '/' separates alignments C alignment(x) is stored from: C aliseq(alipointer(x)) to aliseq(ilas(x)-ifir(x)) C aliseq(1........46/48.........60/62....) C | | | C | | | C pointer pointer pointer C ali 1 ali 2 ali 3 C C C init pointer IPOS=1 DO I=1,NALIGN IF (IPOS.GE.MAXCORE) THEN WRITE(6,'(A)')'*** ERROR: INCREASE MAXCORE ***' STOP ENDIF ALIPOINTER(I)=IPOS ILEN=ILAS(I)-IFIR(I)+1 IPOS=IPOS+ILEN ALISEQ(IPOS)='/' IPOS=IPOS+1 ENDDO IF (NALIGN .LT. MAXALIGNS) THEN ALIPOINTER(NALIGN+1)=IPOS+1 ENDIF C number of ALIGNMENTS-blocks IF (MOD(FLOAT(NALIGNORG),FLOAT(NBLOCKSIZE)).EQ. 0.0) THEN NBLOCK=NALIGNORG/NBLOCKSIZE ELSE NBLOCK=NALIGNORG/NBLOCKSIZE+1 ENDIF C search for ALIGNMENT-block DO WHILE (LINE(1:13).NE.'## ALIGNMENTS') READ(IUNIT,'(A)',ERR=99)LINE ENDDO READ(IUNIT,'(A)',ERR=99)LINE C loop over ALIGNMENTS blocks C read in pdbno, chainid, secstr etc. IALIGN=0 IALI=0 DO IBLOCK=1,NBLOCK IRES=1 DO I=1,NRES READ(IUNIT,200,ERR=99) + PDBNO(IRES),CHAINID(IRES),PDBSEQ(IRES),SECSTR(IRES), + COLS(IRES),BP1(IRES),BP2(IRES),SHEETLABEL(IRES), + ACC(IRES),NOCC(IRES),VAR(IRES),CTEMP 200 FORMAT(7X,I4,2(1X,A1),2X,A1,1X,A7,2(I4),A1,I4,2(1X,I4),2X,A) C fill up aliseq IF (I .GE. ICHAINBEG .AND. I .LE. ICHAINEND) THEN IRES=IRES+1 C write(6,*)'xx i=',i,' ires=',ires,' ra=',ibeg,'-',iend IF (PDBSEQ(I) .NE. '!') THEN CALL STRPOS(CTEMP,IBEG,IEND) DO IPOS=MAX(IBEG,1),MIN(NBLOCKSIZE,IEND) IALI=IALIGN+IPOS IF (CTEMP(IPOS:IPOS) .NE. ' '.AND. + IPOINTER(IALI) .GT. 0) THEN IF (IPOINTER(IALI) .LE. 0 ) THEN WRITE(6,*)'*** READHSSP: ipointer=', + ipointer(iali), + 'iali,ialign,ipos=',iali,ialign,ipos ENDIF J=ALIPOINTER(IPOINTER(IALI)) + + (I-IFIR(IPOINTER(IALI))) ALISEQ(J)=CTEMP(IPOS:IPOS) ENDIF ENDDO ENDIF ENDIF ENDDO IALIGN=IALIGN+NBLOCKSIZE DO K=1,2 READ(IUNIT,'(A)',ERR=99)LINE ENDDO ENDDO WRITE(6,*)' ALIGNMENTS block done' C read in sequence profile, entropy etc. IRES=1 DO I=1,NRES READ(IUNIT,300,ERR=99)(SEQPROF(IRES,K),K=1,MAXAA), + NOCC(IRES),NDEL(IRES),NINS(IRES),ENTROPY(IRES), + RELENT(IRES),CONSWEIGHT(IRES) IF (I .GE. ICHAINBEG .AND. I .LE. ICHAINEND) THEN IRES=IRES+1 ENDIF ENDDO 300 FORMAT(12X,20(I4),1X,3(1X,I4),1X,F7.3,3X,I4,2X,F4.2) WRITE(6,*)' PROFILE block done' IF (LCHAIN) THEN DO I=1,NALIGN IFIR(I)=IFIR(I)-ICHAINBEG+1 ILAS(I)=ILAS(I)-ICHAINBEG+1 ENDDO ENDIF C read the insertion list COLD check if next line (last line in a HSSP-file) contains a '//' READ(IUNIT,'(A)',ERR=99)LINE IF (INDEX (LINE,'## INSERTION') .NE. 0) THEN READ(IUNIT,'(A)',ERR=99)LINE READ(IUNIT,'(A)',ERR=99)LINE IINS=0 IPOINT=1 DO WHILE (LINE(1:2) .NE. '//') CALL STRPOS(LINE,IBEG,IEND) IF (LINE(6:6) .NE. '+') THEN IF (IINS+1 .GT. MAXINS) THEN WRITE(6,*)'*** ERROR: MAXINS OVERFLOW, INCREASE !' GOTO 99 ENDIF IINS=IINS+1 INSPOINTER(IINS)=IPOINT READ(LINE,'(4(I6))')INSALI(IINS),INSBEG_1(IINS), + INSBEG_2(IINS),INSLEN(IINS) IF (IPOINT + INSLEN(IINS)+3 .GT. MAXINSBUFFER) THEN WRITE(6,*) + '*** ERROR: MAXINSBUFFER overflow, increase !' GOTO 99 c else c insbuffer(ipoint:)=line(26:iend) c ipoint=ipoint+inslen(iins)+3 ENDIF c else c call strpos(insbuffer,ipos,jpos) c insbuffer(jpos+1:)=line(26:iend) ENDIF c changed DO I=26,IEND INSBUFFER(IPOINT)=LINE(I:I) IPOINT=IPOINT+1 ENDDO c end change READ(IUNIT,'(A)',ERR=99)LINE ENDDO WRITE(6,*)' INSERTION list done' INSNUMBER=IINS ELSE IF (LINE(1:2) .NE. '//') THEN WRITE(6,'(A,A)')'*** READHSSP: missing line "//"' GOTO 99 ENDIF CLOSE(IUNIT) CALL STRPOS(HSSPFILE,IBEG,IEND) WRITE(6,'(A,A,A)')' ReadHSSP: ',HSSPFILE(IBEG:IEND),' OK' RETURN 99 WRITE(6,'(A,A)')'*** ERROR in READHSSP reading: ',HSSPFILE ERROR=.TRUE. NALIGN=0 SEQLENGTH=0 RETURN END C END READHSSP C...................................................................... C...................................................................... C SUB READPROFILE SUBROUTINE READPROFILE(KPROF,PROFILENAME,MAXRES,NTRANS,TRANS, + LDSSP,NRES,NCHAIN,HSSPID,HEADER,COMPOUND,SOURCE, + AUTHOR,SMIN,SMAX,MAPLOW,MAPHIGH,METRICFILE,PDBNO, + CHAINID,SEQ,STRUC,ACC,COLS,SHEETLABEL,BP1,BP2, + NOCC,GAPOPEN,GAPELONG,CONSWEIGHT,PROFILEMETRIC, + MAXBOX,NBOX,PROFILEBOX) IMPLICIT NONE C order of amino acids INTEGER NTRANS CHARACTER*(*) TRANS LOGICAL LDSSP INTEGER NACID PARAMETER (NACID= 20) INTEGER KPROF,MAXRES,NRES,ACC(MAXRES),BP1(MAXRES), + BP2(MAXRES),NOCC(MAXRES),NCHAIN,PDBNO(MAXRES) REAL PROFILEMETRIC(MAXRES,NTRANS),GAPOPEN(MAXRES), + GAPELONG(MAXRES),CONSWEIGHT(MAXRES), + SMIN,SMAX,MAPLOW,MAPHIGH CHARACTER*(*) HSSPID,HEADER,COMPOUND,SOURCE,AUTHOR,METRICFILE, + PROFILENAME,SEQ(MAXRES),STRUC(MAXRES), + CHAINID(MAXRES) CHARACTER*7 COLS(MAXRES) CHARACTER*1 SHEETLABEL(MAXRES) CHARACTER*300 LINE INTEGER MAXBOX,NBOX,PROFILEBOX(MAXBOX,2) C internal INTEGER I,J,K,IBOX CHARACTER CDIVIDE1,CDIVIDE2 LOGICAL LERROR *----------------------------------------------------------------------* C init LDSSP=.FALSE. LINE=' ' CDIVIDE1=':' CDIVIDE2='-' SMIN=0.0 SMAX=0.0 MAPLOW=0.0 MAPHIGH=0.0 DO I=1,MAXRES PDBNO(I)=0 CHAINID(I)=' ' SEQ(I)=' ' STRUC(I)=' ' COLS(I)=' ' BP1(I)=0 BP2(I)=0 SHEETLABEL(I)=' ' ACC(I)=0 NOCC(I)=0 GAPOPEN(I)=0.0 GAPELONG(I)=0.0 CONSWEIGHT(I)=0.0 DO J=1,NTRANS PROFILEMETRIC(I,J)=0.0 ENDDO ENDDO NBOX=1 DO I=1,MAXBOX PROFILEBOX(I,1)=0 PROFILEBOX(I,2)=0 ENDDO C====================================================================== CALL OPEN_FILE(KPROF,PROFILENAME,'OLD,RECL=2000,readonly', + LERROR) READ(KPROF,'(A)')LINE IF (INDEX(LINE,'-PROFILE').EQ.0) THEN WRITE(6,'(A,A)') + '*** ERROR: file is not a proper MAXHOM-PROFILE: ',profilename STOP ELSE IF (INDEX(LINE,'SECONDARY').NE.0) THEN LDSSP=.TRUE. ENDIF ENDIF C search for keywords C "SMIN" and "SMAX" scale metric C "MAPLOW" and "MAPHIGH" C if MAPLOW and MAPHIGH are specified the profile is rescaled C such that the profile values are mapped between MAPLOW and C MAPHIGH to ingnore outsider values C (fx. MAPHIGH=mean-value + standart-deviation) DO WHILE(INDEX(LINE,'SeqNo PDBNo AA STRUCTURE BP1 BP2').EQ.0) LINE=' ' READ(KPROF,'(A)')LINE c read(kprof,'(a)',end=999)line CALL EXTRACT_STRING(LINE,CDIVIDE1,'ID',HSSPID) CALL EXTRACT_STRING(LINE,CDIVIDE1,'HEADER',HEADER) CALL EXTRACT_STRING(LINE,CDIVIDE1,'COMPOUND',COMPOUND) CALL EXTRACT_STRING(LINE,CDIVIDE1,'SOURCE',SOURCE) CALL EXTRACT_STRING(LINE,CDIVIDE1,'AUTHOR',AUTHOR) CALL EXTRACT_STRING(LINE,CDIVIDE1,'METRIC',METRICFILE) CALL EXTRACT_INTEGER(LINE,CDIVIDE1,'NRES',NRES) CALL EXTRACT_INTEGER(LINE,CDIVIDE1,'NCHAIN',NCHAIN) CALL EXTRACT_INTEGER(LINE,CDIVIDE1,'NBOX',NBOX) CALL EXTRACT_REAL(LINE,CDIVIDE1,'SMIN',SMIN) CALL EXTRACT_REAL(LINE,CDIVIDE1,'SMAX',SMAX) CALL EXTRACT_REAL(LINE,CDIVIDE1,'MAPLOW',MAPLOW) CALL EXTRACT_REAL(LINE,CDIVIDE1,'MAPHIGH',MAPHIGH) ENDDO C read BOX definition IF (NBOX .GT. 1) THEN REWIND(KPROF) LINE=' ' IBOX=0 DO WHILE(INDEX(LINE,'SeqNo PDBNo AA STRUCTURE BP1 BP2').EQ.0) LINE=' ' READ(KPROF,'(A)',END=999)LINE IF (LINE(1:3).EQ.'BOX') THEN IBOX=IBOX+1 CALL EXTRACT_INTEGER_RANGE(LINE,CDIVIDE1,CDIVIDE2, + PROFILEBOX(IBOX,1)) ENDIF ENDDO IF (IBOX .NE. NBOX) THEN WRITE(6,*)' ERROR: number of boxes does not match number'// + ' of box specification' WRITE(6,*)NBOX,IBOX STOP ENDIF ELSE PROFILEBOX(NBOX,1)=1 PROFILEBOX(NBOX,2)=NRES ENDIF LINE=' ' I=0 READ(KPROF,'(A)')LINE DO WHILE(LINE(1:2).NE.'//') I=I+1 IF (I.GT.MAXRES) THEN WRITE(6,'(A)') + ' *** ERROR IN READROFILE: NRES.GT.MAXRES' STOP ENDIF c WRITE(6,*)line c read(line,100,end=999)pdbno(i),chainid(i),seq(i), c + struc(i),cols(i),bp1(i),bp2(i),sheetlabel(i),acc(i), c + nocc(i),gapopen(i),gapelong(i),consweight(i), c + (profilemetric(i,j),j=1,nacid) c read(line,100,err=999,end=999)pdbno(i),chainid(i),seq(i), READ(LINE,100)PDBNO(I),CHAINID(I),SEQ(I), + STRUC(I),COLS(I),BP1(I),BP2(I),SHEETLABEL(I),ACC(I), + NOCC(I),GAPOPEN(I),GAPELONG(I),CONSWEIGHT(I), + (PROFILEMETRIC(I,J),J=1,NACID) 100 FORMAT(6X,1X,I4,1X,A1,1X,A1,2X,A1,1X,A7,2(I4),A1,2(I4,1X), + 2(F6.2),F7.2,20(F8.3)) READ(KPROF,'(A)')LINE ENDDO IF (I .NE. NRES) THEN WRITE(6,*) ' ********************************************' WRITE(6,*) ' FATAL ERROR' WRITE(6,*) ' Heee, number of positions read in is: ',i WRITE(6,*) ' NRES in Header is: ',nres STOP ENDIF CLOSE(KPROF) C add 'B' 'Z' 'X' '!' '-' '.' I=INDEX(TRANS,'N') J=INDEX(TRANS,'B') DO K=1,NRES PROFILEMETRIC(K,J)=PROFILEMETRIC(K,I) ENDDO I=INDEX(TRANS,'Q') J=INDEX(TRANS,'Z') DO K=1,NRES PROFILEMETRIC(K,J)=PROFILEMETRIC(K,I) ENDDO RETURN C read error 999 CLOSE(KPROF) WRITE(6,*)'*** ERROR: read error in MAXHOM-PROFILE' NRES=0 RETURN END C END READROFILE C...................................................................... C...................................................................... C SUB RECEIVE_DATA_FROM_HOST SUBROUTINE RECEIVE_DATA_FROM_HOST(ILINK) C node routine: get all relevant information about sequence 1 and C control flow C IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' INTEGER ILINK C internal CHARACTER CPARSYTEC_BUG*(MAXSQ) INTEGER ISIZE,I INTEGER ILBACKWARD,ILINSERT_2,ILISTOFSEQ_2,ILSHOW_SAMESEQ, + ILSWISSBASE,ILDSSP_1,ILCONSERV_1,ILCONSERV_2, + ILCONSIMPORT,ILALL,ILFORMULA,ILTHRESHOLD, + ILCOMPSTR,ILPASS2,ILTRACE,ILONG_OUT,ILBATCH, + I3WAY,I3WAYDONE,IWARM_START,IBINARY C INTEGER ILMIXED_ARCH, C init logicals ILBACKWARD=0 ILINSERT_2=0 ILISTOFSEQ_2=0 ILSHOW_SAMESEQ=0 ILSWISSBASE=0 ILDSSP_1=0 ILCONSERV_1=0 ILCONSERV_2=0 ILCONSIMPORT=0 ILALL=0 ILFORMULA=0 ILTHRESHOLD=0 ILCOMPSTR=0 ILPASS2=0 ILTRACE=0 ILONG_OUT=0 ILBATCH=0 C ILMIXED_ARCH=0 I3WAY=0 I3WAYDONE=0 IWARM_START=0 IBINARY=0 LBACKWARD = .FALSE. LINSERT_2 = .FALSE. LISTOFSEQ_2 = .FALSE. LSHOW_SAMESEQ = .FALSE. LSWISSBASE = .FALSE. LDSSP_1 = .FALSE. LCONSERV_1 = .FALSE. LCONSERV_2 = .FALSE. LCONSIMPORT = .FALSE. LALL = .FALSE. LFORMULA = .FALSE. LTHRESHOLD = .FALSE. LCOMPSTR = .FALSE. LPASS2 = .FALSE. LTRACE = .FALSE. LONG_OUT = .FALSE. LBATCH = .FALSE. L3WAY=.FALSE. L3WAYDONE=.FALSE. LWARM_START=.FALSE. LBINARY=.FALSE. C LMIXED_ARCH=.FALSE. C INIT WRITE(6,*)' receive data start 1: ',idproc CALL FLUSH_UNIT(6) MSGTYPE=1 c if (mp_model .eq. 'PARIX') then ; msgtype=idtop ; endif CALL MP_RECEIVE_DATA(MSGTYPE,LINK(ID_HOST)) CALL MP_GET_INT4(MSGTYPE,ILINK,ID_HOST,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,N1,N_ONE) IF (N1 .GT. 0) THEN ISIZE=N1 CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,LSQ_1,ISIZE) CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,LSTRUC_1,ISIZE) CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,LSTRCLASS_1,ISIZE) CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,LACC_1,ISIZE) ISIZE=MAXBREAK CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,IBREAKPOS_1,ISIZE) CALL MP_GET_INT4(MSGTYPE,ILINK,NBREAK_1,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,NBEST,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,IPROFBEG,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,IPROFEND,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,PROFILEMODE,N_ONE) ISIZE=NASCII CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,TRANSPOS,ISIZE) ISIZE=MAXCUTOFFSTEPS CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,ISOLEN,ISIZE) CALL MP_GET_INT4(MSGTYPE,ILINK,NSTEP,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ISAFE,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,NSTRSTATES_1,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,NSTRSTATES_2,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,NIOSTATES_1,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,NIOSTATES_2,N_ONE) ISIZE=N1 CALL MP_GET_INT4_ARRAY(MSGTYPE,ILINK,PDBNO_1,ISIZE) ISIZE=MAXCUTOFFSTEPS CALL MP_GET_REAL4_ARRAY(MSGTYPE,ILINK,ISOIDE,ISIZE) ISIZE=N1 CALL MP_GET_REAL4_ARRAY(MSGTYPE,ILINK,GAPOPEN_1,ISIZE) CALL MP_GET_REAL4_ARRAY(MSGTYPE,ILINK,GAPELONG_1,ISIZE) CALL MP_GET_REAL4(MSGTYPE,ILINK,OPEN_1,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,ELONG_1,N_ONE) CALL MP_GET_REAL4_ARRAY(MSGTYPE,ILINK,CONSWEIGHT_1,ISIZE) ISIZE=MAXSQ*NTRANS CALL MP_GET_REAL4_ARRAY(MSGTYPE,ILINK,SIMMETRIC_1,ISIZE) IF (PROFILEMODE .EQ. 6) THEN ISIZE= NTRANS * NTRANS * MAXSTRSTATES * MAXIOSTATES * + MAXSTRSTATES*MAXIOSTATES CALL MP_GET_REAL4_ARRAY(MSGTYPE,ILINK,SIMORG,ISIZE) ENDIF CALL MP_GET_REAL4(MSGTYPE,ILINK,FILTER_VAL,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,PUNISH,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,CUTVALUE1,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,CUTVALUE2,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,SMIN,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,SMAX,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,MAPLOW,N_ONE) CALL MP_GET_REAL4(MSGTYPE,ILINK,MAPHIGH,N_ONE) ISIZE=MAXSTRSTATES*MAXIOSTATES CALL MP_GET_REAL4_ARRAY(MSGTYPE,ILINK,IORANGE,ISIZE) ISIZE=LEN(MP_MODEL) CALL MP_GET_STRING(MSGTYPE,ILINK,MP_MODEL,ISIZE) ISIZE=LEN(SPLIT_DB_PATH) CALL MP_GET_STRING(MSGTYPE,ILINK,SPLIT_DB_PATH,ISIZE) ISIZE=LEN(SPLIT_DB_DATA) CALL MP_GET_STRING(MSGTYPE,ILINK,SPLIT_DB_DATA,ISIZE) ISIZE=LEN(SWISSPROT_SEQ) CALL MP_GET_STRING(MSGTYPE,ILINK,SWISSPROT_SEQ,ISIZE) ISIZE=LEN(LISTFILE_2) CALL MP_GET_STRING(MSGTYPE,ILINK,LISTFILE_2,ISIZE) ISIZE=LEN(CSQ_1) CALL MP_GET_STRING(MSGTYPE,ILINK,CSQ_1,ISIZE) c isize=MAXSQ c call mp_get_string_array(msgtype,ilink,struc_1,isize) c call mp_get_string_array(msgtype,ilink,chainid_1,isize) ISIZE=N1 CALL MP_GET_STRING(MSGTYPE,ILINK,CPARSYTEC_BUG,ISIZE) DO I=1,N1 STRUC_1(I)=CPARSYTEC_BUG(I:I) ENDDO CALL MP_GET_STRING(MSGTYPE,ILINK,CPARSYTEC_BUG,ISIZE) DO I=1,N1 CHAINID_1(I)=CPARSYTEC_BUG(I:I) ENDDO ISIZE=LEN(OPENWEIGHT_ANSWER) CALL MP_GET_STRING(MSGTYPE,ILINK,OPENWEIGHT_ANSWER,ISIZE) ISIZE=LEN(ELONGWEIGHT_ANSWER) CALL MP_GET_STRING(MSGTYPE,ILINK,ELONGWEIGHT_ANSWER,ISIZE) ISIZE=LEN(SMIN_ANSWER) CALL MP_GET_STRING(MSGTYPE,ILINK,SMIN_ANSWER,ISIZE) ISIZE=LEN(NAME_1) CALL MP_GET_STRING(MSGTYPE,ILINK,NAME_1,ISIZE) ISIZE=LEN(HSSPID_2) CALL MP_GET_STRING(MSGTYPE,ILINK,HSSPID_2,ISIZE) ISIZE=LEN(CSORTMODE) CALL MP_GET_STRING(MSGTYPE,ILINK,CSORTMODE,ISIZE) ISIZE=LEN(METRICFILE) CALL MP_GET_STRING(MSGTYPE,ILINK,METRICFILE,ISIZE) ISIZE=LEN(CURRENT_DIR) CALL MP_GET_STRING(MSGTYPE,ILINK,CURRENT_DIR,ISIZE) ISIZE=LEN(DSSP_PATH) CALL MP_GET_STRING(MSGTYPE,ILINK,DSSP_PATH,ISIZE) ISIZE=LEN(PDBPATH) CALL MP_GET_STRING(MSGTYPE,ILINK,PDBPATH,ISIZE) ISIZE=LEN(PLOTFILE) CALL MP_GET_STRING(MSGTYPE,ILINK,PLOTFILE,ISIZE) ISIZE=LEN(COREPATH) CALL MP_GET_STRING(MSGTYPE,ILINK,COREPATH,ISIZE) ISIZE=LEN(COREFILE) CALL MP_GET_STRING(MSGTYPE,ILINK,COREFILE,ISIZE) ISIZE=LEN(TRANS) CALL MP_GET_STRING(MSGTYPE,ILINK,TRANS,ISIZE) ISIZE=LEN(STRTRANS) CALL MP_GET_STRING(MSGTYPE,ILINK,STRTRANS,ISIZE) ISIZE=LEN(CSTRSTATES) CALL MP_GET_STRING(MSGTYPE,ILINK,CSTRSTATES,ISIZE) ISIZE=LEN(CIOSTATES) CALL MP_GET_STRING(MSGTYPE,ILINK,CIOSTATES,ISIZE) DO I=1,MAXSTRSTATES ISIZE=LEN(STR_CLASSES(I)) CALL MP_GET_STRING(MSGTYPE,ILINK,STR_CLASSES(I),ISIZE) ENDDO C CALL MP_GET_INT4(MSGTYPE,ILINK,ILMIXED_ARCH,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILBACKWARD,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILINSERT_2,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILISTOFSEQ_2,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILSHOW_SAMESEQ,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILSWISSBASE,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILDSSP_1,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILCONSERV_1,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILCONSERV_2,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILCONSIMPORT,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILALL,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILFORMULA,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILTHRESHOLD,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILCOMPSTR,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILPASS2,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILTRACE,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILONG_OUT,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,ILBATCH,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,I3WAY,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,I3WAYDONE,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,IWARM_START,N_ONE) CALL MP_GET_INT4(MSGTYPE,ILINK,IBINARY,N_ONE) C IF ( ILMIXED_ARCH .EQ. 1 )LMIXED_ARCH = .TRUE. IF ( ILBACKWARD .EQ. 1 )LBACKWARD = .TRUE. IF ( ILINSERT_2 .EQ. 1 )LINSERT_2 = .TRUE. IF ( ILISTOFSEQ_2 .EQ. 1 )LISTOFSEQ_2 = .TRUE. IF ( ILSHOW_SAMESEQ .EQ. 1 )LSHOW_SAMESEQ = .TRUE. IF ( ILSWISSBASE .EQ. 1 )LSWISSBASE = .TRUE. IF ( ILDSSP_1 .EQ. 1 )LDSSP_1 = .TRUE. IF ( ILCONSERV_1 .EQ. 1 )LCONSERV_1 = .TRUE. IF ( ILCONSERV_2 .EQ. 1 )LCONSERV_2 = .TRUE. IF ( ILCONSIMPORT .EQ. 1 )LCONSIMPORT = .TRUE. IF ( ILALL .EQ. 1 )LALL = .TRUE. IF ( ILFORMULA .EQ. 1 )LFORMULA = .TRUE. IF ( ILTHRESHOLD .EQ. 1 )LTHRESHOLD = .TRUE. IF ( ILCOMPSTR .EQ. 1 )LCOMPSTR = .TRUE. IF ( ILPASS2 .EQ. 1 )LPASS2 = .TRUE. IF ( ILTRACE .EQ. 1 )LTRACE = .TRUE. IF ( ILONG_OUT .EQ. 1 )LONG_OUT = .TRUE. IF ( ILBATCH .EQ. 1 )LBATCH = .TRUE. IF ( I3WAY .EQ. 1 )L3WAY = .TRUE. IF ( I3WAYDONE .EQ. 1 )L3WAYDONE = .TRUE. IF ( IWARM_START .EQ. 1 )LWARM_START = .TRUE. IF ( IBINARY .EQ. 1 )LBINARY = .TRUE. ENDIF WRITE(6,*)' receive data OK: ',idproc CALL FLUSH_UNIT(6) RETURN END C END RECEIVE_DATA_FROM_HOST C...................................................................... C...................................................................... C SUB REPORTPIECES SUBROUTINE REPORTPIECES PARAMETER (MXPIECES= 50) COMMON/CPIECE/IPRESPIE(2,2,MXPIECES),NPIECES,NRESPIE(2), + NATMPIE(2) CALL CHECKRANGE(NPIECES,1,MXPIECES,'NPIECES ','REPORTPIEC') C IPRESPIE(1/2, molA/molB, IPIECE) WRITE(6,*)'------- you chose ---------' WRITE(6,'(I10,A10)') NPIECES,' pieces ' WRITE(6,*)'---------------------------' WRITE(6,*)' mol A mol B' WRITE(6,*)' from...to from...to ' WRITE(6,*)'----------------------------' DO IPIECE=1,NPIECES WRITE(6,'(I3,1x,2I5,5X,2I5)') IPIECE, + ( (IPRESPIE(I,M,IPIECE),I=1,2), M=1,2) C FOR IPIECE=1,NPIECES ENDDO WRITE(6,*)'----------------------------' RETURN END C END REPORTPIECES C...................................................................... C...................................................................... C SUB RightADJUST SUBROUTINE RIGHTADJUST(STRING,NDIM,NLEN) C right-adjust of astring CHARACTER*(*) STRING INTEGER NDIM, NLEN, l,il C find position of last non-blank IF (NDIM.LT.1.OR.NLEN.LT.1) RETURN IF (NDIM .gt. 1 ) STOP' update routine rightadjust' L=NLEN DO WHILE(STRING(L:L) .EQ. ' ' .AND. L .GT. 1) L=L-1 ENDDO IF (L .LT. NLEN) THEN C L is position of last non-blank STRING(NLEN-L+1:NLEN)=STRING(1:L) C fill rest with blanks from 1 to NLEN-L DO IL=1,NLEN-L STRING(IL:IL)=' ' ENDDO ENDIF c DO I=1,NDIM ! for each string c L=NLEN c DO WHILE(STRINGS(I)(L:L).EQ.' '.AND.L.GT.1) c L=L-1 c ENDDO c IF (L.LT.NLEN) THEN C L is position of last non-blank c STRINGS(I)(NLEN-L+1:NLEN)=STRINGS(I)(1:L) C fill rest with blanks from 1 to NLEN-L c DO IL=1,NLEN-L c STRINGS(I)(IL:IL)=' ' c ENDDO c ENDIF c ENDDO RETURN END C END RightADJUST C...................................................................... C...................................................................... C SUB S3TOS1 SUBROUTINE S3TOS1(SEQ3,SEQ1,NRES) C TRANSLATES A3 TO A1 AND VICE VERSA. CHRIS SANDER MAY 1983 C INPUT/OUTPUT CHARACTER SEQ3(*)*3,SEQ1(*)*1 INTEGER NRES C LOCAL CHARACTER AA3(24)*3, AA1(24)*1 DATA AA3/'GLY','PRO','ASP','GLU','ALA','ASN','GLN','SER', + 'THR','LYS','ARG','HIS','VAL','ILE','MET','CYS', + 'LEU','PHE','TYR','TRP','ASX','GLX','---','!!!'/ DATA AA1/'G','P','D','E','A','N','Q','S','T','K', + 'R','H','V','I','M','C','L','F','Y','W','B','Z','-','!'/ C 'X' OR 'XYZ' FOR NON-STANDARD OR UNKNOWN AMINO ACID RESIDUES DO I=1,NRES DO J=1,24 CD WRITE(6,*)'S3TOS1: ',SEQ3(I),I,' =?= ',AA3(J),J IF (SEQ3(I).EQ.AA3(J)) THEN SEQ1(I)=AA1(J) GOTO 9 ENDIF ENDDO SEQ1(I)='X' WRITE(6,100) SEQ3(I),SEQ1(I) WRITE(6,*)' legal residues are: ' WRITE(6,*) (AA3(J),J=1,24) 9 CONTINUE ENDDO 100 FORMAT(' UNUSUAL RESIDUE NAME <',A3,'> TRANSLATED TO <',A1,'>') C c ENTRY S1TOS3(SEQ3,SEQ1,NRES) c DO I=1,NRES c DO J=1,24 c IF (SEQ1(I).EQ.AA1(J)) THEN c SEQ3(I)=AA3(J) c GOTO 99 c ENDIF c ENDDO c SEQ3(I)='XYZ' c WRITE(6,100) SEQ1(I),SEQ3(I) c99 CONTINUE c ENDDO RETURN END C END S3TOS1 C...................................................................... C...................................................................... C SUB S1TOS3 SUBROUTINE S1TOS3(SEQ3,SEQ1,NRES) C TRANSLATES A3 TO A1 AND VICE VERSA. CHRIS SANDER MAY 1983 C INPUT/OUTPUT CHARACTER SEQ3(*)*3,SEQ1(*)*1 INTEGER NRES C LOCAL CHARACTER AA3(24)*3, AA1(24)*1 DATA AA3/'GLY','PRO','ASP','GLU','ALA','ASN','GLN','SER', + 'THR','LYS','ARG','HIS','VAL','ILE','MET','CYS', + 'LEU','PHE','TYR','TRP','ASX','GLX','---','!!!'/ DATA AA1/'G','P','D','E','A','N','Q','S','T','K', + 'R','H','V','I','M','C','L','F','Y','W','B','Z','-','!'/ C 'X' OR 'XYZ' FOR NON-STANDARD OR UNKNOWN AMINO ACID RESIDUES DO I=1,NRES DO J=1,24 IF (SEQ1(I).EQ.AA1(J)) THEN SEQ3(I)=AA3(J) GOTO 99 ENDIF ENDDO SEQ3(I)='XYZ' WRITE(6,100) SEQ1(I),SEQ3(I) 99 CONTINUE ENDDO 100 FORMAT(' UNUSUAL RESIDUE NAME <',A3,'> TRANSLATED TO <',A1,'>') RETURN END C END S1TOS3 C...................................................................... C...................................................................... C SUB SCALE_PROFILE_METRIC SUBROUTINE SCALE_PROFILE_METRIC(MAXRES,NTRANS,TRANS, + PROFILEMETRIC,SMIN,SMAX,MAPLOW,MAPHIGH) C======================================================================= C scale profile metric according to SMIN,SMAX,MAPLOW,MAPHIGH C profilemetric is sim(maxres,26) C======================================================================= IMPLICIT NONE INTEGER MAXRES,NTRANS REAL PROFILEMETRIC(MAXRES,NTRANS) REAL SMIN,SMAX,MAPLOW,MAPHIGH CHARACTER*(*) TRANS C internal INTEGER NN,I,J,K,L,M NN=MAXRES*NTRANS C======================================================================= C reset value for chain breaks etc... C add 'X' '!' and "-" J=INDEX(TRANS,'X') K=INDEX(TRANS,'!') L=INDEX(TRANS,'-') M=INDEX(TRANS,'.') IF (J.EQ.0 .OR. K.EQ.0 .OR. L.EQ.0 .or. M.eq. 0) THEN WRITE(6,*)'*** ERROR: "X","!","-" or "." unknown in '// + 'SCALE_PROFILE_METRIC' ENDIF DO I=1,MAXRES PROFILEMETRIC(I,J)=0.0 PROFILEMETRIC(I,K)=0.0 PROFILEMETRIC(I,L)=0.0 PROFILEMETRIC(I,M)=0.0 ENDDO CALL SCALEINTERVAL(PROFILEMETRIC,NN,SMIN,SMAX,MAPLOW,MAPHIGH) C======================================================================= C reset value for chain breaks etc... C add 'X' '!' and "-" J=INDEX(TRANS,'X') K=INDEX(TRANS,'!') L=INDEX(TRANS,'-') M=INDEX(TRANS,'.') IF (J.EQ.0 .OR. K.EQ.0 .OR. L.EQ.0 .or. M.eq. 0) THEN WRITE(6,*)'*** ERROR: "X","!","-" or "." unknown in '// + 'SCALE_PROFILE_METRIC' ENDIF DO I=1,MAXRES PROFILEMETRIC(I,J)=0.0 PROFILEMETRIC(I,K)=-200.0 PROFILEMETRIC(I,L)=0.0 PROFILEMETRIC(I,M)=0.0 ENDDO C======================================================================= C DEBUG: WRITE MATRIX IN OUTPUT-FILE C======================================================================= c OPEN(99,FILE='METRIC_DEBUG.X',STATUS='NEW',RECL=500) c DO I=1,50 c WRITE(99,'(1X,26(F7.2))')(PROFILEMETRIC(I,J),J=1,NTRANS) c ENDDO c CLOSE(99) C======================================================================= RETURN END C END SCALE_PROFILE_METRIC C...................................................................... C...................................................................... C SUB SCALEMETRIC SUBROUTINE SCALEMETRIC(NTRANS,TRANS,MAXSTRSTATES, + MAXIOSTATES,SIMMETRIC,SMIN,SMAX,MAPLOW,MAPHIGH) C======================================================================= C scale matrix according to SMIN,SMAX,MAPLOW,MAPHIGH C======================================================================= IMPLICIT NONE INTEGER NTRANS,MAXSTRSTATES,MAXIOSTATES REAL SIMMETRIC(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) REAL SMIN,SMAX,MAPLOW,MAPHIGH CHARACTER*(*) TRANS C internal INTEGER NN,I,J,istr1,io1,istr2,io2 NN= (NTRANS * NTRANS) * (MAXSTRSTATES * MAXSTRSTATES) * + (MAXIOSTATES * MAXIOSTATES) CALL SCALEINTERVAL(SIMMETRIC,NN,SMIN,SMAX,MAPLOW,MAPHIGH) C======================================================================= C reset value for chain breaks etc... C add 'X' I=INDEX(TRANS,'X') IF (I.EQ.0) THEN WRITE(6,*)'*** ERROR: "X" unknown in SCALEMETRIC' STOP ENDIF DO J=1,NTRANS DO ISTR1=1,MAXSTRSTATES DO IO1=1,MAXIOSTATES DO ISTR2=1,MAXSTRSTATES DO IO2=1,MAXIOSTATES SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO C add '!' I=INDEX(TRANS,'!') IF (I.EQ.0) THEN WRITE(6,*)'*** ERROR: "!" unknown in SCALEMETRIC' STOP ENDIF DO J=1,NTRANS DO ISTR1=1,MAXSTRSTATES DO IO1=1,MAXIOSTATES DO ISTR2=1,MAXSTRSTATES DO IO2=1,MAXIOSTATES SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO C add '-' I=INDEX(TRANS,'-') IF (I.EQ.0) THEN WRITE(6,*)'*** ERROR: "-" unknown in SCALEMETRIC' STOP ENDIF DO J=1,NTRANS DO ISTR1=1,MAXSTRSTATES DO IO1=1,MAXIOSTATES DO ISTR2=1,MAXSTRSTATES DO IO2=1,MAXIOSTATES SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO C add '.' c I=INDEX(TRANS,'.') c IF (I.EQ.0) THEN c WRITE(6,*)'*** ERROR: "." unknown in SCALEMETRIC' c STOP c ENDIF c DO J=1,NTRANS c DO istr1=1,MAXSTRSTATES c DO io1=1,MAXIOSTATES c DO istr2=1,MAXSTRSTATES c DO io2=1,MAXIOSTATES c SIMMETRIC(I,J,istr1,io1,istr2,io2)=0.0 c SIMMETRIC(j,i,istr1,io1,istr2,io2)=0.0 c enddo c enddo c ENDDO c ENDDO c ENDDO C======================================================================= C DEBUG: WRITE MATRIX IN OUTPUT-FILE C======================================================================= c open(99,file='METRIC_DEBUG.X',status='NEW') c istr1=1 c io1=1 c istr2=1 c io2=1 c do i=1,ntrans c do istr1=1,maxstrstates c do io1=1,maxiostates c do istr2=1,maxstrstates c do io2=1,maxiostates c write(99,'(1x,a1,4(1x,i3),5x,26(f6.2))') c + trans(i:i),istr1,io1,istr2,io2, c + (simmetric(i,j,istr1,io1,istr2,io2),j=1,ntrans) c enddo c enddo c enddo c enddo c enddo c close(99) C======================================================================= RETURN END C END SCALEMETRIC C...................................................................... C...................................................................... C SUB SCALEINTERVAL SUBROUTINE SCALEINTERVAL(S,N,SMIN,SMAX,MAPLOW,MAPHIGH) C imported: old values in S(1..N) C maplow and maphigh C target limits SMAX, SMIN C C exported: new values in S(1..N) C internal: SHI, SLO C SHI.........*.........SLO map this interval onto C SMAX...*...SMIN this interval or C MAPLOW MAPHIGH C REAL S(*),MAPLOW,MAPHIGH,SMIN,SMAX,SHI,SLO SHI=-1.0E+10 SLO=1.0E+10 C IF (SMIN.EQ.0.0 .AND. SMAX.EQ.0.0 .AND. + MAPLOW.EQ.0.0 .AND. MAPHIGH.EQ.0.0) THEN WRITE(6,*)' SCALEINTERVAL: NO SCALING ' RETURN ENDIF IF (MAPLOW.EQ.0.0 .AND. MAPHIGH.EQ.0.0) THEN c WRITE(6,*)' SCALEINTERVAL: scale between SMIN/SMAX' DO I=1,N IF (S(I) .GT. SHI)SHI=S(I) IF (S(I) .LT. SLO)SLO=S(I) ENDDO ELSE WRITE(6,*)' SCALEINTERVAL: scale between MAPLOW/MAPHIGH' SHI=MAPHIGH SLO=MAPLOW ENDIF c WRITE(6,*)'high/low: ',shi,slo,n,(SHI-SLO),(SMAX-SMIN)+SMIN DO I=1,N S(I)=((S(I)-SLO)/(SHI-SLO))*(SMAX-SMIN)+SMIN ENDDO c WRITE(6,'(20F5.2)')(S(I),I=1,N) RETURN END C END SCALEINTERVAL C...................................................................... C...................................................................... C SUB SECSTRUC_TO_3_STATE SUBROUTINE SECSTRUC_TO_3_STATE(SECSTRUC,CLASS,ICLASS) C convert DSSP-secondary structure symbol to 3-state (L,H,E) secondary C structure C given SECSTRUC, what is the class number ICLASS and class C representative CLASS ? C undefined states is set CLASS='U', ICLASS=0 C C input CHARACTER SECSTRUC C output CHARACTER CLASS INTEGER ICLASS C internal c INTEGER MAXSTRSTATES c PARAMETER (MAXSTRSTATES=3) CHARACTER*25 STATES c 1234567890123456789012345 STATES='L TCSltcsEBAPMebapmHGIhgi' c DATA STATES/'L TCStclss','EBAPMebapm','HGIhgiiiii'/ c CHARACTER STATES(MAXSTRSTATES)*10 C====================================================================== ICLASS=0 CLASS='U' I=INDEX(STATES,SECSTRUC) IF (I .NE. 0) THEN IF (I .LE. 9) THEN ICLASS=1 CLASS='L' RETURN ELSE IF (I .GE. 10 .AND. I .LE. 19) THEN ICLASS=10 CLASS='E' RETURN ELSE IF (I .GE. 20 .AND. I .LE. 25) THEN ICLASS=20 CLASS='H' RETURN ENDIF ENDIF c DO K=1,MAXSTRSTATES c IF (INDEX(STATES(K),SECSTRUC).NE.0) THEN c ICLASS=K c CLASS=STATES(K)(1:1) c RETURN c ENDIF c ENDDO RETURN END C END SECSTRUC_TO_3_STATE C...................................................................... C...................................................................... C SUB SELECT_PDB_POINTER SUBROUTINE SELECT_PDB_POINTER(KUNIT,DSSP_PATH,PDBIN,PDBOUT) C selects from a string returned from GETSEQ one pdb-pointer for HSSP C the selection is done by a "best-guess": C C 1.) check if there is a valid DSSP-file C if so, take the latest entry in PDB C 2.) if not 1 then check if it is a C-alpha set C 3.) if not 2 then check if it is a model-structure C C INPUT: pdbin C 1INS; 15-JAN-91 | 2INS; 15-JAN-91 | 3INS; 20-OCT-92 || 3 C or 1NSB; PRELIMINARY. C OUTPUT: pdbout C 1INS if "normal" DSSP-file or C 1INS_C if c-alpha only or C 1INS_M if model structure or C 1INS_P if pre-released structure or C 1INS_? if none of the above cases, like the SwissProt pointer C is pointing to a PDB-file which is gone (renamed) in the C current version of PDB OR C if something is wrong with the "normal" DSSP-file IMPLICIT NONE C input: CHARACTER*(*) PDBIN,DSSP_PATH INTEGER KUNIT C output: CHARACTER*(*) PDBOUT C internal INTEGER MAXPOINTER PARAMETER (MAXPOINTER= 200) INTEGER NPOINTER,SORTNUMBER(MAXPOINTER),IHIGH, + IDSSP_FLAG,ISTART,ISTOP,IPOS,JPOS,IPOINTER, + JPOINTER,KPOINTER,NEXTPOS,IMONTH,IYEAR CHARACTER*12 PDBPOINTER(MAXPOINTER) CHARACTER CTEMP*50,FILENAME*200 CHARACTER CMONTH*36 LOGICAL LERROR C used to convert entry date to sort number DATA CMONTH /'JANFEBMARAPRMAIJUNJULAUGSEPOCTNOVDEC'/ C init *----------------------------------------------------------------------* PDBOUT=' ' CTEMP=' ' IF (PDBIN.EQ.' ')RETURN C extract number of pointers IPOS=INDEX(PDBIN,'||') IF (IPOS .NE. 0) THEN CALL STRPOS(PDBIN,ISTART,ISTOP) CALL READ_INT_FROM_STRING(PDBIN(IPOS+2:ISTOP),NPOINTER) ELSE RETURN ENDIF IF (NPOINTER .LE. 0)RETURN C loop over pdb-pointers IPOS=1 IF (NPOINTER .GT. MAXPOINTER) THEN WRITE(6,*)' SELECT_PDB_POINTER: npointer .gt. maxpointer' WRITE(6,*)' set npointer to maxpointer' NPOINTER= MAXPOINTER ENDIF DO IPOINTER=1,NPOINTER SORTNUMBER(IPOINTER)=0 CTEMP=' ' NEXTPOS=INDEX(PDBIN(IPOS:),'|')+IPOS-1 CTEMP(1:)=PDBIN(IPOS:NEXTPOS-1) JPOS=INDEX(CTEMP,';') PDBPOINTER(IPOINTER)=CTEMP(1:JPOS-1) C extract month and year of pdb entry IF (INDEX(CTEMP,'PRELIM') .EQ. 0) THEN JPOS=INDEX(CTEMP,'-') IMONTH= ( (INDEX(CMONTH,CTEMP(JPOS+1:JPOS+4) )) / 3 )+1 CALL READ_INT_FROM_STRING(CTEMP(JPOS+5:JPOS+6),IYEAR) C build up a sort number C latest entry has largest number: 199201= JAN 1992 C with beginning of the year 2080 or so we have to add a line here :-) IF (IYEAR .GT. 0) THEN SORTNUMBER(IPOINTER)=10000*19 + 100*IYEAR + IMONTH ELSE SORTNUMBER(IPOINTER)=10000*20 + 100*IYEAR + IMONTH ENDIF ENDIF C set line pointer to next entry IPOS=NEXTPOS+1 ENDDO DO JPOINTER=1,NPOINTER IPOINTER=-1 IHIGH=-1 DO KPOINTER=1,NPOINTER IF (SORTNUMBER(KPOINTER) .GE. IHIGH) THEN IHIGH=SORTNUMBER(KPOINTER) IPOINTER=KPOINTER ENDIF ENDDO SORTNUMBER(IPOINTER)=-99 CALL UPTOLOW(PDBPOINTER(IPOINTER),LEN(PDBPOINTER(IPOINTER)) ) C LOOK IF THERE IS A "NORMAL" DSSP-FILE IDSSP_FLAG=4 CALL CONCAT_3STRINGS(DSSP_PATH,PDBPOINTER(IPOINTER),'.dssp', + FILENAME) CALL OPEN_FILE(KUNIT,FILENAME,'old,readonly,silent',lerror) IF (LERROR)GOTO 10 C check if there is something in the file CTEMP=' ' DO WHILE(INDEX(CTEMP,'# RES') .EQ. 0) READ(KUNIT,'(A10)',END=10,ERR=10)CTEMP ENDDO IDSSP_FLAG=0 10 CALL CLOSE_FILE(KUNIT,FILENAME) IF (.NOT. LERROR) GOTO 100 C look if there is C-alpha model set CALL CONCAT_3STRINGS(DSSP_PATH,PDBPOINTER(IPOINTER), + '.dssp_ca',filename) CALL OPEN_FILE(KUNIT,FILENAME,'old,readonly,silent',lerror) CALL CLOSE_FILE(KUNIT,FILENAME) IF (.NOT. LERROR) THEN IDSSP_FLAG=1 GOTO 100 ENDIF C look if there is a model-structure CALL CONCAT_3STRINGS(DSSP_PATH,PDBPOINTER(IPOINTER), + '.dssp_mod',filename) CALL OPEN_FILE(KUNIT,FILENAME,'old,readonly,silent',lerror) CALL CLOSE_FILE(KUNIT,FILENAME) IF (.NOT. LERROR) THEN IDSSP_FLAG=2 GOTO 100 ENDIF C look if there is a pre-released structure CALL CONCAT_3STRINGS(DSSP_PATH,PDBPOINTER(IPOINTER), + '.dssp_pre',filename) CALL OPEN_FILE(KUNIT,FILENAME,'old,readonly,silent',lerror) CALL CLOSE_FILE(KUNIT,FILENAME) IF (.NOT. LERROR) THEN IDSSP_FLAG=3 GOTO 100 ENDIF C set pdbpointer-extension according to selection 100 CALL STRPOS(PDBPOINTER(JPOINTER),ISTART,ISTOP) IF ( IDSSP_FLAG .EQ. 0) THEN PDBOUT=PDBPOINTER(IPOINTER)(ISTART:ISTOP) GOTO 200 ELSE IF ( IDSSP_FLAG .EQ. 1) THEN PDBOUT=PDBPOINTER(IPOINTER)(ISTART:ISTOP)//'_C' ELSE IF ( IDSSP_FLAG .EQ. 2) THEN PDBOUT=PDBPOINTER(IPOINTER)(ISTART:ISTOP)//'_M' ELSE IF ( IDSSP_FLAG .EQ. 3) THEN PDBOUT=PDBPOINTER(IPOINTER)(ISTART:ISTOP)//'_P' ELSE IF ( IDSSP_FLAG .EQ. 4) THEN PDBOUT=PDBPOINTER(IPOINTER)(ISTART:ISTOP)//'_?' ENDIF ENDDO 200 CALL LOWTOUP(PDBOUT,LEN(PDBOUT) ) C RETURN END C END SELECT_PDB_POINTER C...................................................................... C...................................................................... C SUB SELECT_UNIQUE_CHAIN SUBROUTINE SELECT_UNIQUE_CHAIN(KFILE,FILENAME,OUTNAME) C selects unique chains from dssp file, builds up a new filename of the C form: $pdb:4hhb.dssp_!_A,B IMPLICIT NONE INTEGER KFILE CHARACTER*(*) FILENAME,OUTNAME C internal INTEGER MAXRES_LOC PARAMETER (MAXRES_LOC= 9999) C PARAMETER (MAXRES_LOC= 10000) C PARAMETER (MAXRES_LOC= 19876) INTEGER NRES,NCHAIN CHARACTER CRESID(MAXRES_LOC) C CHARACTER*6 CRESID(MAXRES_LOC) CHARACTER CSEQ(MAXRES_LOC) C CHARACTER TRANS*26 INTEGER MAXCHAIN PARAMETER (MAXCHAIN=100) INTEGER IBREAK,IBREAKPOS(0:MAXCHAIN),I,J,ICHAIN,JCHAIN INTEGER ISTART,ISTOP CHARACTER CHAINID(0:MAXCHAIN) CHARACTER CTEMP*100 LOGICAL LALL,LSAME(MAXCHAIN,MAXCHAIN),LTAKE(MAXCHAIN) LOGICAL LERROR C CHARACTER LOWER*26 CHARACTER LINE*(1000) C DONT USE INDEX COMMAND (CPU TIME) C INTEGER NASCII C PARAMETER (NASCII=256) C INTEGER TRANSPOS(NASCII) c init LINE=' ' NRES=1 NCHAIN=1 c lower='abcdefghijklmnopqrstuvwxyz' c TRANS='VLIMFWYGAPSTCHRKQENDBZX!-.' IBREAK=0 IBREAKPOS(0)=0 CHAINID(0)='?' OUTNAME=' ' DO I=1,MAXCHAIN IBREAKPOS(I)=0 CHAINID(I)=' ' LTAKE(I)=.TRUE. DO J=1,MAXCHAIN LSAME(I,J)=.TRUE. ENDDO ENDDO DO I=1,MAXRES_LOC CSEQ(I)=' ' ENDDO CALL OPEN_FILE(KFILE,FILENAME,'readonly,old',LERROR) C READ FROM DSSP READ(KFILE,'(A)',END=199)LINE IF (INDEX(LINE,'SECONDARY').EQ.0) THEN WRITE(6,*)'***select_unique... error: dssp file assumed, ' WRITE(6,*)' but word /secondary/ is missing in first line' RETURN ENDIF c repeat until # 105 READ(KFILE,'(A)',END=199)LINE IF (INDEX(LINE(1:5),'#').EQ.0) GOTO 105 C C23456123451x1 C23456789011x1 Ccccccaaaaaaca C 9 9 A S C 21 21 Y C DO WHILE (.TRUE.) IF (NRES .LE. MAXRES_LOC) THEN READ(KFILE,'(11X,A1,1X,A1)',END=900)CRESID(NRES),CSEQ(NRES) c read(kfile,'(6x,a6,1x,a1)',end=900)cresid(nres),cseq(nres) c convert ss-bridges to 'c'.... c if (index(lower,cseq(nres)) .ne. 0) cseq(nres)='C' IF (CSEQ(NRES) .EQ. '!') THEN NCHAIN=NCHAIN+1 ENDIF c illegal residues c call getindex(cseq(nres),transpos,i) c if (i .le. 0) then c WRITE(6,'(a,a)')'*** seq unknown: ',cseq(nres) c ENDIF NRES=NRES+1 c dimension overflow ELSE WRITE(6,'(a)')'*** error: dimension overflow MAXRES_LOC ***' WRITE(6,*)'truncated to ',nres,' residues' GOTO 900 ENDIF c next line ENDDO C--------------DSSP read error ----------------------------------- 199 WRITE(6,*)'*** incomplete dssp file (eof) ' NRES=0 NCHAIN=0 WRITE(6,*) 'file: ',filename(1:40) CLOSE(KFILE) RETURN c finished reading----------------------- 900 CLOSE(KFILE) NRES=NRES-1 IF (NCHAIN .EQ. 1)RETURN DO I=1,NRES IF (CSEQ(I) .EQ. '!') THEN IBREAK=IBREAK+1 IBREAKPOS(IBREAK)=I ENDIF ENDDO IBREAK=IBREAK+1 CHAINID(1)=CRESID(1) c chainid(1)=cresid(1)(6:6) DO I=1,IBREAK CHAINID(I+1)=CRESID(IBREAKPOS(I)+1) c chainid(i+1)=cresid(ibreakpos(i)+1)(6:6) ENDDO IBREAKPOS(IBREAK)=NRES+1 DO ICHAIN=1,IBREAK-1 DO JCHAIN=ICHAIN+1,IBREAK IF (IBREAKPOS(ICHAIN)-IBREAKPOS(ICHAIN-1)-1 .EQ. + IBREAKPOS(JCHAIN)-IBREAKPOS(JCHAIN-1)-1 ) THEN J=IBREAKPOS(JCHAIN-1) DO I=IBREAKPOS(ICHAIN-1)+1,IBREAKPOS(ICHAIN)-1 J=J+1 IF (CSEQ(I) .NE. CSEQ(J)) THEN LSAME(ICHAIN,JCHAIN)=.FALSE. LSAME(JCHAIN,ICHAIN)=.FALSE. GOTO 50 ENDIF ENDDO 50 CONTINUE ELSE LSAME(ICHAIN,JCHAIN)=.FALSE. LSAME(JCHAIN,ICHAIN)=.FALSE. ENDIF ENDDO ENDDO DO I=1,NCHAIN-1 IF ( LTAKE(I) ) THEN DO J=I+1,NCHAIN IF (LSAME(I,J)) THEN LTAKE(J)=.FALSE. ENDIF ENDDO ENDIF ENDDO LALL=.TRUE. DO I=1,NCHAIN IF (LALL) THEN IF ( .NOT. LTAKE(I))LALL=.FALSE. ENDIF ENDDO CTEMP=' ' CALL STRPOS(FILENAME,ISTART,ISTOP) CTEMP=FILENAME(ISTART:ISTOP)//'_!_' DO I=1,NCHAIN IF (LTAKE(I)) THEN CALL STRPOS(CTEMP,ISTART,ISTOP) IF (CHAINID(I-1) .NE. CHAINID(I)) THEN WRITE(CTEMP(ISTOP+1:),'(A,A)')CHAINID(I),',' ENDIF ENDIF ENDDO CALL STRPOS(CTEMP,ISTART,ISTOP) IF (CTEMP(ISTOP:ISTOP) .EQ. ',') THEN CTEMP(ISTOP:ISTOP)=' ' ENDIF OUTNAME=' ' C IN CASE OF "ARTIFICIAL" CHAIN-BREAKS THE END IS EMPTY CALL STRPOS(CTEMP,ISTART,ISTOP) IF (CTEMP(ISTOP-2:ISTOP) .EQ. '_!_') THEN CALL STRPOS(FILENAME,ISTART,ISTOP) OUTNAME(1:)=FILENAME(ISTART:ISTOP) ELSE WRITE(OUTNAME(1:),'(A)')CTEMP(ISTART:ISTOP) ENDIF WRITE(6,*)'select_unique: ',outname(1:60) RETURN END C END SELECT_UNIQUE_CHAIN c$$$ C SUB SELECT_UNIQUE_CHAIN c$$$ subroutine select_unique_chain(kfile,filename,outname) c$$$C selects unique chains from dssp file, and builds up a new filename of the c$$$C form: $pdb:4hhb.dssp_!_A,B c$$$ c$$$ implicit none c$$$ c$$$ integer kfile c$$$cx character*80 filename,outname c$$$ character*(*) filename,outname c$$$C internal c$$$ integer MAXSQ c$$$ parameter (MAXSQ=4500) c$$$ integer nres,lacc(MAXSQ),iop,ntrans,kchain,nchain c$$$ integer ipdbno(MAXSQ) c$$$ character*6 cresid(MAXSQ) c$$$ character cseq(MAXSQ),struc(MAXSQ) c$$$ character*80 compound c$$$ character*12 ACCESSION,pdbref c$$$ character trans*26,cchain c$$$ character chains*26 c$$$ logical ldssp c$$$ c$$$ integer maxchain c$$$ parameter (maxchain=30) c$$$ integer ibreak,ibreakpos(0:maxchain),i,j,ichain,jchain c$$$ integer istart,istop c$$$ character chainid(0:maxchain) c$$$ character ctemp*100 c$$$ logical lall,lsame(maxchain,maxchain),ltake(maxchain) c$$$ logical ltruncated,lerror c$$$C init c$$$ ntrans=26 c$$$ TRANS='VLIMFWYGAPSTCHRKQENDBZX!-.' c$$$ iop=0 c$$$ ibreak=0 c$$$ ibreakpos(0)=0 c$$$ chainid(0)='?' c$$$ do i=1,maxchain c$$$ ibreakpos(i)=0 c$$$ chainid(i)=' ' c$$$ ltake(i)=.true. c$$$ do j=1,maxchain c$$$ lsame(i,j)=.true. c$$$ enddo c$$$ enddo c$$$C all chains wanted from DSSP data set c$$$ kchain=0 c$$$ chains=' ' c$$$ c$$$ c$$$ c$$$c call getseq(kfile,MAXSQ,nres,cresid,cseq,struc, c$$$c + lacc,ldssp,filename,compound,ACCESSION,pdbref,iop,trans, c$$$c + ntrans,kchain,nchain,cchain) c$$$ c$$$ if (nchain .eq. 1)return c$$$ do i=1,nres c$$$ if (cseq(i) .eq. '!') then c$$$ ibreak=ibreak+1 c$$$ ibreakpos(ibreak)=i c$$$ endif c$$$ enddo c$$$ ibreak=ibreak+1 c$$$ chainid(1)=cresid(1)(6:6) c$$$ do i=1,ibreak c$$$ chainid(i+1)=cresid(ibreakpos(i)+1)(6:6) c$$$ enddo c$$$ ibreakpos(ibreak)=nres+1 c$$$ do ichain=1,ibreak-1 c$$$ do jchain=ichain+1,ibreak c$$$ if (ibreakpos(ichain)-ibreakpos(ichain-1)-1 .eq. c$$$ + ibreakpos(jchain)-ibreakpos(jchain-1)-1 ) then c$$$ j=ibreakpos(jchain-1) c$$$ do i=ibreakpos(ichain-1)+1,ibreakpos(ichain)-1 c$$$ j=j+1 c$$$ if (cseq(i) .ne. cseq(j)) then c$$$ lsame(ichain,jchain)=.false. c$$$ lsame(jchain,ichain)=.false. c$$$ GOTO 50 c$$$ endif c$$$ enddo c$$$50 continue c$$$ else c$$$ lsame(ichain,jchain)=.false. c$$$ lsame(jchain,ichain)=.false. c$$$ endif c$$$ enddo c$$$ enddo c$$$ c$$$ do i=1,nchain-1 c$$$ if ( ltake(i) ) then c$$$ do j=i+1,nchain c$$$ if (lsame(i,j)) then c$$$ ltake(j)=.false. c$$$ endif c$$$ ENDDO c$$$ endif c$$$ enddo c$$$ c$$$ lall=.true. c$$$ do i=1,nchain c$$$ if (lall) then c$$$ if ( .not. ltake(i))lall=.false. c$$$ endif c$$$ enddo c$$$ c$$$ ctemp=' ' c$$$ call strpos(filename,istart,istop) c$$$ ctemp=filename(istart:istop)//'_!_' c$$$ do i=1,nchain c$$$ if (ltake(i)) then c$$$ call strpos(ctemp,istart,istop) c$$$ if (chainid(i-1) .ne. chainid(i)) then c$$$ write(ctemp(istop+1:),'(a,a)')chainid(i),',' c$$$ endif c$$$ endif c$$$ enddo c$$$ call strpos(ctemp,istart,istop) c$$$ if (ctemp(istop:istop) .eq. ',') then c$$$ ctemp(istop:istop)=' ' c$$$ endif c$$$ outname=' ' c$$$ write(outname(1:),'(a)')ctemp(istart:istop) c$$$ WRITE(6,*)outname c$$$ c$$$ return c$$$ end c$$$ C END SELECT_UNIQUE_CHAIN C...................................................................... C...................................................................... C SUB SEND_DATA_TO_NODE C send start signal and all data to workers C they have to wait until they received all information SUBROUTINE SEND_DATA_TO_NODE IMPLICIT NONE C import INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C internal C ISIZE is dummy variable; otherwise we have to pass a parameter , which C gets defined as a variable in the subroutines (not clear what happens) INTEGER ILINK cc integer iworker,iset,ilink,isize C C link=-1 means send to everybody C c if (mp_model .eq. 'PARIX') then c msgtype=idtop c if (lsmall_machine) then c do iworker=1,nworker c ilink= link(iworker) c call send_maxhom_data(ilink) c enddo c else c do iset=1,nworkset c ilink=sender_node(iset) c call send_maxhom_data(ilink) c ilink=receiver_node(iset) c isize=len(corepath) c call mp_put_string(msgtype,ilink,corepath,isize) c isize=len(corefile) c call mp_put_string(msgtype,ilink,corefile,isize) c enddo c endif c else if (mp_model .eq. 'DELTA') then c call mp_init_send() ; ilink=-1 c call send_maxhom_data(ilink) c else if (mp_model .eq. 'PVM3') then CALL MP_INIT_SEND() ILINK=-1 CALL SEND_MAXHOM_DATA(ILINK) c else if (mp_model .eq. 'PVM') then c call mp_init_send() ; ilink=-1 c call send_maxhom_data(ilink) c endif WRITE(6,*)' send init data finished' CALL FLUSH_UNIT(6) RETURN END C END SEND_DATA_TO_NODE C...................................................................... C...................................................................... C SUB SEND_MAXHOM_DATA SUBROUTINE SEND_MAXHOM_DATA(ILINK) C import INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' INTEGER ILINK C internal CHARACTER CPARSYTEC_BUG*(MAXSQ) INTEGER ISIZE,I INTEGER ILBACKWARD,ILINSERT_2,ILISTOFSEQ_2,ILSHOW_SAMESEQ, + ILSWISSBASE,ILDSSP_1,ILCONSERV_1,ILCONSERV_2, + ILCONSIMPORT,ILALL,ILFORMULA,ILTHRESHOLD, + ILCOMPSTR,ILPASS2,ILTRACE,ILONG_OUT,ILBATCH, + I3WAY,I3WAYDONE,IWARM_START,IBINARY c integer ilmixed_arch C init logicals C NOTE: LOGICALS are sent in an integer variable C on some machines its not clear what happens if one snets C logicals as integers ILBACKWARD=0 ILINSERT_2=0 ILISTOFSEQ_2=0 ILSHOW_SAMESEQ=0 ILSWISSBASE=0 ILDSSP_1=0 ILCONSERV_1=0 ILCONSERV_2=0 ILCONSIMPORT=0 ILALL=0 ILFORMULA=0 ILTHRESHOLD=0 ILCOMPSTR=0 ILPASS2=0 ILTRACE=0 ILONG_OUT=0 ILBATCH=0 I3WAY=0 I3WAYDONE=0 IWARM_START=0 IBINARY=0 c ilmixed_arch=0 MSGTYPE=1 CALL MP_PUT_INT4(MSGTYPE,ILINK,ID_HOST,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,N1,N_ONE) IF (N1 .GT. 0) THEN ISIZE=N1 CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,LSQ_1,ISIZE) CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,LSTRUC_1,ISIZE) CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,LSTRCLASS_1,ISIZE) CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,LACC_1,ISIZE) ISIZE=MAXBREAK CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,IBREAKPOS_1,ISIZE) CALL MP_PUT_INT4(MSGTYPE,ILINK,NBREAK_1,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,NBEST,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,IPROFBEG,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,IPROFEND,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,PROFILEMODE,N_ONE) ISIZE=NASCII CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,TRANSPOS,ISIZE) ISIZE=MAXCUTOFFSTEPS CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,ISOLEN,ISIZE) CALL MP_PUT_INT4(MSGTYPE,ILINK,NSTEP,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ISAFE,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,NSTRSTATES_1,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,NSTRSTATES_2,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,NIOSTATES_1,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,NIOSTATES_2,N_ONE) ISIZE=N1 CALL MP_PUT_INT4_ARRAY(MSGTYPE,ILINK,PDBNO_1,ISIZE) ISIZE=MAXCUTOFFSTEPS CALL MP_PUT_REAL4_ARRAY(MSGTYPE,ILINK,ISOIDE,ISIZE) ISIZE=N1 CALL MP_PUT_REAL4_ARRAY(MSGTYPE,ILINK,GAPOPEN_1,ISIZE) CALL MP_PUT_REAL4_ARRAY(MSGTYPE,ILINK,GAPELONG_1,ISIZE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,OPEN_1,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,ELONG_1,N_ONE) CALL MP_PUT_REAL4_ARRAY(MSGTYPE,ILINK,CONSWEIGHT_1,ISIZE) ISIZE=MAXSQ*NTRANS CALL MP_PUT_REAL4_ARRAY(MSGTYPE,ILINK,SIMMETRIC_1,ISIZE) IF (PROFILEMODE .EQ. 6) THEN ISIZE= NTRANS * NTRANS * MAXSTRSTATES * MAXIOSTATES * + MAXSTRSTATES*MAXIOSTATES CALL MP_PUT_REAL4_ARRAY(MSGTYPE,ILINK,SIMORG,ISIZE) ENDIF CALL MP_PUT_REAL4(MSGTYPE,ILINK,FILTER_VAL,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,PUNISH,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,CUTVALUE1,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,CUTVALUE2,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,SMIN,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,SMAX,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,MAPLOW,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,ILINK,MAPHIGH,N_ONE) ISIZE=MAXSTRSTATES*MAXIOSTATES CALL MP_PUT_REAL4_ARRAY(MSGTYPE,ILINK,IORANGE,ISIZE) ISIZE=LEN(MP_MODEL) CALL MP_PUT_STRING(MSGTYPE,ILINK,MP_MODEL,ISIZE) ISIZE=LEN(SPLIT_DB_PATH) CALL MP_PUT_STRING(MSGTYPE,ILINK,SPLIT_DB_PATH,ISIZE) ISIZE=LEN(SPLIT_DB_DATA) CALL MP_PUT_STRING(MSGTYPE,ILINK,SPLIT_DB_DATA,ISIZE) ISIZE=LEN(SWISSPROT_SEQ) CALL MP_PUT_STRING(MSGTYPE,ILINK,SWISSPROT_SEQ,ISIZE) ISIZE=LEN(LISTFILE_2) CALL MP_PUT_STRING(MSGTYPE,ILINK,LISTFILE_2,ISIZE) ISIZE=LEN(CSQ_1) CALL MP_PUT_STRING(MSGTYPE,ILINK,CSQ_1,ISIZE) C Parsytec bug c isize=MAXSQ c call mp_put_string_array(msgtype,ilink,struc_1,isize) c call mp_put_string_array(msgtype,ilink,chainid_1,isize) ISIZE=N1 DO I=1,N1 CPARSYTEC_BUG(I:I)=STRUC_1(I) ENDDO CALL MP_PUT_STRING(MSGTYPE,ILINK,CPARSYTEC_BUG,ISIZE) DO I=1,N1 CPARSYTEC_BUG(I:I)=CHAINID_1(I) ENDDO CALL MP_PUT_STRING(MSGTYPE,ILINK,CPARSYTEC_BUG,ISIZE) ISIZE=LEN(OPENWEIGHT_ANSWER) CALL MP_PUT_STRING(MSGTYPE,ILINK,OPENWEIGHT_ANSWER,ISIZE) ISIZE=LEN(ELONGWEIGHT_ANSWER) CALL MP_PUT_STRING(MSGTYPE,ILINK,ELONGWEIGHT_ANSWER,ISIZE) ISIZE=LEN(SMIN_ANSWER) CALL MP_PUT_STRING(MSGTYPE,ILINK,SMIN_ANSWER,ISIZE) ISIZE=LEN(NAME_1) CALL MP_PUT_STRING(MSGTYPE,ILINK,NAME_1,ISIZE) ISIZE=LEN(HSSPID_2) CALL MP_PUT_STRING(MSGTYPE,ILINK,HSSPID_2,ISIZE) ISIZE=LEN(CSORTMODE) CALL MP_PUT_STRING(MSGTYPE,ILINK,CSORTMODE,ISIZE) ISIZE=LEN(METRICFILE) CALL MP_PUT_STRING(MSGTYPE,ILINK,METRICFILE,ISIZE) ISIZE=LEN(CURRENT_DIR) CALL MP_PUT_STRING(MSGTYPE,ILINK,CURRENT_DIR,ISIZE) ISIZE=LEN(DSSP_PATH) CALL MP_PUT_STRING(MSGTYPE,ILINK,DSSP_PATH,ISIZE) ISIZE=LEN(PDBPATH) CALL MP_PUT_STRING(MSGTYPE,ILINK,PDBPATH,ISIZE) ISIZE=LEN(PLOTFILE) CALL MP_PUT_STRING(MSGTYPE,ILINK,PLOTFILE,ISIZE) ISIZE=LEN(COREPATH) CALL MP_PUT_STRING(MSGTYPE,ILINK,COREPATH,ISIZE) ISIZE=LEN(COREFILE) CALL MP_PUT_STRING(MSGTYPE,ILINK,COREFILE,ISIZE) ISIZE=LEN(TRANS) CALL MP_PUT_STRING(MSGTYPE,ILINK,TRANS,ISIZE) ISIZE=LEN(STRTRANS) CALL MP_PUT_STRING(MSGTYPE,ILINK,STRTRANS,ISIZE) ISIZE=LEN(CSTRSTATES) CALL MP_PUT_STRING(MSGTYPE,ILINK,CSTRSTATES,ISIZE) ISIZE=LEN(CIOSTATES) CALL MP_PUT_STRING(MSGTYPE,ILINK,CIOSTATES,ISIZE) DO I=1,MAXSTRSTATES ISIZE=LEN(STR_CLASSES(I)) CALL MP_PUT_STRING(MSGTYPE,ILINK,STR_CLASSES(I),ISIZE) ENDDO c if ( lmixed_arch ) ilmixed_arch=1 IF ( LBACKWARD ) ILBACKWARD=1 IF ( LINSERT_2 ) ILINSERT_2=1 IF ( LISTOFSEQ_2 ) ILISTOFSEQ_2=1 IF ( LSHOW_SAMESEQ ) ILSHOW_SAMESEQ=1 IF ( LSWISSBASE ) ILSWISSBASE=1 IF ( LDSSP_1 ) ILDSSP_1=1 IF ( LCONSERV_1 ) ILCONSERV_1=1 IF ( LCONSERV_2 ) ILCONSERV_2=1 IF ( LCONSIMPORT ) ILCONSIMPORT=1 IF ( LALL ) ILALL=1 IF ( LFORMULA ) ILFORMULA=1 IF ( LTHRESHOLD ) ILTHRESHOLD=1 IF ( LCOMPSTR ) ILCOMPSTR=1 IF ( LPASS2 ) ILPASS2=1 IF ( LTRACE ) ILTRACE=1 IF ( LONG_OUT ) ILONG_OUT=1 IF ( LBATCH ) ILBATCH=1 IF ( L3WAY ) I3WAY=1 IF ( L3WAYDONE ) I3WAYDONE=1 IF ( LWARM_START ) IWARM_START=1 IF ( LBINARY ) IBINARY=1 C CALL MP_PUT_INT4(MSGTYPE,ILINK,ILMIXED_ARCH,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILBACKWARD,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILINSERT_2,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILISTOFSEQ_2,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILSHOW_SAMESEQ,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILSWISSBASE,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILDSSP_1,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILCONSERV_1,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILCONSERV_2,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILCONSIMPORT,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILALL,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILFORMULA,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILTHRESHOLD,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILCOMPSTR,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILPASS2,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILTRACE,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILONG_OUT,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,ILBATCH,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,I3WAY,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,I3WAYDONE,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,IWARM_START,N_ONE) CALL MP_PUT_INT4(MSGTYPE,ILINK,IBINARY,N_ONE) ENDIF c if (mp_model .ne. 'PARIX') then c do ilink=1,nworker c WRITE(6,*)' send data to: ',link(ilink) c call mp_send_data(msgtype,link(ilink)) c enddo CALL MP_CAST(NWORKER,MSGTYPE,LINK(1)) c endif RETURN END C END SEND_MAXHOM_DATA C...................................................................... C...................................................................... C SUB SEQ_TO_INTEGER SUBROUTINE SEQ_TO_INTEGER(SEQ,LSEQ,NRES,TRANSPOS) C converts string of amino acid characters to amino acid integers. C uses integer table TRANSPOS C DOES NOT: internally converts DSSP SS bridges to 'C' before converting to C integer. Call "lower_to_cys" before calling this routine C input may contain funnies like '!' C output will be according to transpos IMPLICIT NONE C import CHARACTER*(*) SEQ INTEGER NRES INTEGER TRANSPOS(*) C export INTEGER LSEQ(*) C internal INTEGER I LOGICAL NOILLEGAL C NOILLEGAL=.TRUE. DO I=1,NRES LSEQ(I)=TRANSPOS ( ICHAR(SEQ(I:I)) ) IF (LSEQ(I) .LE. 0) THEN IF (NOILLEGAL) THEN NOILLEGAL=.FALSE. WRITE(6,'(A,I10,A,A,A1)')'*** ERROR SEQ_TO_INTEGER: '// + 'unknown res or chain separator I=',I, + ' =',SEQ(I:I),'|' ENDIF ENDIF ENDDO RETURN END C END SEQ_TO_INTEGER C...................................................................... C...................................................................... C SUB SETBACK SUBROUTINE SETBACK(N1BEG,N1END,N2BEG,N2END,N2,LH1,LH2, + BESTVAL_CHECK) C----------------------------------------------------------------------- C reverse SETMATRIX (see comments there also) C here the matrix is processed in the backward direction C the best path value is stored in a temporary array MAX_ALL(), C NO traceback is stored (this is done in SETMATRIX) C the original matrix values are overwritten by the sum of the forward C and backward path value C this allows the computation of all pairs of residues i,j that C CAN BE PART of an optimal and suboptimal alignments. C NOTE: optimal value forward = optimal value backward C LH_F(i-1,j-1) + sim_val(i,j) + LH_B(i+1,j+1) = LH_FB C LH_FB is the score of an optimal alignment of sequence A and B C which is constrained to align residue i with residue j C All matrix values for THE optimal path have the same value after C this routine. The matrix values can be displayed as a 2-D or 3-D C graph showing how reliable the alignment is. C in contrast to Zuker its done in the same memory C see: Zuker M., Suboptimal sequence alignment in molecular biology C Alignment with error analysis C J.Mol.Biol. (1991) 221, 403-429 C C 1,1 C \ C \ C LH_B(i+1,j+1)= best value from backward path up to i,j C C \ LH_FB = LH_F + LH_B + sim_val(i,j) C optimal path value trough i,j C C LH_F(i-1,j-1)= best value from forward path up to i,j C \ C \ C \ C N1,N2 C C===================================================================== C====================================================================== IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C import C DIMENSIONS AND ACTUAL SEQ LENGTH INTEGER N1BEG,N1END,N2BEG,N2END,N2 REAL BESTVAL_CHECK C IMPORT/EXPORT REAL LH1(0:N1+1,0:N2+1) INTEGER*2 LH2(0:N1+1,0:N2+1) C REAL LH(0:N1+1,0:N2+1,2) C INTERNAL INTEGER NSIZE1,NSIZE2 REAL SUM REAL BESTVAL C INTEGER BESTII,BESTJJ INTEGER I,J,II,JJ,IBEG,IEND,IIBEG,JJBEG,K INTEGER NDIAGONAL,LEN_DIAG,IDIAG,ISMALL_DIM,IBIG_DIM LOGICAL LERROR CHARACTER CTEMP*50 c======================================================================= c initialize c======================================================================= c WRITE(6,*)' setback: ',profilemode II=0 NSIZE1=N1END-N1BEG+1 NSIZE2=N2END-N2BEG+1 K=MAX(N1+1,N2+1) DO I=0,K C DO I=0,MAXSQ+1 MAX_H(I)=0.0 MAX_V(I)=0.0 RIGHT_LH(I)=0.0 DOWN_LH(I)=0.0 DIAG_LH(I)=0.0 MAX_ALL(I)=0.0 ENDDO BESTVAL=0.0 C BESTII=0 ; BESTJJ=0 C====================================================================== NDIAGONAL=NSIZE1+NSIZE2-1 C NDIAGONAL=IPROFEND-IPROFBEG+1+N2-1 ISMALL_DIM=MIN(NSIZE1,NSIZE2) IBIG_DIM=MAX(NSIZE1,NSIZE2) IIBEG=N1END-1 JJBEG=N2END LEN_DIAG=0 DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .GT. NSIZE2) THEN IIBEG=IIBEG-1 ELSE JJBEG=JJBEG-1 ENDIF JJ=JJBEG+IIBEG C===================================================================== C PROFILE 1 (NO PROFILES OR PROFILE FOR FIRST SEQUENCE) C-------------------------------------------------------------------- IF (PROFILEMODE .LE. 1) THEN DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 c===================================================================== c store best value for horizontal deletion c===================================================================== IF ( ( (MAX_H(JJ-II) - ELONG_GAP_1(II+1)) .GE. + (RIGHT_LH(JJ-II)-OPEN_GAP_1(II+1)) ) .AND. + ( (MAX_H(JJ-II) - ELONG_GAP_1(II+1)) .GT. 0.0 ) ) THEN MAX_H(JJ-II)= (MAX_H(JJ-II) - ELONG_GAP_1(II+1)) ELSE IF (( (RIGHT_LH(JJ-II)-OPEN_GAP_1(II+1)) .GE. + (MAX_H(JJ-II) - ELONG_GAP_1(II+1)) ) .AND. + ( (RIGHT_LH(JJ-II)-OPEN_GAP_1(II+1)) .GT. 0.0)) THEN MAX_H(JJ-II)= (RIGHT_LH(JJ-II) - OPEN_GAP_1(II+1)) ELSE MAX_H(JJ-II)= 0.0 ENDIF c===================================================================== c store best value for vertical deletion c===================================================================== IF ( ( (MAX_V(II) - ELONG_GAP_1(II+1)) .GE. + (DOWN_LH(II) - OPEN_GAP_1(II+1)) ) .AND. + ( (MAX_V(II) - ELONG_GAP_1(II+1)) .GT. 0.0 ) ) THEN MAX_V(II)=(MAX_V(II) - ELONG_GAP_1(II+1)) ELSE IF ( ( (DOWN_LH(II) - OPEN_GAP_1(II+1)) .GE. + (MAX_V(II) - ELONG_GAP_1(II+1)) ) .AND. + ((DOWN_LH(II) - OPEN_GAP_1(II+1)) .GT. 0.0)) THEN MAX_V(II)= (DOWN_LH(II) - OPEN_GAP_1(II+1)) ELSE MAX_V(II)= 0.0 ENDIF c====================================================================== c which value is the best (diagonal,horizontal or vertical) C====================================================================== MAX_D(II)= DIAG_LH(JJ-II)+METRIC_1(II+1,LSQ_2(JJ-II+1)) MAX_ALL(II)=MAX(MAX_D(II),MAX_V(II),MAX_H(JJ-II),0.0) C set matrix value to forward path + backward path + sim_val LH1(II+1,JJ-II+1)= LH1(II+1,JJ-II+1) + MAX_D(II) c if ( lh1(ii+1,jj-ii+1) .ge. subopt_val) then c lh2(ii+1,jj-ii+1)= -1 * lh2(ii+1,jj-ii+1) c endif IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN LH1(II,II)= 0.0 C LH2(II,II)= 0 MAX_ALL(II)=0.0 ENDIF DIAG_LH(JJ-II)=DOWN_LH(II) RIGHT_LH(JJ-II)=MAX_ALL(II) DOWN_LH(II)=MAX_ALL(II) IF (BESTVAL .LT. MAX_ALL(II) ) THEN BESTVAL=MAX_ALL(II) C BESTII=II ; BESTJJ=JJ-II ENDIF ENDDO C-------------------------------------------------------------------- C profile 2 (profile for sequence 2) C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 2) THEN DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 IF ( (MAX_H(JJ-II) - ELONG_GAP_2(JJ-II+1)) .GT. + (RIGHT_LH(JJ-II)-OPEN_GAP_1(II+1)) .AND. + (MAX_H(JJ-II) - ELONG_GAP_2(JJ-II+1)) .GT. 0.0 ) THEN MAX_H(JJ-II) = (MAX_H(JJ-II) - ELONG_GAP_2(JJ-II+1)) ELSE IF ( (RIGHT_LH(JJ-II)-OPEN_GAP_1(II+1)) .GE. + (MAX_H(JJ-II) - ELONG_GAP_2(JJ-II+1)) .AND. + (RIGHT_LH(JJ-II)-OPEN_GAP_1(II+1)) .GT. 0.0) THEN MAX_H(JJ-II) = (RIGHT_LH(JJ-II) - OPEN_GAP_1(II+1)) ELSE MAX_H(JJ-II) = 0.0 ENDIF IF ( (MAX_V(II) - ELONG_GAP_2(JJ-II+1)) .GT. + (DOWN_LH(II) - OPEN_GAP_1(II+1)) .AND. + (MAX_V(II) - ELONG_GAP_2(JJ-II+1)) .GT. 0.0 ) THEN MAX_V(II) = (MAX_V(II) - ELONG_GAP_2(JJ-II+1)) ELSE IF ( (DOWN_LH(II) - OPEN_GAP_1(II+1)) .GE. + (MAX_V(II) - ELONG_GAP_2(JJ-II+1)) .AND. + (DOWN_LH(II) - OPEN_GAP_1(II+1)) .GT. 0.0) THEN MAX_V(II)= (DOWN_LH(II) - OPEN_GAP_1(II+1)) ELSE MAX_V(II) = 0.0 ENDIF MAX_D(II)= DIAG_LH(JJ-II)+METRIC_2(JJ-II+1,LSQ_1(II+1)) MAX_ALL(II)=MAX(MAX_D(II),MAX_V(II),MAX_H(JJ-II),0.0) C set matrix value to forward path + backward path + sim_val LH1(II+1,JJ-II+1)= LH1(II+1,JJ-II+1) + DIAG_LH(JJ-II) + + METRIC_2(JJ-II+1,LSQ_1(II+1)) IF ( LH1(II+1,JJ-II+1) .GE. SUBOPT_VAL) THEN LH2(II+1,JJ-II+1)= -1 * LH2(II+1,JJ-II+1) ENDIF DIAG_LH(JJ-II) = DOWN_LH(II) RIGHT_LH(JJ-II)= MAX_ALL(II) DOWN_LH(II) = MAX_ALL(II) IF (BESTVAL .LT. MAX_ALL(II) ) THEN BESTVAL=MAX_ALL(II) C BESTII=II ; BESTJJ=JJ-II ENDIF ENDDO c-------------------------------------------------------------------- c full profile alignment C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 3) THEN DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II)= MAX_H(JJ-II)- + (( ELONG_GAP_1(II+1)+ ELONG_GAP_2(JJ-II+1))* 0.5) IF ( (RIGHT_LH(JJ-II)- + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) *0.5 )) + .GE. MAX_H(JJ-II) .AND. + (RIGHT_LH(JJ-II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) *0.5 )) + .GT. 0.0) THEN MAX_H(JJ-II) = (RIGHT_LH(JJ-II) - + ( (OPEN_GAP_1(II+1)+OPEN_GAP_2(JJ-II+1)) *0.5 )) ELSE IF ( MAX_H(JJ-II) .LE. 0.0) THEN MAX_H(JJ-II) = 0.0 ENDIF MAX_V(II)= MAX_V(II)- + ( (ELONG_GAP_1(II+1)+ ELONG_GAP_2(JJ-II+1))* 0.5) IF ( (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) * 0.5 )) + .GE. MAX_V(II) .AND. + (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) * 0.5 )) + .GT. 0.0) THEN MAX_V(II)= (DOWN_LH(II) - + ((OPEN_GAP_1(II+1)+OPEN_GAP_2(JJ-II+1)) * 0.5 )) ELSE IF ( MAX_V(II) .LE. 0.0) THEN MAX_V(II) = 0.0 ENDIF SUM=0.0 DO K=1,NTRANS SUM = SUM + ( METRIC_1(II+1,K) * METRIC_2(JJ-II+1,K) ) ENDDO C MAX_D(II) = DIAG_LH(JJ-II) + (SUM/NTRANS) MAX_D(II) = DIAG_LH(JJ-II) + SUM C SET MATRIX VALUE TO FORWARD PATH + BACKWARD PATH + SIM_VAL LH1(II+1,JJ-II+1)= LH1(II+1,JJ-II+1) + DIAG_LH(JJ-II) + + SUM IF ( LH1(II+1,JJ-II+1) .GE. SUBOPT_VAL) THEN LH2(II+1,JJ-II+1)= -1 * LH2(II+1,JJ-II+1) ENDIF MAX_ALL(II)=MAX(MAX_D(II),MAX_V(II),MAX_H(JJ-II),0.0) DIAG_LH(JJ-II) = DOWN_LH(II) RIGHT_LH(JJ-II)= MAX_ALL(II) DOWN_LH(II) = MAX_ALL(II) IF (BESTVAL .LT. MAX_ALL(II) ) THEN BESTVAL=MAX_ALL(II) C BESTII=II ; BESTJJ=JJ-II ENDIF ENDDO c-------------------------------------------------------------------- c take sequences as representatives of family c-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 4) THEN DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II)= MAX_H(JJ-II)- + ( (ELONG_GAP_1(II+1)+ELONG_GAP_2(JJ-II+1)) *0.5) IF ( (RIGHT_LH(JJ-II)- + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) *0.5 )) + .GE. MAX_H(JJ-II) .AND. + (RIGHT_LH(JJ-II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) *0.5 )) + .GT. 0.0) THEN MAX_H(JJ-II) = (RIGHT_LH(JJ-II) - + ( (OPEN_GAP_1(II+1)+OPEN_GAP_2(JJ-II+1)) *0.5 )) ELSE IF ( MAX_H(JJ-II) .LE. 0.0) THEN MAX_H(JJ-II) = 0.0 ENDIF MAX_V(II)= MAX_V(II)- + ( (ELONG_GAP_1(II+1)+ ELONG_GAP_2(JJ-II+1))* 0.5) IF ( (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) * 0.5 )) + .GE. MAX_V(II) .AND. + (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) * 0.5 )) + .GT. 0.0) THEN MAX_V(II)= (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1)+OPEN_GAP_2(JJ-II+1)) * 0.5 )) ELSE IF ( MAX_V(II) .LE. 0.0) THEN MAX_V(II) = 0.0 ENDIF MAX_D(II)= DIAG_LH(JJ-II) + + (( METRIC_1 (II+1,LSQ_2(JJ-II+1)) + + METRIC_2 (JJ-II+1,LSQ_1(II+1)) ) * 0.5) MAX_ALL(II)=MAX(MAX_D(II),MAX_V(II),MAX_H(JJ-II),0.0) C SET MATRIX VALUE TO FORWARD PATH + BACKWARD PATH + SIM_VAL LH1(II+1,JJ-II+1)= LH1(II+1,JJ-II+1) + DIAG_LH(JJ-II) + + (( METRIC_1 (II+1,LSQ_2(JJ-II+1)) + + METRIC_2 (JJ-II+1,LSQ_1(II+1)) ) * 0.5) IF ( LH1(II+1,JJ-II+1) .GE. SUBOPT_VAL) THEN LH2(II+1,JJ-II+1)= -1.0 * LH2(II+1,JJ-II+1) ENDIF DIAG_LH(JJ-II) = DOWN_LH(II) RIGHT_LH(JJ-II)= MAX_ALL(II) DOWN_LH(II) = MAX_ALL(II) IF (BESTVAL .LT. MAX_ALL(II) ) THEN BESTVAL=MAX_ALL(II) C BESTII=II ; BESTJJ=JJ-II ENDIF ENDDO C-------------------------------------------------------------------- C take maximal value as consensus C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 5) THEN DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II)=MAX_H(JJ-II) - + ((ELONG_GAP_1(II+1)+ELONG_GAP_2(JJ-II+1))*0.5) IF ( (RIGHT_LH(JJ-II)- + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) *0.5 )) + .GE. MAX_H(JJ-II) .AND. + (RIGHT_LH(JJ-II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) *0.5 )) + .GT. 0.0) THEN MAX_H(JJ-II) = (RIGHT_LH(JJ-II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) *0.5 )) ELSE IF ( MAX_H(JJ-II) .LE. 0.0) THEN MAX_H(JJ-II) = 0.0 ENDIF MAX_V(II)= MAX_V(II)- + ( (ELONG_GAP_1(II+1)+ ELONG_GAP_2(JJ-II+1))* 0.5) IF ( (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) * 0.5 )) + .GE. MAX_V(II) .AND. + (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) * 0.5 )) + .GT. 0.0) THEN MAX_V(II)= (DOWN_LH(II) - + ( (OPEN_GAP_1(II+1) + OPEN_GAP_2(JJ-II+1)) * 0.5 )) ELSE IF ( MAX_V(II) .LE. 0.0) THEN MAX_V(II) = 0.0 ENDIF MAX_D(II) = DIAG_LH(JJ-II) + + ((MAX_METRIC_1_VAL(II+1) + MAX_METRIC_2_VAL(II+1)) * 0.5) MAX_ALL(II)=MAX(MAX_D(II),MAX_V(II),MAX_H(JJ-II),0.0) C SET MATRIX VALUE TO FORWARD PATH + BACKWARD PATH + SIM_VAL LH1(II+1,JJ-II+1)= LH1(II+1,JJ-II+1) + DIAG_LH(JJ-II) + + ((MAX_METRIC_1_VAL(II+1) + MAX_METRIC_2_VAL(II+1)) * 0.5) IF ( LH1(II+1,JJ-II+1) .GE. SUBOPT_VAL) THEN LH2(II+1,JJ-II+1)= -1 * LH2(II+1,JJ-II+1) ENDIF DIAG_LH(JJ-II) = DOWN_LH(II) RIGHT_LH(JJ-II)= MAX_ALL(II) DOWN_LH(II) = MAX_ALL(II) IF (BESTVAL+0.0001 .LT. MAX_ALL(II) ) THEN BESTVAL=MAX_ALL(II) C BESTII=II ; BESTJJ=JJ-II ENDIF ENDDO C==================================================================== C PROFILE MODE SELECTION END ENDIF C======================================================================= IF (LSAMESEQ) THEN I=II IF (II .LE. 0)I=JJ-II LH1(I,I) = 0.0 RIGHT_LH(I)= 0.0 DOWN_LH(I) = 0.0 ENDIF C==================================================================== C next antidiagonal C==================================================================== ENDDO C==================================================================== c WRITE(6,*)' SETBACK: ',BESTVAL,BESTII,BESTJJ C write data for SciAn, XPrism3... IF (ABS(BESTVAL_CHECK - BESTVAL) .GT. 0.01) THEN WRITE(6,*)'*** FATAL ERROR in SETBACK' WRITE(6,*)' bestval_check .ne. bestval: ', + BESTVAL_CHECK,BESTVAL STOP ENDIF CTEMP=' ' WRITE(CTEMP,*) '(',N2,'(F7.2))' CALL STRPOS(CTEMP,IBEG,IEND) CALL OPEN_FILE(99,'matrix.dat','new,recl=20000',lerror) DO I=1,N1 WRITE(99,CTEMP(IBEG:IEND)) ( LH1(I,J),J=1,N2) ENDDO CLOSE(99) C==================================================================== RETURN END C END SETBACK C...................................................................... C...................................................................... C SUB SETMATRIX SUBROUTINE SETMATRIX(N1BEG,N1END,N2BEG,N2END,N2,LH1,LH2) C -------------------------------------------------------- C subroutine SETMATRIX finds LH matrix for maximum homologous C subsequence between any two sequences C generate the homology and traceback matrix C----------------------------------------------------------------------- C LH(.,.,1) is homology score C LH(.,.,2) is traceback value C encoding LDIREC and LDEL: DIREC + LDEL C LH(I,J,1) corresponds to seq postions II=I-1, JJ=J-1 C LH(1,.,1) and LH(.,1,1) are terminal margins C LDIREC 10000,20000,30000,40000 for termination,diagonal,vertical,horizontal C LDEL length of deletion C temporary values: C MAX_H(),MAX_V() best value for horizontal and verical deletions C LDEL_H,LDEL_V length of horizontal and vertical deletion C====================================================================== C JULY 1991 (RS) C MAXDEL restriction removed C see: O. Gotoh, An Improved Algorithm for Matching Biological C Sequences, JMB (1982) 162, 705-708 C----------------------------------------------------------------------- C JUNE 1991 (RS) C matrix setting in a antidiagonal way to run it in parallel C see: Jones R. et.al., Protein Sequence Comparison on the Connection C Machine CM-2, in: Computers and DNA, SFI Studies in the Sciences C of Complexity, Vol VII, Addison-Wesley, 1990 C====================================================================== C C ANTIDIAGONAL SETTING OF THE MATRIX C ================================== C N1,N2: length of sequence 1 and sequence 2 C ADVANTAGE: loop can run in parallel or vectorized C C C ICOUNT 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 C -----------------------------------------------------> C | sequence 1 ====> N1 C | 2345678901234567890123456789 <==IIBEG C | ------------------------------ C 1 | sequence2 | //// / /| C 2 | | 2 |//// / / | C 3 | | 3 |/// / / /| <== JJBEG C 4 | | 4 |// / / //| C 5 | v 5 |/ / / ///| C 6 | 6 | / / ////| C 7 | 7 | / / /////| C 8 | N2 8 |/ / //////| C 9 | -----------------------------| C 10 | C 11 | C 12 | C 13 | C V C C===================================================================== C at each position take the best value of: C C LH(i,j,1)= MAX( LH(i-1,j-1,1) + SIM(i,j) , MAX_H(j) ,MAX_V(i) ,0) C C LH(i-1,j-1,1) : best value of diagonal (no INDEL) C SIM(i,j) : similarity value for position i,j C MAX_H(j) : best value of horizontal INDELs C MAX_V(i) : best value of vertical INDELs C where: C MAX_H(i)=MAX( LH(I-1,J,1) - gap-open , MAX_H(i-1) - gap-elongation , 0) C MAX_V(j)=MAX( LH(I,J-1,1) - gap-open , MAX_V(j-1) - gap-elongation , 0) C NOTE: one has to store the length of the deletion for MAX_H() and MAX_V() C in LDEL_H(j) and LDEL_V() C C C NOTE: C 1) if no INDEL(s) in secondary structure allowed: C GAPOPEN contains PUNISH C 2) internal deletions are (postion dependent ) weighted as: C GAPOPEN + GAPELONG *LENGTH C 3) conservation weights: C gap penalties are dependent on sequence-position(s), so weight C gap-penalties with conservation-weights otherwise the gap penalties C in regions with low conservation are too big C 4) antidiagonal matrix setting: C position in sequence 2 is JJBEG+IIBEG-II: step back in sequence 1 and C down in sequence 2 C C 5) NOT LONGER VALID C if the MAXDEL option is set, one has to check if the number of C INDEL's exceeds the MAXDEL value. C In addition: when the value for opening a gap is higher than C for the elongation, we have to check if the previous length of C the gap is not greater than 0. C That means that for some special cases it's cheaper to punish C the alignment by some open-penalties in a row than to elongate C or continue the alignment in the diagonal. C open a gap if: C 1.) OPEN .gt. ELONG or C 2.) LDELx()+1 .ge. MAXDEL C 3.) but only if LDELx() .eq. 0 C====================================================================== IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C import C ACTUAL SEQ LENGTH INTEGER N1BEG,N1END,N2BEG,N2END,N2 c export REAL LH1(0:N1+1,0:N2+1) INTEGER*2 LH2(0:N1+1,0:N2+1) c real lh(0:n1+1,0:n2+1,2) c internal INTEGER NSIZE1,NSIZE2 REAL SUM,XMAX1,XMAX2 INTEGER I,J,K,NDAMP,NDIAGONAL,ISMALL_DIM,IBIG_DIM,IIBEG,JJBEG INTEGER LEN_DIAG,IDIAG,II,JJ C======================================================================= C DO SOME STUFF OUTSIDE THE LOOPS: C======================================================================= C initialize C======================================================================= NSIZE1=N1END-N1BEG+1 NSIZE2=N2END-N2BEG+1 DO I=N1BEG-1,N1END+1 LH1(I,N2BEG-1)=0.0 LH1(I,N2BEG)=0.0 ENDDO DO J=N2BEG-1,N2END+1 LH1(N1BEG-1,J)=0.0 LH1(N1BEG,J)=0.0 ENDDO C DO I=0,N1+1 LH(I,0,1)=0.0 ; LH(I,1,1)=0.0 ; ENDDO C DO J=0,N2+1 ; LH(0,J,1)=0.0 ; LH(1,J,1)=0.0 ; ENDDO J=MIN(N1BEG-1,N2BEG-1) K=MAX(N1END+1,N2END+1) DO I=J,K C DO I=0,MAXSQ+1 MAX_H(I)=0.0 MAX_V(I)=0.0 LDEL_H(I)=0 LDEL_V(I)=0 LEFT_LH(I)=0.0 UP_LH(I)=0.0 DIAG_LH(I)=0.0 ENDDO C======================================================================= C update the metric values (weights) C this can be done outside the main parallel loop C with this we save at lot of multiplications in the parallel loop C the update can be done in concurrent/vectorized mode C======================================================================= NDAMP=1 IF (PROFILEMODE .EQ. 6) THEN IF (LCONSERV_1) THEN DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1(I) * CONSWEIGHT_1(I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) * CONSWEIGHT_1(I) ENDDO C DAMP PENALTIES CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ELSE DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1 (I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) ENDDO CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ENDIF IF (LCONSERV_2) THEN DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) * CONSWEIGHT_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) * CONSWEIGHT_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ELSE DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ENDIF C============================= ELSE IF (PROFILEMODE .NE. 2) THEN IF (LCONSERV_1) THEN DO K=1,NTRANS DO I=N1BEG,N1END METRIC_1(I,K) = SIMMETRIC_1(I,K) * CONSWEIGHT_1(I) ENDDO ENDDO DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1(I) * CONSWEIGHT_1(I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) * CONSWEIGHT_1(I) ENDDO c damp penalties CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ELSE DO K=1,NTRANS DO I=N1BEG,N1END METRIC_1(I,K) = SIMMETRIC_1(I,K) ENDDO ENDDO DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1 (I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) ENDDO CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ENDIF ENDIF IF (PROFILEMODE .GE. 2) THEN IF (LCONSERV_2) THEN DO K=1,NTRANS DO I=N2BEG,N2END METRIC_2(I,K) = SIMMETRIC_2(I,K) * CONSWEIGHT_2(I) ENDDO ENDDO DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) * CONSWEIGHT_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) * CONSWEIGHT_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ELSE DO K=1,NTRANS DO I=N2BEG,N2END METRIC_2(I,K) = SIMMETRIC_2(I,K) ENDDO ENDDO DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ENDIF ENDIF IF (PROFILEMODE .EQ. 5) THEN DO I=N1BEG,N1END MAX_METRIC_1_VAL(I)=-10000.0 DO K=1,NTRANS MAX_METRIC_1_VAL(I)= + MAX(METRIC_1(I,K),MAX_METRIC_1_VAL(I)) ENDDO ENDDO DO J=N2BEG,N2END MAX_METRIC_2_VAL(J)=-10000.0 DO K=1,NTRANS MAX_METRIC_2_VAL(J)= + MAX(METRIC_2(J,K),MAX_METRIC_2_VAL(J)) ENDDO ENDDO ENDIF IF ( PROFILEMODE .EQ. 3 ) THEN DO I=N1BEG,N1END SUM=0.0 DO K=1,NTRANS SUM= SUM + ( METRIC_1(I,K) * METRIC_1(I,K) ) ENDDO SUM= SQRT(SUM) DO K=1,NTRANS METRIC_1(I,K)= METRIC_1(I,K) / SUM ENDDO ENDDO DO I=N2BEG,N2END SUM=0.0 DO K=1,NTRANS SUM= SUM + ( METRIC_2(I,K) * METRIC_2(I,K) ) ENDDO SUM= SQRT(SUM) DO K=1,NTRANS METRIC_2(I,K)= METRIC_2(I,K) / SUM ENDDO ENDDO ENDIF c====================================================================== NDIAGONAL=NSIZE1+NSIZE2-1 c ndiagonal=iprofend-iprofbeg+1+n2-1 c WRITE(6,'(A,I6)')' NUMBER OF ANTIDIAGONALS: ',NDIAGONAL ISMALL_DIM=MIN(NSIZE1,NSIZE2) IBIG_DIM=MAX(NSIZE1,NSIZE2) IIBEG=N1BEG JJBEG=N2BEG+1 LEN_DIAG=0 C===================================================================== C profile 1 (no profiles or profile for first sequence) C-------------------------------------------------------------------- IF (PROFILEMODE .LE. 1) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG C==================================================================== C THIS LOOP CAN BE EXECUTED IN VECTOR-MODE C====================================================================== C compiler directives for vector C---------------------------------------------------------------------- DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 C====================================================================== C values for diagonal, horizontal and vertical (open and elongation) C===================================================================== C store best value and length for horizontal deletion C===================================================================== MAX_H(JJ-II) = (MAX_H(JJ-II) - ELONG_GAP_1(II-1)) IF ((MAX_H(JJ-II) .GE.(LEFT_LH(JJ-II)-OPEN_GAP_1(II-1))) + .AND. (MAX_H(JJ-II) .GT.0.0 )) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF (((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)) .GE. + MAX_H(JJ-II)) + .AND. ((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)) .GT. + 0.0)) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II)= (LEFT_LH(JJ-II) - OPEN_GAP_1(II-1)) ELSE MAX_H(JJ-II)= 0.0 LDEL_H(JJ-II)= 0 ENDIF c===================================================================== c store best value and length for vertical deletion c===================================================================== MAX_V(II) = (MAX_V(II) - ELONG_GAP_1(II-1)) IF ((MAX_V(II).GE.(UP_LH(II) - OPEN_GAP_1(II-1))) .AND. + ( MAX_V(II) .GT. 0.0) ) THEN LDEL_V(II)= LDEL_V(II) + 1 ELSE IF (((UP_LH(II) - OPEN_GAP_1(II-1)).GE.MAX_V(II)) + .AND.((UP_LH(II)- OPEN_GAP_1(II-1)).GT. + 0.0)) THEN MAX_V(II)= (UP_LH(II) - OPEN_GAP_1(II-1)) LDEL_V(II)=1 ELSE MAX_V(II)= 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II)= DIAG_LH(JJ-II)+METRIC_1(II-1,LSQ_2(JJ-II-1)) IF ( (MAX_D(II) .GE. MAX_V(II) ) .AND. + (MAX_D(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_D(II) .GT. 0.0 )) THEN LH1(II,JJ-II)= MAX_D(II) ELSE IF ( (MAX_V(II) .GE. MAX_D(II) ) .AND. + (MAX_V(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_V(II) .GT. 0.0 )) THEN LH1(II,JJ-II)= MAX_V(II) LH2(II,JJ-II)= 10000 + LDEL_V(II) ELSE IF ( (MAX_H(JJ-II) .GE. MAX_D(II)) .AND. + (MAX_H(JJ-II) .GE. MAX_V(II)) .AND. + (MAX_H(JJ-II) .GT. 0.0 )) THEN LH1(II,JJ-II)= MAX_H(JJ-II) LH2(II,JJ-II)= 20000 + LDEL_H(JJ-II) ELSE LH1(II,JJ-II)= 0.0 LH2(II,JJ-II)= 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN LH1(II,II)= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= LH1(II,JJ-II) UP_LH(II)= LH1(II,JJ-II) ENDDO c if (lsameseq) then c x= ( float(iibeg)/ 2.0) + (float(jjbeg)/2.0) c i=nint(x) c lh1(i,i) = 0.0 ; lh2(i,i)= 0 c left_lh(i)= 0.0 c up_lh(i) = 0.0 c WRITE(6,*)iibeg,jjbeg,i c endif C==================================================================== C next antidiagonal C==================================================================== ENDDO C-------------------------------------------------------------------- C profile 2 (profile for sequence 2) C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 2) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II) = (MAX_H(JJ-II) - ELONG_GAP_2(JJ-II-1)) IF (MAX_H(JJ-II) .GT. + (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)) .AND. + MAX_H(JJ-II) .GT.0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ( (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)) .GE. + MAX_H(JJ-II) .AND. + (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)).GT. + 0.0) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF c===================================================================== c store best value and length for vertical deletion c===================================================================== MAX_V(II) = (MAX_V(II) - ELONG_GAP_2(JJ-II-1)) IF (MAX_V(II).GT.(UP_LH(II) - OPEN_GAP_2(JJ-II-1)).AND. + MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ((UP_LH(II)-OPEN_GAP_2(JJ-II-1)) .GE. MAX_V(II) + .AND.(UP_LH(II)-OPEN_GAP_2(JJ-II-1)).GT.0.0) THEN MAX_V(II)= (UP_LH(II) - OPEN_GAP_2(JJ-II-1)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF c====================================================================== c which value is the best (diagonal,horizontal or vertical) c====================================================================== MAX_D(II)= DIAG_LH(JJ-II)+METRIC_2(JJ-II-1,LSQ_1(II-1)) IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_V(II) LH2(II,JJ-II) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_H(JJ-II) LH2(II,JJ-II) = 20000 + LDEL_H(JJ-II) ELSE LH1(II,JJ-II) = 0.0 LH2(II,JJ-II) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN LH1(II,II)= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= LH1(II,JJ-II) UP_LH(II) = LH1(II,JJ-II) ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO c-------------------------------------------------------------------- C full profile alignment C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 3) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 SUM=0.0 XMAX1=0.0 XMAX2=0.0 DO K=1,NTRANS SUM = SUM + ( METRIC_1(II-1,K) * METRIC_2(JJ-II-1,K) ) IF ( ( METRIC_1(II-1,K) * METRIC_2(JJ-II-1,K) ) + .GT. XMAX1 ) THEN XMAX1 = ( METRIC_1(II-1,K) * METRIC_2(JJ-II-1,K) ) ENDIF ENDDO OPEN_GAP_1(II-1) = OPEN_GAP_1(II-1) * XMAX1 ELONG_GAP_1(II-1) = ELONG_GAP_1(II-1) * XMAX1 OPEN_GAP_2(JJ-II-1) = OPEN_GAP_2(JJ-II-1) * XMAX1 ELONG_GAP_2(JJ-II-1) = ELONG_GAP_2(JJ-II-1) * XMAX1 MAX_H(JJ-II)= MAX_H(JJ-II)- + (( ELONG_GAP_1(II-1)+ ELONG_GAP_2(JJ-II-1))* 0.5) IF (MAX_H(JJ-II) .GT. (LEFT_LH(JJ-II)- + (( OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .AND. MAX_H(JJ-II) .GT.0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ( (LEFT_LH(JJ-II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GE. MAX_H(JJ-II) .AND. (LEFT_LH(JJ-II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GT.0.0) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF C===================================================================== C store best value and length for vertical deletion C===================================================================== MAX_V(II)= MAX_V(II)- + (( ELONG_GAP_1(II-1)+ ELONG_GAP_2(JJ-II-1))* 0.5) IF ( MAX_V(II) .GT. (UP_LH(II)- + (( OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) .AND. + MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ( (UP_LH(II)- (( OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)) .GE. MAX_V(II) + .AND. (UP_LH(II)- ((OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)).GT.0.0) THEN MAX_V(II)= (UP_LH(II)- + (( OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II) = DIAG_LH(JJ-II) + SUM IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_V(II) LH2(II,JJ-II) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_H(JJ-II) LH2(II,JJ-II) = 20000 + LDEL_H(JJ-II) ELSE LH1(II,JJ-II) = 0.0 LH2(II,JJ-II) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN LH1(II,II)= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= LH1(II,JJ-II) UP_LH(II) = LH1(II,JJ-II) ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C-------------------------------------------------------------------- C take sequences as representatives of family C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 4) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II)= MAX_H(JJ-II)- + ( (ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1)) *0.5) IF (MAX_H(JJ-II) .GT. (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .AND. MAX_H(JJ-II) .GT.0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ( (LEFT_LH(JJ-II) - ((OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)) .GE. MAX_H(JJ-II) + .AND. (LEFT_LH(JJ-II) - ((OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)) .GT. 0.0 ) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF C===================================================================== C store best value and length for vertical deletion C===================================================================== MAX_V(II)= (MAX_V(II)- + ( (ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1)) *0.5)) IF ( MAX_V(II) .GT. (UP_LH(II)- + ((OPEN_GAP_1(II-1) +OPEN_GAP_2(JJ-II-1)) *0.5)) + .AND. MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ( (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)).GE. + MAX_V(II) + .AND. (UP_LH(II)- ((OPEN_GAP_1(II-1) + + OPEN_GAP_2(JJ-II-1)) *0.5)) .GT. 0.0 ) THEN MAX_V(II)=(UP_LH(II)- + ((OPEN_GAP_1(II-1) +OPEN_GAP_2(JJ-II-1)) *0.5)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II)= DIAG_LH(JJ-II) + + (( METRIC_1 (II-1,LSQ_2(JJ-II-1)) + + METRIC_2 (JJ-II-1,LSQ_1(II-1)) ) * 0.5) IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_V(II) LH2(II,JJ-II) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_H(JJ-II) LH2(II,JJ-II) = 20000 + LDEL_H(JJ-II) ELSE LH1(II,JJ-II) = 0.0 LH2(II,JJ-II) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN LH1(II,II)= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= LH1(II,JJ-II) UP_LH(II) = LH1(II,JJ-II) ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C-------------------------------------------------------------------- C take maximal value as consensus C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 5) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II)=MAX_H(JJ-II) - + ((ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1))*0.5) IF (MAX_H(JJ-II) .GT. (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .AND. MAX_H(JJ-II) .GT. 0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ((LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)).GE. + MAX_H(JJ-II) .AND. (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GT. 0.0) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF C===================================================================== C store best value and length for vertical deletion C===================================================================== MAX_V(II)= MAX_V(II) - + ( (ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1))*0.5) IF ( MAX_V(II) .GT. (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) .AND. + MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ((UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)).GE. + MAX_V(II) .AND. (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GT. 0.0) THEN MAX_V(II)= (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II) = DIAG_LH(JJ-II) + + ((MAX_METRIC_1_VAL(II-1)+MAX_METRIC_2_VAL(II-1))*0.5) IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_V(II) LH2(II,JJ-II) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN LH1(II,JJ-II) = MAX_H(JJ-II) LH2(II,JJ-II) = 20000 + LDEL_H(JJ-II) ELSE LH1(II,JJ-II) = 0.0 LH2(II,JJ-II) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN LH1(II,II)= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= LH1(II,JJ-II) UP_LH(II) = LH2(II,JJ-II) ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C==================================================================== ELSE IF (PROFILEMODE .EQ. 6) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 C===================================================================== MAX_H(JJ-II) = (MAX_H(JJ-II) - ELONG_GAP_1(II-1)) IF ((MAX_H(JJ-II) .GE.(LEFT_LH(JJ-II)-OPEN_GAP_1(II-1))) + .AND. (MAX_H(JJ-II) .GT.0.0 )) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF (((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)) .GE. + MAX_H(JJ-II)) + .AND.((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)).GT. + 0.0)) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II)= (LEFT_LH(JJ-II) - OPEN_GAP_1(II-1)) ELSE MAX_H(JJ-II)= 0.0 LDEL_H(JJ-II)= 0 ENDIF c===================================================================== MAX_V(II) = (MAX_V(II) - ELONG_GAP_1(II-1)) IF ( (MAX_V(II) .GE. (UP_LH(II) - OPEN_GAP_1(II-1))) + .AND.( MAX_V(II) .GT. 0.0) ) THEN LDEL_V(II)= LDEL_V(II) + 1 ELSE IF (((UP_LH(II) - OPEN_GAP_1(II-1)).GE. MAX_V(II)) + .AND. ((UP_LH(II) - OPEN_GAP_1(II-1)) .GT. + 0.0)) THEN MAX_V(II)= (UP_LH(II) - OPEN_GAP_1(II-1)) LDEL_V(II)=1 ELSE MAX_V(II)= 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== MAX_D(II)= DIAG_LH(JJ-II) + + SIMORG(LSQ_1(II-1),LSQ_2(JJ-II-1),LSTRCLASS_1(II-1), + LACC_1(II-1),LSTRCLASS_2(JJ-II-1), + LACC_2(JJ-II-1) ) IF ( (MAX_D(II) .GE. MAX_V(II) ) .AND. + (MAX_D(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_D(II) .GT. 0.0 )) THEN LH1(II,JJ-II)= MAX_D(II) ELSE IF ( (MAX_V(II) .GE. MAX_D(II) ) .AND. + (MAX_V(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_V(II) .GT. 0.0 )) THEN LH1(II,JJ-II)= MAX_V(II) LH2(II,JJ-II)= 10000 + LDEL_V(II) ELSE IF ( (MAX_H(JJ-II) .GE. MAX_D(II)) .AND. + (MAX_H(JJ-II) .GE. MAX_V(II)) .AND. + (MAX_H(JJ-II) .GT. 0.0 )) THEN LH1(II,JJ-II)= MAX_H(JJ-II) LH2(II,JJ-II)= 20000 + LDEL_H(JJ-II) ELSE LH1(II,JJ-II)= 0.0 LH2(II,JJ-II)= 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN LH1(II,II)= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= LH1(II,JJ-II) UP_LH(II)= LH1(II,JJ-II) ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C==================================================================== C PROFILE MODE SELECTION END ENDIF C==================================================================== C debug: output the LH (values and trace-back)matrix c call open_file(99,'matrix.dat','new,recl=2000',lerror) c nii=n1+1 ; njj=n2+1 c write(99,*) 'H-MATRIX Hij' c write(99,*)'Index i for Seq. 1' ; write(99,*)'Index j for Seq. 2' c do i=1,nii c write(99,'(i6)')i ; write(99,'(2x,20(i6))')(lh1(i,j),j=1,njj) c enddo c write(99,*)'TRACE-BACK MATRIX' c do i=1,nii c write(99,'(i6)')i ; write(99,'(2x,20(i6))')(lh2(i,j),j=1,njj) c enddo c close(99) C C write data for XPrism3 c call open_file(99,'xprism3.dat','new',lerror) c do I=0,N1+1 c write(99,*) (lh1(i,j),J=0,N2+1) c enddo c do I=0,N1+1 ; do J=0,N2+1 cc write(99,'(2x,i5,2x,i4,f7.2)')i,j,lh1(i,j) cc write(99,'(2x,i5,2x,i4,f7.2)')i,j,lh1(i,j) c trace back cc write(99,'(2x,i5,2x,i4,f7.2,1x,i6)')i,j,lh1(i,j),lh2(i,j) c ENDDO; enddo c close(99) C======================================================================= RETURN END C END SETMATRIX C...................................................................... C...................................................................... C SUB SETMATRIX_FAST SUBROUTINE SETMATRIX_FAST(N1BEG,N1END,N2BEG,N2END,N2,LH2, + BESTVAL,BESTIIPOS,BESTJJPOS) C -------------------------------------------------------- C subroutine SETMATRIX_fast finds LH matrix for maximum homologous C subsequence between any two sequences C generate the homology and traceback matrix C----------------------------------------------------------------------- C LH(.,.,1) is homology score C LH(.,.,2) is traceback value C encoding LDIREC and LDEL: DIREC + LDEL C LH(I,J,1) corresponds to seq postions II=I-1, JJ=J-1 C LH(1,.,1) and LH(.,1,1) are terminal margins C LDIREC 10000,20000,30000,40000 for termination,diagonal,vertical,horizontal C LDEL length of deletion C temporary values: C MAX_H(),MAX_V() best value for horizontal and verical deletions C LDEL_H,LDEL_V length of horizontal and vertical deletion C====================================================================== C JULY 1991 (RS) C MAXDEL restriction removed C see: O. Gotoh, An Improved Algorithm for Matching Biological C Sequences, JMB (1982) 162, 705-708 C----------------------------------------------------------------------- C JUNE 1991 (RS) C matrix setting in a antidiagonal way to run it in parallel C see: Jones R. et.al., Protein Sequence Comparison on the Connection C Machine CM-2, in: Computers and DNA, SFI Studies in the Sciences C of Complexity, Vol VII, Addison-Wesley, 1990 C====================================================================== C C ANTIDIAGONAL SETTING OF THE MATRIX C ================================== C N1,N2: length of sequence 1 and sequence 2 C ADVANTAGE: loop can run in parallel or vectorized C C C ICOUNT 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 C -----------------------------------------------------> C | sequence 1 ====> N1 C | 2345678901234567890123456789 <==IIBEG C | ------------------------------ C 1 | sequence2 | //// / /| C 2 | | 2 |//// / / | C 3 | | 3 |/// / / /| <== JJBEG C 4 | | 4 |// / / //| C 5 | v 5 |/ / / ///| C 6 | 6 | / / ////| C 7 | 7 | / / /////| C 8 | N2 8 |/ / //////| C 9 | -----------------------------| C 10 | C 11 | C 12 | C 13 | C V C C===================================================================== C at each position take the best value of: C C LH(i,j,1)= MAX( LH(i-1,j-1,1) + SIM(i,j) , MAX_H(j) ,MAX_V(i) ,0) C C LH(i-1,j-1,1) : best value of diagonal (no INDEL) C SIM(i,j) : similarity value for position i,j C MAX_H(j) : best value of horizontal INDELs C MAX_V(i) : best value of vertical INDELs C where: C MAX_H(i)=MAX( LH(I-1,J,1) - gap-open , MAX_H(i-1) - gap-elongation , 0) C MAX_V(j)=MAX( LH(I,J-1,1) - gap-open , MAX_V(j-1) - gap-elongation , 0) C NOTE: one has to store the length of the deletion for MAX_H() and MAX_V() C in LDEL_H(j) and LDEL_V() C C C NOTE: C 1) if no INDEL(s) in secondary structure allowed: C GAPOPEN contains PUNISH C 2) internal deletions are (postion dependent ) weighted as: C GAPOPEN + GAPELONG *LENGTH C 3) conservation weights: C gap penalties are dependent on sequence-position(s), so weight C gap-penalties with conservation-weights otherwise the gap penalties C in regions with low conservation are too big C 4) antidiagonal matrix setting: C position in sequence 2 is JJBEG+IIBEG-II: step back in sequence 1 and C down in sequence 2 C C 5) NOT LONGER VALID C if the MAXDEL option is set, one has to check if the number of C INDEL's exceeds the MAXDEL value. C In addition: when the value for opening a gap is higher than C for the elongation, we have to check if the previous length of C the gap is not greater than 0. C That means that for some special cases it's cheaper to punish C the alignment by some open-penalties in a row than to elongate C or continue the alignment in the diagonal. C open a gap if: C 1.) OPEN .gt. ELONG or C 2.) LDELx()+1 .ge. MAXDEL C 3.) but only if LDELx() .eq. 0 C====================================================================== IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C import C ACTUAL SEQ LENGTH INTEGER N1BEG,N1END,N2BEG,N2END,N2 c export c real lh1(0:n1+1,0:n2+1) INTEGER*2 LH2(0:N1+1,0:N2+1) c real lh(0:n1+1,0:n2+1) REAL BESTVAL INTEGER BESTIIPOS,BESTJJPOS c internal INTEGER NSIZE1,NSIZE2 REAL SUM,XMAX1,XMAX2 REAL BESTNOW INTEGER I,J,K,NDAMP,NDIAGONAL,ISMALL_DIM,IBIG_DIM,IIBEG,JJBEG INTEGER LEN_DIAG,IDIAG,II,JJ C======================================================================= C DO SOME STUFF OUTSIDE THE LOOPS: C======================================================================= C initialize C======================================================================= BESTVAL=-99999.0 BESTNOW=-99999.0 BESTIIPOS=-1 BESTJJPOS=-1 NSIZE1=N1END-N1BEG+1 NSIZE2=N2END-N2BEG+1 J=MIN(N1BEG-1,N2BEG-1) K=MAX(N1END+1,N2END+1) DO I=J,K c do i=0,MAXSQ+1 MAX_H(I)=0.0 MAX_V(I)=0.0 LDEL_H(I)=0 LDEL_V(I)=0 LEFT_LH(I)=0.0 UP_LH(I)=0.0 DIAG_LH(I)=0.0 ENDDO C======================================================================= C update the metric values (weights) C this can be done outside the main parallel loop C with this we save at lot of multiplications in the parallel loop C the update can be done in concurrent/vectorized mode C======================================================================= NDAMP=1 IF (PROFILEMODE .EQ. 6) THEN IF (LCONSERV_1) THEN DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1(I) * CONSWEIGHT_1(I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) * CONSWEIGHT_1(I) ENDDO C DAMP PENALTIES CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ELSE DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1 (I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) ENDDO CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ENDIF IF (LCONSERV_2) THEN DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) * CONSWEIGHT_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) * CONSWEIGHT_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ELSE DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ENDIF C============================= ELSE IF (PROFILEMODE .NE. 2) THEN IF (LCONSERV_1) THEN DO K=1,NTRANS DO I=N1BEG,N1END METRIC_1(I,K) = SIMMETRIC_1(I,K) * CONSWEIGHT_1(I) ENDDO ENDDO DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1(I) * CONSWEIGHT_1(I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) * CONSWEIGHT_1(I) ENDDO C DAMP PENALTIES CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ELSE DO K=1,NTRANS DO I=N1BEG,N1END METRIC_1(I,K) = SIMMETRIC_1(I,K) ENDDO ENDDO DO I=N1BEG,N1END OPEN_GAP_1(I) = GAPOPEN_1 (I) ENDDO DO I=N1BEG,N1END ELONG_GAP_1(I)= GAPELONG_1(I) ENDDO CALL DAMP_GAPWEIGHT(N1BEG,N1END,OPEN_GAP_1,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N1BEG,N1END,ELONG_GAP_1,NDAMP,PUNISH) ENDIF ENDIF IF (PROFILEMODE .GE. 2) THEN IF (LCONSERV_2) THEN DO K=1,NTRANS DO I=N2BEG,N2END METRIC_2(I,K) = SIMMETRIC_2(I,K) * CONSWEIGHT_2(I) ENDDO ENDDO DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) * CONSWEIGHT_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) * CONSWEIGHT_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ELSE DO K=1,NTRANS DO I=N2BEG,N2END METRIC_2(I,K) = SIMMETRIC_2(I,K) ENDDO ENDDO DO I=N2BEG,N2END OPEN_GAP_2(I) = GAPOPEN_2(I) ENDDO DO I=N2BEG,N2END ELONG_GAP_2(I)= GAPELONG_2(I) ENDDO CALL DAMP_GAPWEIGHT(N2BEG,N2END,OPEN_GAP_2,NDAMP,PUNISH) CALL DAMP_GAPWEIGHT(N2BEG,N2END,ELONG_GAP_2,NDAMP,PUNISH) ENDIF ENDIF IF (PROFILEMODE .EQ. 5) THEN DO I=N1BEG,N1END MAX_METRIC_1_VAL(I)=-10000.0 DO K=1,NTRANS MAX_METRIC_1_VAL(I)= + MAX(METRIC_1(I,K),MAX_METRIC_1_VAL(I)) ENDDO ENDDO DO J=N2BEG,N2END MAX_METRIC_2_VAL(J)=-10000.0 DO K=1,NTRANS MAX_METRIC_2_VAL(J)= + MAX(METRIC_2(J,K),MAX_METRIC_2_VAL(J)) ENDDO ENDDO ENDIF C IF ( PROFILEMODE .EQ. 3 ) THEN DO I=N1BEG,N1END SUM=0.0 DO K=1,NTRANS SUM= SUM + ( METRIC_1(I,K) * METRIC_1(I,K) ) ENDDO SUM= SQRT(SUM) DO K=1,NTRANS METRIC_1(I,K)= METRIC_1(I,K) / SUM ENDDO ENDDO DO I=N2BEG,N2END SUM=0.0 DO K=1,NTRANS SUM= SUM + ( METRIC_2(I,K) * METRIC_2(I,K) ) ENDDO SUM= SQRT(SUM) DO K=1,NTRANS METRIC_2(I,K)= METRIC_2(I,K) / SUM ENDDO ENDDO ENDIF c====================================================================== NDIAGONAL=NSIZE1+NSIZE2-1 c ndiagonal=iprofend-iprofbeg+1+n2-1 c WRITE(6,'(A,I6)')' NUMBER OF ANTIDIAGONALS: ',NDIAGONAL ISMALL_DIM=MIN(NSIZE1,NSIZE2) IBIG_DIM=MAX(NSIZE1,NSIZE2) IIBEG=N1BEG JJBEG=N2BEG+1 LEN_DIAG=0 C===================================================================== C profile 1 (no profiles or profile for first sequence) C-------------------------------------------------------------------- IF (PROFILEMODE .LE. 1) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG C==================================================================== C THIS LOOP CAN BE EXECUTED IN VECTOR-MODE C====================================================================== C compiler directives for vector C---------------------------------------------------------------------- DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 C====================================================================== C values for diagonal, horizontal and vertical (open and elongation) C===================================================================== C store best value and length for horizontal deletion C===================================================================== MAX_H(JJ-II) = (MAX_H(JJ-II) - ELONG_GAP_1(II-1)) IF ((MAX_H(JJ-II) .GE.(LEFT_LH(JJ-II)-OPEN_GAP_1(II-1))) + .AND. (MAX_H(JJ-II) .GT.0.0 )) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF (((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)) .GE. + MAX_H(JJ-II)) + .AND. ((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)) .GT. + 0.0)) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II)= (LEFT_LH(JJ-II) - OPEN_GAP_1(II-1)) ELSE MAX_H(JJ-II)= 0.0 LDEL_H(JJ-II)= 0 ENDIF c===================================================================== c store best value and length for vertical deletion c===================================================================== MAX_V(II) = (MAX_V(II) - ELONG_GAP_1(II-1)) IF ( (MAX_V(II).GE.(UP_LH(II) - OPEN_GAP_1(II-1))) .AND. + ( MAX_V(II) .GT. 0.0) ) THEN LDEL_V(II)= LDEL_V(II) + 1 ELSE IF ( ((UP_LH(II)-OPEN_GAP_1(II-1)) .GE. MAX_V(II)) + .AND. ((UP_LH(II) - OPEN_GAP_1(II-1)) .GT. + 0.0)) THEN MAX_V(II)= (UP_LH(II) - OPEN_GAP_1(II-1)) LDEL_V(II)=1 ELSE MAX_V(II)= 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II)= DIAG_LH(JJ-II)+METRIC_1(II-1,LSQ_2(JJ-II-1)) IF ( (MAX_D(II) .GE. MAX_V(II) ) .AND. + (MAX_D(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_D(II) .GT. 0.0 )) THEN BESTNOW= MAX_D(II) ELSE IF ( (MAX_V(II) .GE. MAX_D(II) ) .AND. + (MAX_V(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_V(II) .GT. 0.0 )) THEN BESTNOW= MAX_V(II) LH2(II,JJ-II)= 10000 + LDEL_V(II) ELSE IF ( (MAX_H(JJ-II) .GE. MAX_D(II)) .AND. + (MAX_H(JJ-II) .GE. MAX_V(II)) .AND. + (MAX_H(JJ-II) .GT. 0.0 )) THEN BESTNOW= MAX_H(JJ-II) LH2(II,JJ-II)= 20000 + LDEL_H(JJ-II) ELSE BESTNOW= 0.0 LH2(II,JJ-II)= 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN BESTNOW=0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= BESTNOW UP_LH(II)= BESTNOW IF (BESTNOW .GT. BESTVAL) THEN BESTVAL=BESTNOW BESTIIPOS=II BESTJJPOS=JJ-II ENDIF C END DIAGONAL ENDDO C IF (LSAMESEQ) THEN C X= ( FLOAT(IIBEG)/ 2.0) + (FLOAT(JJBEG)/2.0) C I=NINT(X) C LH1(I,I) = 0.0 ; LH2(I,I)= 0 C LEFT_LH(I)= 0.0 c up_lh(i) = 0.0 c WRITE(6,*)iibeg,jjbeg,i c endif C==================================================================== C next antidiagonal C==================================================================== ENDDO C-------------------------------------------------------------------- C profile 2 (profile for sequence 2) C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 2) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II) = (MAX_H(JJ-II) - ELONG_GAP_2(JJ-II-1)) IF (MAX_H(JJ-II) .GT. + (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)) .AND. + MAX_H(JJ-II) .GT.0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ( (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)) .GE. + MAX_H(JJ-II) .AND. + (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)) .GT. + 0.0) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II)-OPEN_GAP_2(JJ-II-1)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF c===================================================================== c store best value and length for vertical deletion c===================================================================== MAX_V(II) = (MAX_V(II) - ELONG_GAP_2(JJ-II-1)) IF ( MAX_V(II).GT.(UP_LH(II) - OPEN_GAP_2(JJ-II-1)) .AND. + MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ((UP_LH(II) - OPEN_GAP_2(JJ-II-1)).GE. MAX_V(II) + .AND. (UP_LH(II) - OPEN_GAP_2(JJ-II-1)) .GT. + 0.0) THEN MAX_V(II)= (UP_LH(II) - OPEN_GAP_2(JJ-II-1)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF c====================================================================== c which value is the best (diagonal,horizontal or vertical) c====================================================================== MAX_D(II)= DIAG_LH(JJ-II)+METRIC_2(JJ-II-1,LSQ_1(II-1)) IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN BESTNOW = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN BESTNOW = MAX_V(II) LH2(II,JJ-II) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN BESTNOW = MAX_H(JJ-II) LH2(II,JJ-II) = 20000 + LDEL_H(JJ-II) ELSE BESTNOW = 0.0 LH2(II,JJ-II) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN BESTNOW= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= BESTNOW UP_LH(II) = BESTNOW IF (BESTNOW .GT. BESTVAL) THEN BESTVAL=BESTNOW BESTIIPOS=II BESTJJPOS=JJ-II ENDIF ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO c-------------------------------------------------------------------- C full profile alignment C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 3) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 SUM=0.0 XMAX1=0.0 XMAX2=0.0 DO K=1,NTRANS SUM = SUM + ( METRIC_1(II-1,K) * METRIC_2(JJ-II-1,K) ) C IF ( METRIC_1(II-1,K) .GT. XMAX1 ) THEN C XMAX1 = METRIC_1(II-1,K) C ENDIF C IF ( METRIC_2(JJ-II-1,K) .GT. XMAX2 ) THEN C XMAX2 = METRIC_1(JJ-II-1,K) C ENDIF ENDDO C OPEN_GAP_1(II-1) = GAPOPEN_1(II-1) * XMAX1 C ELONG_GAP_1(II-1) = GAPELONG_1(II-1) * XMAX1 C OPEN_GAP_2(JJ-II-1) = GAPOPEN_2(JJ-II-1) * XMAX2 C ELONG_GAP_2(JJ-II-1) = GAPELONG_2(JJ-II-1) * XMAX2 c WRITE(6,*)ii,jj-ii,sum c MAX_D(II) = DIAG_LH(JJ-II) + (SUM/NTRANS) MAX_H(JJ-II)= MAX_H(JJ-II)- + (( ELONG_GAP_1(II-1)+ ELONG_GAP_2(JJ-II-1))* 0.5) IF (MAX_H(JJ-II) .GT. (LEFT_LH(JJ-II)- + (( OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .AND. MAX_H(JJ-II) .GT.0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ( (LEFT_LH(JJ-II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GE. MAX_H(JJ-II) .AND. (LEFT_LH(JJ-II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GT.0.0) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF C===================================================================== C store best value and length for vertical deletion C===================================================================== MAX_V(II)= MAX_V(II)- + (( ELONG_GAP_1(II-1)+ ELONG_GAP_2(JJ-II-1))* 0.5) IF ( MAX_V(II) .GT. (UP_LH(II)- + (( OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) .AND. + MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ( (UP_LH(II)- (( OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)) .GE. MAX_V(II) + .AND. (UP_LH(II)- ((OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)).GT.0.0) THEN MAX_V(II)= (UP_LH(II)- + (( OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II) = DIAG_LH(JJ-II) + SUM IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN BESTNOW = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN BESTNOW = MAX_V(II) LH2(II,JJ-II) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN BESTNOW = MAX_H(JJ-II) LH2(II,JJ-II) = 20000 + LDEL_H(JJ-II) ELSE BESTNOW = 0.0 LH2(II,JJ-II ) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN BESTNOW= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= BESTNOW UP_LH(II) = BESTNOW IF (BESTNOW .GT. BESTVAL) THEN BESTVAL=BESTNOW BESTIIPOS=II BESTJJPOS=JJ-II ENDIF ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C-------------------------------------------------------------------- C take sequences as representatives of family C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 4) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II)= MAX_H(JJ-II)- + ( (ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1)) *0.5) IF (MAX_H(JJ-II) .GT. (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .AND. MAX_H(JJ-II) .GT.0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ( (LEFT_LH(JJ-II) - ((OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)) .GE. MAX_H(JJ-II) + .AND. (LEFT_LH(JJ-II) - ((OPEN_GAP_1(II-1)+ + OPEN_GAP_2(JJ-II-1))*0.5)) .GT. 0.0 ) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF C===================================================================== C STORE BEST VALUE AND LENGTH FOR VERTICAL DELETION C===================================================================== MAX_V(II)= (MAX_V(II)- + ( (ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1)) *0.5)) IF ( MAX_V(II) .GT. (UP_LH(II)- + ((OPEN_GAP_1(II-1) +OPEN_GAP_2(JJ-II-1)) *0.5)) + .AND. MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ( (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GE. MAX_V(II) + .AND. (UP_LH(II)- ((OPEN_GAP_1(II-1) + + OPEN_GAP_2(JJ-II-1)) *0.5)) .GT. 0.0 ) THEN MAX_V(II)=(UP_LH(II)- + ((OPEN_GAP_1(II-1) +OPEN_GAP_2(JJ-II-1)) *0.5)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II)= DIAG_LH(JJ-II) + + (( METRIC_1 (II-1,LSQ_2(JJ-II-1)) + + METRIC_2 (JJ-II-1,LSQ_1(II-1)) ) * 0.5) IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN BESTNOW = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN BESTNOW = MAX_V(II) LH2(II,JJ-II ) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN BESTNOW = MAX_H(JJ-II) LH2(II,JJ-II ) = 20000 + LDEL_H(JJ-II) ELSE BESTNOW = 0.0 LH2(II,JJ-II ) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN BESTNOW= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= BESTNOW UP_LH(II) = BESTNOW IF (BESTNOW .GT. BESTVAL) THEN BESTVAL=BESTNOW BESTIIPOS=II BESTJJPOS=JJ-II ENDIF ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C-------------------------------------------------------------------- C take maximal value as consensus C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 5) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 MAX_H(JJ-II)=MAX_H(JJ-II) - + ((ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1))*0.5) IF (MAX_H(JJ-II) .GT. (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .AND. MAX_H(JJ-II) .GT. 0.0 ) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF ((LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GE.MAX_H(JJ-II) .AND. (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GT. 0.0) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II) = (LEFT_LH(JJ-II) - + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) ELSE MAX_H(JJ-II) = 0.0 LDEL_H(JJ-II)= 0 ENDIF C===================================================================== C store best value and length for vertical deletion C===================================================================== MAX_V(II)= MAX_V(II) - + ( (ELONG_GAP_1(II-1)+ELONG_GAP_2(JJ-II-1))*0.5) IF ( MAX_V(II) .GT. (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) .AND. + MAX_V(II) .GT. 0.0 ) THEN LDEL_V(II) = LDEL_V(II) + 1 ELSE IF ((UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GE.MAX_V(II) .AND. (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) + .GT. 0.0) THEN MAX_V(II)= (UP_LH(II)- + ((OPEN_GAP_1(II-1)+OPEN_GAP_2(JJ-II-1))*0.5)) LDEL_V(II)= 1 ELSE MAX_V(II) = 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== C which value is the best (diagonal,horizontal or vertical) C store traceback C LDIREC and LDEL are coded in one number C====================================================================== MAX_D(II) = DIAG_LH(JJ-II) + + ((MAX_METRIC_1_VAL(II-1) + + MAX_METRIC_2_VAL(II-1)) * 0.5) IF ( MAX_D(II) .GE. MAX_V(II) .AND. + MAX_D(II) .GE. MAX_H(JJ-II) .AND. + MAX_D(II) .GT. 0.0) THEN BESTNOW = MAX_D(II) ELSE IF ( MAX_V(II) .GT. MAX_D(II) .AND. + MAX_V(II) .GT. MAX_H(JJ-II) .AND. + MAX_V(II) .GT. 0.0) THEN BESTNOW = MAX_V(II) LH2(II,JJ-II ) = 10000 + LDEL_V(II) ELSE IF ( MAX_H(JJ-II) .GT. MAX_D(II) .AND. + MAX_H(JJ-II) .GT. MAX_V(II) .AND. + MAX_H(JJ-II) .GT. 0.0) THEN BESTNOW = MAX_H(JJ-II) LH2(II,JJ-II ) = 20000 + LDEL_H(JJ-II) ELSE BESTNOW = 0.0 LH2(II,JJ-II ) = 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN BESTNOW= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= BESTNOW UP_LH(II) = BESTNOW IF (BESTNOW .GT. BESTVAL) THEN BESTVAL=BESTNOW BESTIIPOS=II BESTJJPOS=JJ-II ENDIF ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C==================================================================== ELSE IF (PROFILEMODE .EQ. 6) THEN DO IDIAG=1,NDIAGONAL IF ( IDIAG .LE. ISMALL_DIM) THEN LEN_DIAG=LEN_DIAG+1 ELSE IF ( IDIAG .GT. IBIG_DIM ) THEN LEN_DIAG=LEN_DIAG-1 ENDIF IF (IDIAG .LE. NSIZE1) THEN IIBEG=IIBEG+1 ELSE JJBEG=JJBEG+1 ENDIF JJ=JJBEG+IIBEG CPAR$ DO_PARALLEL cccC$DIR PARALLEL cvd$ nodepchk DO II=IIBEG,IIBEG-LEN_DIAG+1,-1 C===================================================================== MAX_H(JJ-II) = (MAX_H(JJ-II) - ELONG_GAP_1(II-1)) IF ((MAX_H(JJ-II).GE.(LEFT_LH(JJ-II)-OPEN_GAP_1(II-1))) + .AND. (MAX_H(JJ-II) .GT.0.0 )) THEN LDEL_H(JJ-II)= LDEL_H(JJ-II)+1 ELSE IF (((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)) .GE. + MAX_H(JJ-II)) + .AND. ((LEFT_LH(JJ-II)-OPEN_GAP_1(II-1)) + .GT. 0.0)) THEN LDEL_H(JJ-II)= 1 MAX_H(JJ-II)= (LEFT_LH(JJ-II) - OPEN_GAP_1(II-1)) ELSE MAX_H(JJ-II)= 0.0 LDEL_H(JJ-II)= 0 ENDIF c===================================================================== MAX_V(II) = (MAX_V(II) - ELONG_GAP_1(II-1)) IF ( (MAX_V(II).GE.(UP_LH(II)-OPEN_GAP_1(II-1))) .AND. + ( MAX_V(II) .GT. 0.0) ) THEN LDEL_V(II)= LDEL_V(II) + 1 ELSE IF ( ((UP_LH(II)-OPEN_GAP_1(II-1)).GE.MAX_V(II)) + .AND. ((UP_LH(II) - OPEN_GAP_1(II-1)) + .GT. 0.0)) THEN MAX_V(II)= (UP_LH(II) - OPEN_GAP_1(II-1)) LDEL_V(II)=1 ELSE MAX_V(II)= 0.0 LDEL_V(II)= 0 ENDIF C====================================================================== MAX_D(II)= DIAG_LH(JJ-II) + + SIMORG(LSQ_1(II-1),LSQ_2(JJ-II-1),LSTRCLASS_1(II-1), + LACC_1(II-1),LSTRCLASS_2(JJ-II-1), + LACC_2(JJ-II-1) ) IF ( (MAX_D(II) .GE. MAX_V(II) ) .AND. + (MAX_D(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_D(II) .GT. 0.0 )) THEN BESTNOW= MAX_D(II) ELSE IF ( (MAX_V(II) .GE. MAX_D(II) ) .AND. + (MAX_V(II) .GE. MAX_H(JJ-II)) .AND. + (MAX_V(II) .GT. 0.0 )) THEN BESTNOW= MAX_V(II) LH2(II,JJ-II)= 10000 + LDEL_V(II) ELSE IF ( (MAX_H(JJ-II) .GE. MAX_D(II)) .AND. + (MAX_H(JJ-II) .GE. MAX_V(II)) .AND. + (MAX_H(JJ-II) .GT. 0.0 )) THEN BESTNOW= MAX_H(JJ-II) LH2(II,JJ-II)= 20000 + LDEL_H(JJ-II) ELSE BESTNOW= 0.0 LH2(II,JJ-II)= 0 ENDIF IF (LSAMESEQ .AND. II .EQ. JJ-II) THEN BESTNOW= 0.0 LH2(II,II)= 0 ENDIF DIAG_LH(JJ-II)= UP_LH(II) LEFT_LH(JJ-II)= BESTNOW UP_LH(II)= BESTNOW IF (BESTNOW .GT. BESTVAL) THEN BESTVAL=BESTNOW BESTIIPOS=II BESTJJPOS=JJ-II ENDIF ENDDO C==================================================================== C next antidiagonal C==================================================================== ENDDO C==================================================================== C PROFILE MODE SELECTION END ENDIF C==================================================================== C debug: output the LH (values and trace-back)matrix c call open_file(99,'matrix.dat','new,recl=2000',lerror) c nii=n1+1 ; njj=n2+1 c write(99,*)'TRACE-BACK MATRIX' c do i=1,nii c write(99,'(i6)')i ; write(99,'(2x,20(i6))')(lh2(i,j),j=1,njj) c enddo c close(99) C======================================================================= RETURN END C END SETMATRIX_FAST C...................................................................... C...................................................................... C SUB SETPIECES SUBROUTINE SETPIECES(MAXALSQ,ALI_1,ALI_2,LENALI,IFIR, + ILAS,JFIR,JLAS,IPIECE,MAXPIECES,NPIECES) C RS 89 C cut a sequence alignment in pieces if there are insertions/deletions C or chain-breaks (used in ALITOSTRUCRMS) C CAUTION : dont use an alignment like in MAXHOM (HSSP) C (no insertion in SEQ 1) C alignment must be : C ALI_1 : ACVEFG....FGHKLIPYDFGAS!KLHKLH C ALI_2 : ACAEFGAAAAFGH...PYDFGAS!KLHKLH C | | | | | | | | C PIECE : | 1 | |2| | 3 | |4 | C insertions must be marked by "." C chain-breaks by "!" C INPUT: C ALI_1,ALI_2 : sequence string for seq 1 and seq 2 (CHARACTER*(*)) C LENALI : length of the total alignment (include insertions) C IFIR,ILAS : first and last position of seq 1 (absolut position) C JFIR,JLAS : first and last position of seq 2 (absolut position) C OUTPUT: C IPIECE(2,2,MAXPIECES) : 1. index= begin and end of piece C 2. index= sequence 1 or sequence 2 C 3. index= number of piece C NPIECES : total number of pieces C INTERNAL: C ICOUNT : count alignend positions to get absolute position IMPLICIT NONE INTEGER MAXPIECES,MAXALSQ CHARACTER*1 ALI_1(MAXALSQ),ALI_2(MAXALSQ) INTEGER IPIECE(2,2,MAXPIECES),NPIECES, + LENALI,IFIR,ILAS,JFIR,JLAS C INTERNAL INTEGER IBEG,IEND,JBEG,JEND,ICOUNT,ISTART,ISTOP,K,I c init IBEG=IFIR IEND=ILAS JBEG=JFIR JEND=JLAS NPIECES=1 ICOUNT=0 C D WRITE(6,*)(ALI_1(I),I=1,LENALI) C D WRITE(6,*)(ALI_2(I),I=1,LENALI) C AUTION: FIRST AND LAST CHARACTER IN THE ALIGNMENT IS '<' IF (ALI_1(1).EQ.'<' .AND. ALI_2(1).EQ.'<') THEN ISTART=2 ELSE ISTART=1 ENDIF IF (ALI_1(LENALI).EQ.'<' .AND. ALI_2(LENALI).EQ.'<') THEN ISTOP=LENALI-1 ELSE ISTOP=LENALI ENDIF IF (ALI_1(ISTART).EQ. '!' .AND. ALI_2(ISTART) .EQ. '!') THEN ISTART=ISTART+1 IBEG=IBEG+1 JBEG=JBEG+1 ENDIF IF (ALI_1(ISTOP).EQ. '!' .AND. ALI_2(ISTOP) .EQ. '!') THEN ISTOP=ISTOP-1 IEND=ILAS-1 JEND=JEND-1 ILAS=IEND JLAS=JEND ENDIF C SET DEFAULT TO BEGIN AND END OF ALIGNMENT IPIECE(1,1,NPIECES)=IBEG IPIECE(2,1,NPIECES)=IEND IPIECE(1,2,NPIECES)=JBEG IPIECE(2,2,NPIECES)=JEND DO K=ISTART,ISTOP CALL CHECKRANGE(NPIECES,1,MAXPIECES,'MAXPIECES ','SETPIECES ') C search for an insertion in SEQuence 1 IF (ALI_1(K).EQ.'.' .AND. ALI_2(K).NE. '!') THEN C if: set end of previous piece, store piece in IPIECE and set begin C of next piece IF (ALI_1(K-1).NE.'.' .AND. ALI_1(K-1).NE.'!') THEN IEND=IBEG+ICOUNT-1 JEND=JBEG+ICOUNT-1 IPIECE(1,1,NPIECES)=IBEG IPIECE(2,1,NPIECES)=IEND IPIECE(1,2,NPIECES)=JBEG IPIECE(2,2,NPIECES)=JEND C D WRITE(6,*)' 1 SET PIECE : ',IBEG,IEND,JBEG,JEND C D WRITE(6,*)(ALI_1(I),I=1,K) WRITE(6,*)(ALI_2(I),I=1,K) IBEG=IEND+1 JBEG=JEND+2 NPIECES=NPIECES+1 ELSE JBEG=JBEG+1 ENDIF ICOUNT=0 C search for an insertion in SEQuence 2 ELSE IF (ALI_2(K).EQ.'.' .AND. ALI_1(K).NE.'!') THEN C if: set end of previous piece, store piece in IPIECE and set begin C of next piece IF (ALI_2(K-1).NE.'.' .AND. ALI_2(K-1).NE.'!') THEN IEND=IBEG+ICOUNT-1 JEND=JBEG+ICOUNT-1 IPIECE(1,1,NPIECES)=IBEG IPIECE(2,1,NPIECES)=IEND IPIECE(1,2,NPIECES)=JBEG IPIECE(2,2,NPIECES)=JEND C D WRITE(6,*)' 2 SET PIECE : ',IBEG,IEND,JBEG,JEND C D WRITE(6,*)(ALI_1(I),I=1,K) ; WRITE(6,*)(ALI_2(I),I=1,K) IBEG=IEND+2 JBEG=JEND+1 NPIECES=NPIECES+1 ELSE IBEG=IBEG+1 ENDIF ICOUNT=0 C search for a chain-break in SEQuence 1 or SEQuence 2 and set piece ELSE IF (ALI_1(K).EQ.'!' .OR. ALI_2(K).EQ.'!') THEN IF (ALI_1(K-1).NE.'.' .AND. ALI_2(K-1).NE.'.') THEN IEND=IBEG+ICOUNT-1 JEND=JBEG+ICOUNT-1 IPIECE(1,1,NPIECES)=IBEG IPIECE(2,1,NPIECES)=IEND IPIECE(1,2,NPIECES)=JBEG IPIECE(2,2,NPIECES)=JEND C D WRITE(6,*)' 3 SET PIECE : ',IBEG,IEND,JBEG,JEND C D WRITE(6,*)(ALI_1(I),I=1,K) ; WRITE(6,*)(ALI_2(I),I=1,K) NPIECES=NPIECES+1 IBEG=IEND+1 JBEG=JEND+1 ENDIF IF (ALI_1(K).EQ.'!' .AND. ALI_2(K).EQ.'!') THEN IBEG=IBEG+1 JBEG=JBEG+1 ELSE IF (ALI_1(K).EQ.'!') THEN IBEG=IBEG+1 ELSE IF (ALI_2(K).EQ.'!') THEN JBEG=JBEG+1 ENDIF ICOUNT=0 ELSE ICOUNT=ICOUNT+1 ENDIF ENDDO IPIECE(1,1,NPIECES)=IBEG IPIECE(2,1,NPIECES)=ILAS IPIECE(1,2,NPIECES)=JBEG IPIECE(2,2,NPIECES)=JLAS cd WRITE(6,*)' 4 set piece : ',ibeg,ilas,jbeg,jlas RETURN END C END SETPIECES C...................................................................... C...................................................................... C SUB SKIP_BRKCHAIN SUBROUTINE SKIP_BRKCHAIN(KIN,RLEN,FIRSTLINE,ERROR) C 15.5.93 IMPLICIT NONE C Import INTEGER KIN,RLEN CHARACTER*(*) FIRSTLINE C EXPORT LOGICAL ERROR C .. AND NEW LOCATION OF READ POINTER FOR UNIT KIN C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. LINE = FIRSTLINE DO WHILE ( LINE(1:3) .NE. 'TER' ) READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') ' ** error reading BRK file **' 2 CONTINUE RETURN END C END SKIP_BRKCHAIN C...................................................................... C...................................................................... C SUB SKIP_DSSPCHAIN SUBROUTINE SKIP_DSSPCHAIN(KIN,RLEN,FIRSTLINE,ERROR) C 15.5.93 IMPLICIT NONE C Import INTEGER KIN,RLEN CHARACTER*(*) FIRSTLINE C EXPORT LOGICAL ERROR C .. AND NEW LOCATION OF READ POINTER FOR UNIT KIN C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF ERROR = .FALSE. LINE = FIRSTLINE DO WHILE ( LINE(14:14) .NE. '!' ) READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(A)') ' ** error reading DSSP file **' 2 CONTINUE RETURN END C END SKIP_DSSPCHAIN C...................................................................... C...................................................................... C SUB SKIP_HSSPCHAIN SUBROUTINE SKIP_HSSPCHAIN(KIN,RLEN,FIRSTLINE,ERROR) C 18.5.93 IMPLICIT NONE C Import INTEGER KIN,RLEN CHARACTER*(*) FIRSTLINE C EXPORT LOGICAL ERROR C .. AND NEW LOCATION OF READ POINTER FOR UNIT KIN C INTERNAL INTEGER LINELEN PARAMETER (LINELEN= 1000) CHARACTER*(LINELEN) LINE *----------------------------------------------------------------------* IF ( LINELEN .LT. RLEN ) THEN WRITE(6,'(1X,A)') 1 ' *** record length of input file too big ***' GOTO 1 ENDIF LINE = FIRSTLINE DO WHILE ( LINE(15:15) .NE. '!' ) READ(KIN,'(A)',ERR=1,END=2) LINE ENDDO GOTO 2 1 ERROR = .TRUE. WRITE(6,'(a)') '*** ERROR SKIP_HSSPCHAIN reading HSSP file' 2 CONTINUE RETURN END C END SKIP_HSSPCHAIN C...................................................................... C...................................................................... C SUB STR_TO_INT SUBROUTINE STR_TO_INT(NRES,STRUC,LSTRUC,STRSTATES) IMPLICIT NONE INTEGER NRES,LSTRUC(*) CHARACTER*(*) STRUC(*),STRSTATES c internal INTEGER I c======================================================================= DO I=1,NRES LSTRUC(I)=INDEX(STRSTATES,STRUC(I)) IF (LSTRUC(I) .EQ. 0) THEN WRITE(6,*)' unknown structure in str_to_int:',struc(i),':' WRITE(6,*)STRSTATES CALL FLUSH_UNIT(6) STOP ENDIF ENDDO RETURN END C END STR_TO_INT C...................................................................... C...................................................................... C SUB STR_TO_CLASS SUBROUTINE STR_TO_CLASS(MAXSTATES,STR_STATES,NRES, + STRUC,CLASS,ICLASS) C convert DSSP-secondary structure symbol to secondary structure C classes (L,H,E..) ; first symbol in str_states(x) C str_states(1)='L TCStclss' C str_states(2)='EBAPMebapm' C str_states(3)='HGIhgiiiii' C given STRUC, what is the class number ICLASS and class representative CLASS ? C undefined states is set CLASS='U', ICLASS=0 C IMPLICIT NONE C input INTEGER MAXSTATES,NRES CHARACTER*(*) STRUC(*),STR_STATES(*) c output CHARACTER*(*) CLASS INTEGER ICLASS(*) C internal INTEGER I,J C====================================================================== DO I=1,NRES DO J=1,MAXSTATES IF ( INDEX(STR_STATES(J),STRUC(I)) .NE. 0) THEN GOTO 100 ENDIF ENDDO c iclass(i)=0 c class(i:i)='U' c WRITE(6,*)' symbol not known in STR_TO_CLASS: ',struc(i) 100 ICLASS(I)=J CLASS(I:I)=STR_STATES(J)(1:1) c WRITE(6,*)i,j,iclass(i),str_states(j)(1:1) ENDDO RETURN END C END STR_TO_CLASS C...................................................................... C...................................................................... C SUB StringLen SUBROUTINE STRINGLEN(CSTRING,ILEN) C searches for the last non-blank character in a string CHARACTER*(*) CSTRING INTEGER ILEN ILEN=LEN(CSTRING) DO WHILE(ILEN.GT.0 .AND. CSTRING(ILEN:ILEN).EQ. ' ') ILEN=ILEN-1 ENDDO RETURN END C END STRINGLEN C...................................................................... C...................................................................... C SUB STRTRIM SUBROUTINE STRTRIM(SOURCE,DEST,LENGTH) C StrTrim(Source,Dest,Length): Dest=Source(-1:-1)//filled with blanks C Length=length of Source(-1:-1) C ------------------------------------------------------------------- CHARACTER*(*) SOURCE,DEST CHARACTER*500 TEMPSTRING TEMPSTRING=' ' LENGTH=0 ISTART=1 ISTOP= LEN(SOURCE) IF (ISTOP .GT. LEN(TEMPSTRING) ) THEN WRITE(6,*)' STRTRIM: tempstring too short' STOP ENDIF I=1 DO WHILE (SOURCE(I:I) .EQ. ' ') I=I+1 IF (I .GT. LEN(SOURCE)) RETURN ENDDO ISTART=I I=LEN(SOURCE) DO WHILE (SOURCE(I:I) .EQ. ' ') I=I-1 IF (I .LE. 1)GOTO 10 ENDDO 10 ISTOP=I LENGTH=ISTOP-ISTART+1 TEMPSTRING(1:)= SOURCE(ISTART:ISTOP) DEST(1:LENGTH)=TEMPSTRING(1:LENGTH) DO I=LENGTH+1,LEN(DEST) DEST(I:I)=' ' ENDDO RETURN END C END STRTRIM C...................................................................... C...................................................................... C SUB STRPOS SUBROUTINE STRPOS(SOURCE,ISTART,ISTOP) C StrPos(Source,IStart,IStop): Finds the positions of the first and C last non-blank/non-TAB in Source. IStart=IStop=0 for empty Source CHARACTER*(*) SOURCE INTEGER ISTART,ISTOP ISTART=0 ISTOP=0 DO J=1,LEN(SOURCE) IF (SOURCE(J:J).NE.' ') THEN ISTART=J GOTO 20 ENDIF ENDDO RETURN 20 DO J=LEN(SOURCE),1,-1 IF (SOURCE(J:J).NE.' ') THEN ISTOP=J RETURN ENDIF ENDDO ISTART=0 ISTOP=0 RETURN END C END STRPOS C...................................................................... C...................................................................... C SUB STRUC_CLASS SUBROUTINE STRUC_CLASS(MAXSTRSTATES,STR_CLASSES, + STRUC,CLASS,ICLASS) C given struc, what is the class number ICLASS and class C representative CLASS ? C undefined states is set CLASS='U', ICLASS=0 INTEGER MAXSTRSTATES,ICLASS C---- br 99.03: watch hard_coded here, see maxhom.param CHARACTER*10 STR_CLASSES(MAXSTRSTATES) C---- --> REASON: the following produces warnings on SGI C CHARACTER*(*) STR_CLASSES(MAXSTRSTATES) CHARACTER STRUC,CLASS C INTERNAL INTEGER I CLASS='U' ICLASS=0 DO I=1,MAXSTRSTATES IF (INDEX(STR_CLASSES(I),STRUC) .NE. 0 ) THEN ICLASS=I CLASS=STR_CLASSES(I)(1:1) RETURN ENDIF ENDDO C WRITE(6,*)' SYMBOL NOT KNOWN IN STRUC_CLASS: ',STRUC RETURN END C END STRUC_CLASS C...................................................................... C...................................................................... C SUB STRTOINT SUBROUTINE STRTOINT(NRES,CSTR,LSTR,LDSSP) IMPLICIT NONE INTEGER NRES,LSTR(*) CHARACTER CSTR(*) LOGICAL LDSSP C internal INTEGER I CHARACTER*25 STRSTATES STRSTATES=' LTCSltcsEBAPMebapmHGIhgi' C======================================================================= IF (LDSSP) THEN DO I=1,NRES LSTR(I)=INDEX(STRSTATES,CSTR(I)) IF (LSTR(I).EQ.0) THEN WRITE(6,*)' UNKNOWN STRUCTURE IN STRTOINT: ',CSTR(I) STOP ENDIF ENDDO ELSE DO I=1,NRES LSTR(I)=0 CSTR(I)='U' ENDDO ENDIF RETURN END C END STRTOINT C...................................................................... C...................................................................... C SUB SWISSPROTRELEASE SUBROUTINE SWISSPROTRELEASE(KIN,INFILE,RELEASE,NENTRIES, + NRESIDUE) C IMPORT CHARACTER*(*) INFILE INTEGER KIN C EXPORT REAL RELEASE INTEGER NENTRIES,NRESIDUE c internal CHARACTER*200 LINE LOGICAL LERROR C...................................................................... C reads the latest version number and number of sequences of SWISSPROT C on VAX at EMBL logical filename is SWISS_PROT$RELEASE:RELNOTES.DOC C this file contains somwhere the following lines: C Release Date Number of entries Nb of amino acids C C 3.0 11/86 4160 969 641 C 4.0 04/87 4387 1 036 010 C 12.0 10/89 12305 3 797 482 C CSome 1466 new sequences have been added since the ............ C...................................................................... CALL OPEN_FILE(KIN,INFILE,'OLD,READONLY',lerror) IF (LERROR .EQV. .TRUE.) THEN RELEASE= 0.0 NENTRIES=0 NRESIDUE=0 WRITE(6,*)'Error: No SwissProt release info found ' RETURN ENDIF LINE=' ' DO WHILE( INDEX (LINE,'Release Date Number of').EQ.0 .AND. + INDEX (LINE,'Release Date ').EQ.0) READ(KIN,'(A)')LINE ENDDO DO I=1,3 READ(KIN,'(A)')LINE ENDDO DO WHILE (LINE .NE. ' ') READ(LINE,'(3X,F4.1,25X,I6,10X,I12)')RELEASE,NENTRIES,NRESIDUE READ(KIN,'(A)')LINE ENDDO CLOSE(KIN) RETURN END C END SWISSPROTRELEASE C...................................................................... C...................................................................... C SUB TRACE SUBROUTINE TRACE(ISET,ND1,ND2,LH2,IPOSBEG,JPOSBEG,VALUE, + II,JJ,NTEST,SDEV,IALIGN,NRECORD) C NOTE: TRACE will aplpy threshold, and return LCONSIDER=.FALSE. C if below threshold! C C coming in with protein IALIGN+1, i.e. IALIGN gives alignments C so far! C=================================================================== C LDIREC and LDEL are the traceback indices unpacked C from the LH matrix. C LDIREC=1 indicates an unmatched terminal sequence, C LDIREC=2 indicates a diagonal optimal path, C LDIREC=3 indicates a vertical path in the matrix, C LDIREC=4 indicates a horizonal traceback path. C LDEL is the length of the deletion/insertion for LDIREC=3 or 4 C CVAL accumulated similarity values C CMAXVAL accumulated self matches ==> similarity C------------------------------------------------------------------- C PROFILEMODE C 0: no profiles, just a simple sequence alignment C 1: profile for sequence 1 (and not for sequence 2) C 2: profile for sequence 2 (and not for sequence 1) C 3: full alignment of two profiles, without taking into account the C sequence (structure,I/O...) information C 4: take the sequences as a representative of the family C 5: take the maximal value at each position as a "consensus" sequence C------------------------------------------------------------------- C weighted gap: C here opening and elongation are weighted C LDIREC=4 : horizontal deletion C LDIREC=3 : vertical deletion C C =======================================> II (matrix position) C | \ C | \ \ C | \ \ C | \ ^ open: II C | \ | elongation: JJ-LDEL+1 ==> JJ-1 C V \ LDEL | C JJ \<----------------------- II,JJ C \ C open: II-LDEL \ C elongation: II-LDEL+1 ====> II-1 \ C \ C C horizontal gap-open: C GAPOPEN(II-LDEL) * cons-weight (II-LDEL) C horizontal gap-elong: C sum (GAPELONG(II-LDEL+1) * cons-weight (II-LDEL+1) ===> (II-1) ) C------------------------------------------------------------------- IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C import INTEGER ISET,ND1,ND2 C REAL LH1(0:*) INTEGER*2 LH2(0:ND1,0:ND2) C INTEGER*2 LH2(0:*) C REAL LH(0:*) REAL VALUE,SDEV INTEGER II,JJ,NTEST,IPOSBEG,JPOSBEG C export C alignment attributes of the MAXALIGNS best alignments INTEGER IALIGN,NRECORD C======================================================================= C internal INTEGER MAXTRACE_LOC PARAMETER (MAXTRACE_LOC= 6543210) C PARAMETER (MAXTRACE_LOC= 150000) C PARAMETER (MAXTRACE_LOC= 30000000) CHARACTER*1 SK_1,SK_2 INTEGER IDELETION INTEGER INSPOINTER_LOCAL, + ITEMP_NO1(MAXTRACE_LOC),ITEMP_NO2(MAXTRACE_LOC), + ENDMARK,INDELMARK, + I,J,K,M,ILAS,JLAS,LDEL_DIREC, + LDEL,LEM,IFIR,JFIR,IDAL, + IDSAL,NDEL,ICLASS,LEN1,LENOCC,LINELEN,IBLOCKLEN, + ISTART,IPOS,JPOS,NLINE,IBEG,IEND, + LINETHICK,LEN_INSSEQ CHARACTER CTEMP_CHAIN1(MAXTRACE_LOC), + CTEMP_CHAIN2(MAXTRACE_LOC), + CTEMP*20000, + LINE(4)*(MAXALSQ) CHARACTER*4 PID_1,PID_2 REAL MAXDEVIATION, + MAX1,MAX2,SUM,SIM,CVAL,CMAXVAL,SELFSIM, + OPENWEIGHT,ELONGWEIGHT,W,PER,HOM, + DISTANCE,RMS LOGICAL LERROR,LKINK,LCALPHA,LDBG_LOCAL C CHARACTER*200 ERRORFILE C==================================================================== C init c line(2)=' ' ; line(3)=' ' c do i=1,maxalsq ; al_agree(i)=' ' ; sal_agree(i)=' ' ; enddo C BR 99.09: to write out some dbg msg LDBG_LOCAL= .FALSE. C LDBG_LOCAL= .TRUE. OPENWEIGHT= 0.0 ELONGWEIGHT= 0.0 C ENDMARK : '<' C INDELMARK: '.' ENDMARK= 999 INDELMARK= 99 M= 0 NTEST= NTEST+1 ILAS= II-1 JLAS= JJ-1 SIM= 0.0 CVAL= 0.0 CMAXVAL= 0.0 DISTANCE= 0.0 IINS= 0 INSPOINTER_LOCAL=1 INSSEQ= ' ' C LEN_NAME= LEN(NAME_2) C LEN_COMPND= LEN(COMPND_2) C LEN_ACCESSION= LEN(ACCESSION_2) C LEN_PDBREF= LEN(PDBREF_2) C LEN_LINE= LEN( LINE(1) ) LEN_INSSEQ= LEN(INSSEQ) c check subscripts IF (II .LE. 0 .OR. JJ .LE. 0) THEN WRITE(6,*)' FATAL ERROR IN TRACE' WRITE(6,*)' SUBSCRIPT OF II or JJ OUT OF RANGE',II,JJ STOP ENDIF C===================== TRACE BACK =================================== C alignment via loop back to 100 100 IF ((II .GT. IPOSBEG) .AND. (JJ .GT. JPOSBEG)) THEN LDEL_DIREC =ABS( LH2(II,JJ) ) c if (.not. lbackward) then c call get_ldirec_fast(nd1,nd2,lh2,ii,jj,ldel_direc) c else c call get_ldirec(nd1,nd2,lh2,ii,jj,ldel_direc) c endif c======================================================================= C diagonal: LIKE H(II,JJ) AND SEQ(II-1,1) SEQ(JJ-1,1) C======================================================================= IF (LH2(II,JJ) .EQ. -1) THEN RETURN ELSE IF (LDEL_DIREC .EQ. 1 ) THEN M=M+1 II=II-1 JJ=JJ-1 CALL CHECKRANGE(II,1,ND1-1,'subscr II','TRACE') CALL CHECKRANGE(JJ,1,ND2-1,'subscr JJ','TRACE') SELFSIM = 0.0 c-------------------------------------------------------------------- c no profile c selfsim is match with master sequence c-------------------------------------------------------------------- IF (PROFILEMODE .LT. 1) THEN SIM = METRIC_1(II,LSQ_2(JJ)) SELFSIM = METRIC_1(II,LSQ_1(II)) c-------------------------------------------------------------------- c profile 1 c selfsim is match with best possible match at position ii c-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 1) THEN SIM = METRIC_1(II,LSQ_2(JJ)) SELFSIM=-9999 DO K=1,NTRANS IF ( METRIC_1(II,K) .GT. SELFSIM ) THEN SELFSIM = METRIC_1(II,K) ENDIF ENDDO c-------------------------------------------------------------------- c profile 2 c-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 2) THEN SIM = METRIC_2(JJ,LSQ_1(II)) SELFSIM = METRIC_2(JJ,LSQ_2(JJ)) C-------------------------------------------------------------------- C full profile alignment C selfsim: sum ( (metric_1(i,k) * metric_1(i,k))+ C (metric_2(j,k) * metric_2(j,k) ) /2 ) C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 3) THEN SUM=0.0 SELFSIM=0.0 DO K=1,NTRANS SUM = SUM + ( METRIC_1(II,K) * METRIC_2(JJ,K) ) SELFSIM= SELFSIM + (METRIC_1(II,K) * METRIC_1(II,K)) WRITE(6,*)K,SUM,METRIC_1(II,K),METRIC_2(JJ,K), + METRIC_1(II,K) * METRIC_2(JJ,K) ENDDO c sim = (sum/ntrans) SIM = SUM c WRITE(6,*)sim,selfsim,metric_1(ii,1),metric_2(jj,1) C-------------------------------------------------------------------- C take sequences as representatives of family C selfsim: factor * (metric_1(i,lsq_1(i)) + metric_2(j,lsq_2(i))/2) C-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 4) THEN SIM=(METRIC_1(II,LSQ_2(JJ))+METRIC_2( JJ,LSQ_1(II)) )*0.5 SELFSIM =METRIC_1( II,LSQ_1(II)) c-------------------------------------------------------------------- c take maximal value as consensus c selfsim ??? c-------------------------------------------------------------------- ELSE IF (PROFILEMODE .EQ. 5) THEN MAX1=-10000.0 DO K=1,NTRANS IF (METRIC_1(II,K) .GT. MAX1)MAX1=METRIC_1(II,K) ENDDO MAX2=-10000.0 DO K=1,NTRANS IF (METRIC_2(JJ,K) .GT. MAX2)MAX2=METRIC_2(JJ,K) ENDDO SIM = ( (MAX1 + MAX2) * 0.5 ) SELFSIM=MAX(MAX1,MAX2) ELSE IF (PROFILEMODE .EQ. 6) THEN SIM = SIMORG(LSQ_1(II),LSQ_2(JJ),LSTRCLASS_1(II), + LACC_1(II),LSTRCLASS_2(JJ),LACC_2(JJ) ) ENDIF CVAL = CVAL + SIM CMAXVAL = CMAXVAL + SELFSIM LAL_1(M)=LSQ_1(II) LSAL_1(M)=LSTRCLASS_1(II) LAL_2(M)=LSQ_2(JJ) LSAL_2(M)=LSTRCLASS_2(JJ) ITEMP_NO1(M)=PDBNO_1(II) ITEMP_NO2(M)=PDBNO_2(JJ) CTEMP_CHAIN1(M)=CHAINID_1(II) CTEMP_CHAIN2(M)=CHAINID_2(JJ) ITRACE(M)=II JTRACE(M)=JJ GOTO 100 c======================================================================= C horizontal deletion C======================================================================= ELSE IF (LDEL_DIREC .GT. 20000) THEN LDEL=LDEL_DIREC - 20000 IF (PROFILEMODE .LE. 1) THEN OPENWEIGHT = OPEN_GAP_1(II-LDEL) ELSE IF (PROFILEMODE .EQ. 2) THEN OPENWEIGHT = OPEN_GAP_2(JJ-1) ELSE IF (PROFILEMODE .EQ. 6) THEN OPENWEIGHT = OPEN_GAP_1(II-LDEL) ELSE IF (PROFILEMODE .GE. 3) THEN OPENWEIGHT=(OPEN_GAP_1(II-LDEL) + OPEN_GAP_2(JJ-1)) * 0.5 ENDIF W = OPENWEIGHT DO I=II-LDEL+1,II-1 IF (PROFILEMODE .LE. 1) THEN ELONGWEIGHT = ELONG_GAP_1(I) ELSE IF (PROFILEMODE .EQ. 2) THEN ELONGWEIGHT = ELONG_GAP_2(JJ-1) ELSE IF (PROFILEMODE .EQ. 6) THEN ELONGWEIGHT = ELONG_GAP_1(I) ELSE IF (PROFILEMODE .GE. 3) THEN ELONGWEIGHT=(ELONG_GAP_1(I) + ELONG_GAP_2(JJ-1)) * 0.5 ENDIF W = W + ELONGWEIGHT ENDDO CVAL = CVAL - W DO K=1,LDEL M=M+1 ITRACE(M)=II-K JTRACE(M)=JJ-1 LAL_1(M) = LSQ_1(II-K) LAL_2(M) =INDELMARK LSAL_1(M)= LSTRCLASS_1(II-K) LSAL_2(M)=INDELMARK ITEMP_NO1(M)=PDBNO_1(II-K) ITEMP_NO2(M)=0 CTEMP_CHAIN1(M)=CHAINID_1(II-K) CTEMP_CHAIN2(M)=' ' ENDDO II=II-LDEL GOTO 100 C======================================================================= C VERTICAL DELETION C======================================================================= ELSE IF (LDEL_DIREC .GT. 10000) THEN LDEL=LDEL_DIREC - 10000 IF (PROFILEMODE .LE. 1) THEN OPENWEIGHT = OPEN_GAP_1(II-1) ELSE IF (PROFILEMODE .EQ. 2) THEN OPENWEIGHT = OPEN_GAP_2(JJ-LDEL) ELSE IF (PROFILEMODE .EQ. 6) THEN OPENWEIGHT = OPEN_GAP_1(II-1) ELSE IF (PROFILEMODE .GE. 3) THEN OPENWEIGHT=(OPEN_GAP_1(II-1) + OPEN_GAP_2(JJ-LDEL)) * 0.5 ENDIF W = OPENWEIGHT DO J=JJ-LDEL+1,JJ-1 IF (PROFILEMODE .LE. 1) THEN ELONGWEIGHT = ELONG_GAP_1(II-1) ELSE IF (PROFILEMODE .EQ. 2) THEN ELONGWEIGHT = ELONG_GAP_2(J) ELSE IF (PROFILEMODE .EQ. 6) THEN ELONGWEIGHT = ELONG_GAP_1(II-1) ELSE IF (PROFILEMODE .GE. 3) THEN ELONGWEIGHT=(ELONG_GAP_1(II-1) +ELONG_GAP_2(J)) * 0.5 ENDIF W = W + ELONGWEIGHT ENDDO CVAL = CVAL - W C store insertions of seq2: C iins: insertion counter C inslen: length of insertion C insbeg_1: DSSP position of insertion (last matched residue) C insbeg_1: position of the insertion in the alignend sequence C inspointer_local: pointer in the one-dim array for insertions C *aVGHYTREe: * is divider between different insertions C lower case characters are the residues before and C after the insertions IF (IINS+1 .LT. MAXINS) THEN IINS=IINS+1 INSLEN_LOCAL(IINS)=LDEL INSBEG_1_LOCAL(IINS)=II-1 INSBEG_2_LOCAL(IINS)=JJ-LDEL K=INSPOINTER_LOCAL IF (K+LDEL+3 .GT. LEN_INSSEQ) THEN WRITE(6,*)' ERROR: MAXINSBUFFER_LOCAL OVERFLOW: ' WRITE(6,*)' increase: ',len_insseq STOP ENDIF INSSEQ(K:K)='*' INSSEQ(K+1:K+1)=CSQ_2(JJ-LDEL-1:JJ-LDEL-1) INSSEQ(K+2:K+LDEL+2)=CSQ_2(JJ-LDEL:JJ-1) INSSEQ(K+LDEL+2:K+LDEL+2)=CSQ_2(JJ:JJ) CALL UPTOLOW(INSSEQ(K+1:K+1),1) CALL UPTOLOW(INSSEQ(K+LDEL+2:K+LDEL+2),1) INSPOINTER_LOCAL=INSPOINTER_LOCAL+LDEL+3 ELSE WRITE(6,*)' WARNING: maxins overflow: ',maxins WRITE(6,*)' insertion ingnored in HSSP-output' ENDIF DO K=1,LDEL M=M+1 JTRACE(M)= JJ-K ITRACE(M)=II-1 LAL_1(M) = INDELMARK LAL_2(M) = LSQ_2 (JJ-K) LSAL_1(M)= INDELMARK LSAL_2(M)= LSTRCLASS_2(JJ-K) ITEMP_NO1(M)=0 ITEMP_NO2(M)=PDBNO_2(JJ-K) CTEMP_CHAIN1(M)=' ' CTEMP_CHAIN2(M)=CHAINID_2(JJ-K) ENDDO JJ=JJ-LDEL GOTO 100 c======================================================================= C unmatched terminal sequence C======================================================================= Caution if you change this: decrease/increase ISTART/ISTOP in SETPIECES CP unnecessary complication CP do not add < CP do not have length+1 CP do not replot last point C ENDMARK is '<') C======================================================================= ELSE IF (LDEL_DIREC .EQ. 0 ) THEN M=M+1 LAL_1(M) =ENDMARK LAL_2(M) =ENDMARK LSAL_1(M)=ENDMARK LSAL_2(M)=ENDMARK ITEMP_NO1(M)=0 ITEMP_NO2(M)=0 c replot last point ITRACE(M)=ITRACE(M-1) JTRACE(M)=JTRACE(M-1) ELSE WRITE(6,*)' FATAL ERROR IN TRACE' WRITE(6,*)' LDEL_DIREC NOT KNOWN',LDEL_DIREC,ii,jj STOP ENDIF ENDIF c======================================================================= c end of trace back c======================================================================= CVAL = CVAL CMAXVAL = CMAXVAL C======================================================================= C aligned optimum subsequences of length M are in integer array C LAL_1(I) and LAL_2(J) C convert back to characters CALL INT_TO_SEQ(LAL_1,AL_1_ARRAY,M,TRANS,INDELMARK,ENDMARK) CALL INT_TO_SEQ(LAL_2,AL_2_ARRAY,M,TRANS,INDELMARK,ENDMARK) CALL INT_TO_STRCLASS(MAXSTRSTATES,MAXALSQ,M,LSAL_1, + STR_CLASSES,INDELMARK,ENDMARK,SAL_1_ARRAY) CALL INT_TO_STRCLASS(MAXSTRSTATES,MAXALSQ,M,LSAL_2, + STR_CLASSES,INDELMARK,ENDMARK,SAL_2_ARRAY) C process alignments C for terminal '<' M=LEM+1 IF (LAL_1(M) .EQ. ENDMARK) THEN LEM=M-1 ELSE LEM=M ENDIF IFIR=ITRACE(LEM) JFIR=JTRACE(LEM) CALL CHECKRANGE(LEM,1,MAXALSQ,'alilen LEM','TRACE') C======================================================================= C evaluate the alignments. C======================================================================= IDAL=0 IDSAL=0 IDELETION=0 NDEL=0 c count number of deletions/insertions c only if there is a '.' and the next character is no '.' DO K=1,M IF (K .LT. M) THEN IF (LAL_1(K) .EQ. INDELMARK) THEN IDELETION=IDELETION+1 IF (LAL_1(K+1) .NE. INDELMARK ) THEN NDEL=NDEL+1 ENDIF ENDIF IF (LAL_2(K) .EQ. INDELMARK) THEN IDELETION=IDELETION+1 IF (LAL_2(K+1) .NE. INDELMARK) THEN NDEL=NDEL+1 ENDIF ENDIF ENDIF IF (LAL_1(K) .EQ. LAL_2(K) .AND. LAL_1(K) .NE. ENDMARK) THEN IDAL=IDAL+1 AL_AGREE(K)= '*' ELSE AL_AGREE(K)= ' ' ENDIF C translate to three states H,E,L IF (LDSSP_1 .AND. LDSSP_2 ) THEN CALL STRUC_CLASS(MAXSTRSTATES,STR_CLASSES, + SAL_1_ARRAY(K),SK_1,ICLASS) CALL STRUC_CLASS(MAXSTRSTATES,STR_CLASSES, + SAL_2_ARRAY(K),SK_2,ICLASS) IF (SK_1 .EQ. SK_2 ) THEN IDSAL=IDSAL+1 SAL_AGREE(K)='+' ELSE SAL_AGREE(K)=' ' ENDIF ENDIF ENDDO C======================================================================= C LEN1 : is ILAS-IFIR+1 C LENOCC: is occupied postions (no INSDEL, used in HSSP) C LEM : length in SEQuence 1 including gaps C HOM : is identical postion / LENOCC C======================================================================= LEN1=ILAS-IFIR+1 LENOCC=0 DO I=1,ILAS-IFIR+1 IF (LAL_2(I) .NE. INDELMARK )LENOCC=LENOCC+1 ENDDO PER=VALUE/LENOCC cx per=value/lem HOM=FLOAT(IDAL)/FLOAT(LENOCC) IF (CMAXVAL .GT. -0.00001 .AND. CMAXVAL .LT. 0.00001) THEN SIM=0.0 ELSE SIM=(CVAL/CMAXVAL) ENDIF c WRITE(6,*)'trace ',cval,cmaxval C======================================================================= C test if threshold criterion is fulfilled (if specified) C======================================================================= LCONSIDER=.TRUE. IF (LTHRESHOLD .OR. LALL) THEN IF (LNEWCURVE) THEN CALL CHECKHSSPCUT99(LENOCC,HOM*100.0,ISOLEN,ISOIDE,NSTEP, + LFORMULA,LALL,ISAFE,LCONSIDER,DISTANCE) ELSE CALL CHECKHSSPCUT(LENOCC,HOM*100.0,ISOLEN,ISOIDE,NSTEP, + LFORMULA,LALL,ISAFE,LCONSIDER,DISTANCE) ENDIF ENDIF IF (CUTVALUE1 .GT. 0.0) THEN IF (VALUE .LT. (CMAXVAL/CUTVALUE1) ) LCONSIDER=.FALSE. ENDIF IF (CUTVALUE2 .GT. 0.0) THEN IF (VALUE .LT. CUTVALUE2 ) LCONSIDER=.FALSE. ENDIF C BR 99.09: write out debug IF (LDBG_LOCAL) THEN IF (LCONSIDER) THEN WRITE(6,'(A,I5,A)')' trace: nprot=',IALIGN+1,' take!' ELSE WRITE(6,'(A,I5,A)')' trace: nprot=',IALIGN+1,' reject!' ENDIF END IF C======================================================================= C compare 3D-structures of alignend fragments C======================================================================= LCALPHA=.TRUE. RMS=-1.0 IF (LCOMPSTR .AND. LCONSIDER) THEN IF (LDSSP_1 .AND. LDSSP_2 ) THEN CALL GETPIDCODE(NAME_1,PID_1) CALL FINDBRKFILE(BRKFILE_1,PDBPATH,PID_1,KBRK,KLOG,LERROR) IF (.NOT.LERROR) THEN CALL GETPIDCODE(NAME_2,PID_2) CALL FINDBRKFILE(BRKFILE_2,PDBPATH,PID_2,KBRK,KLOG, + LERROR) IF (.NOT.LERROR) THEN I=1 DO K=M,1,-1 ALI_1(I)=AL_1_ARRAY(K) ALI_2(I)=AL_2_ARRAY(K) I=I+1 ENDDO CALL ALITOSTRUCRMS(MAXALSQ,MAXSQ,BRKFILE_1,BRKFILE_2, + KBRK,PDBNO_1,CHAINID_1,PDBNO_2,CHAINID_2, + ALI_1,ALI_2,M,IFIR,ILAS,JFIR, + JLAS,LCALPHA,RMS) ENDIF ENDIF ENDIF ENDIF C=================================================================== C PRINT ALIGNED SEQS AND HOMOLGY VALUES.. C=================================================================== c if (ntest.eq.1) then c WRITE(6,*)'No IFIR ILAS JFIR JLAS NPOS NDEL '// c + 'VAL VPER NIDE IDE SIM RMS DIST' c endif c WRITE(6,1016)ntest,ifir,ilas,jfir,jlas,lenocc,ideletion, c + value,per,idal,hom,sim,rms,distance c1016 format(I4,2(1X,I4,'-',I4),2(I5),F7.2,F6.2,I5,1X,3(F6.1),F6.2) C======================================================================= C check value from setmatrix and recalculated value from trace back C======================================================================= LERROR=.FALSE. MAXDEVIATION=0.3 IF (ABS(CVAL-VALUE) .GT. MAXDEVIATION) LERROR=.TRUE. IF (LERROR) THEN WRITE(6,*)' *** FATAL ERROR IN TRACE ****' WRITE(6,*)' CVAL .NE. VALUE : ',CVAL,VALUE WRITE(6,*)' WRITE MATRIX AND TRACE BACK IN ??_MATRIX.ERROR' c$$$ call getpidcode(name_1,pid_1) c$$$ call concat_strings(pid_1,'_MATRIX.ERROR',errorfile) c$$$ call open_file(99,errorfile,'NEW,RECL=2000',lerror) c$$$ write(99,'(a,f12.5)')' CVAL : ',CVAL c$$$ write(99,'(a,f12.5)')' VALUE : ',VALUE c$$$C debug: output the LH (values and trace-back)matrix c$$$ write(99,*) 'H-MATRIX Hij' c$$$ write(99,*)'Index i runs for Sequence 1' c$$$ write(99,*)'Index j runs for Sequence 2' c$$$ do i=2,nd1 c$$$ write(99,'(i6)')i-1 c$$$ write(99,'(2x,20(f7.1))')(lh1(i,j),j=2,nd2) c$$$ enddo c$$$ write(99,*) ; write(99,*)'TRACE-BACK MATRIX' c$$$ do i=2,nd1 c$$$ write(99,'(i6)')i-1 c$$$ write(99,'(2x,20(f7.1))')(lh2(i,j),j=2,nd2) c$$$ enddo c$$$ close(99) STOP ENDIF IF (IALIGN+1 .GT. MAXALIGNS) THEN WRITE(6,*)'*** OVERFLOW, ALIGNMENTS TERMINATED ***' LALIOVERFLOW=.TRUE. RETURN ENDIF IALIGN=IALIGN+1 c alignments will be sorted according to this value IF (CSORTMODE .EQ. 'DISTANCE' ) THEN ALISORTKEY(IALIGN)=DISTANCE ELSE IF (CSORTMODE.EQ.'VALUE' .OR. CSORTMODE .EQ. 'ZSCORE') THEN ALISORTKEY(IALIGN)=VALUE ELSE IF (CSORTMODE .EQ. 'WSIM' ) THEN ALISORTKEY(IALIGN)=SIM ELSE IF (CSORTMODE .EQ. 'SIM' ) THEN ALISORTKEY(IALIGN)=SIM ELSE IF (CSORTMODE .EQ. 'SIGMA' ) THEN ALISORTKEY(IALIGN)=VALUE/SDEV ELSE IF (CSORTMODE .EQ. 'IDENTITY' ) THEN ALISORTKEY(IALIGN)=HOM ELSE IF (CSORTMODE .EQ. 'VALPER' ) THEN ALISORTKEY(IALIGN)=PER ELSE IF (CSORTMODE .EQ. 'VALFORM' ) THEN ALISORTKEY(IALIGN)=VALUE*(LENOCC**(-0.56158)) ELSE IF (CSORTMODE .EQ. 'NO' ) THEN ALISORTKEY(IALIGN)=FLOAT(MAXALIGNS - IALIGN) ENDIF C====================================================================== C STORE ALIGNMENTS IN FILE C====================================================================== IFILEPOI(IALIGN)=-999 IRECPOI(IALIGN)=-999 IF (LCONSIDER) THEN IFILEPOI(IALIGN)=ISET NRECORD=NRECORD+1 IRECPOI(IALIGN)=NRECORD IALIGN_GOOD=IALIGN_GOOD+1 WRITE(KCORE,REC=NRECORD)'*',LCONSIDER,VALUE C====================================================================== C WRITE AL_2_ARRAY(.) AND SAL_2_ARRAY(.) C mark insertions in SEQuence 1 by lower case letters in AL_2_ARRAY(*) DO K=M-1,2,-1 IF (LAL_1(K) .EQ. INDELMARK) THEN IF (LAL_1(K-1) .NE. INDELMARK) THEN CALL UPTOLOW(AL_2_ARRAY(K-1),1) ENDIF IF (LAL_1(K+1) .NE. INDELMARK) THEN CALL UPTOLOW(AL_2_ARRAY(K+1),1) ENDIF ENDIF ENDDO IPOS=1 DO I=M,1,-1 IF ( (LAL_1(I) .NE. INDELMARK) .AND. + (LAL_1(I) .NE. ENDMARK) ) THEN LINE(2)(IPOS:IPOS)=AL_2_ARRAY(I) IF (LDSSP_2) THEN LINE(3)(IPOS:IPOS)=SAL_2_ARRAY(I) WRITE(LINE(4)(IPOS:IPOS),'(I1)')LACC_2(I) ENDIF IPOS=IPOS+1 ENDIF ENDDO C====================================================================== NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)NAME_2 NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)COMPND_2 NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)ACCESSION_2,PDBREF_2,LDSSP_2 NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)IFIR,LEN1,LENOCC,JFIR,JLAS, + N2IN,IDELETION,NDEL,NSHIFTED,RMS,HOM, + SIM,SDEV,DISTANCE,IINS C store alignment IF (MOD(FLOAT(LEN1),FLOAT(MAXRECORDLEN)).EQ. 0.0) THEN NLINE= LEN1/MAXRECORDLEN ELSE NLINE=(LEN1/MAXRECORDLEN ) +1 ENDIF IBEG=1 IEND=MAXRECORDLEN DO I=1,NLINE NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)LINE(2)(IBEG:IEND) IF (LDSSP_2) THEN NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)LINE(3)(IBEG:IEND) NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)LINE(4)(IBEG:IEND) ENDIF IBEG=IEND+1 IEND=IEND+MAXRECORDLEN ENDDO C store insertions IF (IINS .GT. 0) THEN DO I=1,IINS NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)INSLEN_LOCAL(I), + INSBEG_1_LOCAL(I),INSBEG_2_LOCAL(I) ENDDO IF (MOD(FLOAT(INSPOINTER_LOCAL),FLOAT(MAXRECORDLEN)) .EQ. + 0.0) THEN NLINE= INSPOINTER_LOCAL/MAXRECORDLEN ELSE NLINE=(INSPOINTER_LOCAL/MAXRECORDLEN ) +1 ENDIF IBEG=1 IEND=MAXRECORDLEN DO I=1,NLINE NRECORD=NRECORD+1 WRITE(KCORE,REC=NRECORD)INSSEQ(IBEG:IEND) IBEG=IEND+1 IEND=IEND+MAXRECORDLEN ENDDO ENDIF C===================================================================== C unmark insertions in SEQuence 1 by lower case letters in AL_2_ARRAY(*) c===================================================================== DO I=1,M CALL LOWTOUP(AL_2_ARRAY(I),1) ENDDO C===================================================================== C write long output file C===================================================================== IF (LONG_OUT) THEN WRITE(KLONG,*)' No IFIR ILAS JFIR JLAS NPOS NDEL '// + 'VAL VPER NIDE IDE SIM RMS '// + 'DIST ACCESSION NAME' WRITE(KLONG,1017)NTEST,IFIR,ILAS,JFIR,JLAS,LENOCC, + IDELETION,VALUE,PER,IDAL,HOM,SIM,RMS,DISTANCE, + ACCESSION_2,NAME_2(1:50) 1017 FORMAT(I4,2(1X,I4,'-',I4),2(I5),2(F7.2),I5,1X,4(F6.2),A,A) JPOS=M ISTART=1 CTEMP=' ' J=ISTART DO K=JPOS,1,-1 WRITE(CTEMP(J:J),'(A)')AL_AGREE(K) J=J+1 ENDDO WRITE(KLONG,'(A)')CTEMP(1:J) CTEMP=' ' J=ISTART DO K=JPOS,1,-1 WRITE(CTEMP(J:J),'(A)')AL_1_ARRAY(K) J=J+1 ENDDO WRITE(KLONG,'(A)')CTEMP(1:J) IF (SAL_1_ARRAY(1) .NE. 'U') THEN CTEMP=' ' J=ISTART DO K=JPOS,1,-1 WRITE(CTEMP(J:J),'(A)')SAL_1_ARRAY(K) J=J+1 ENDDO WRITE(KLONG,'(A)')CTEMP(1:J) ENDIF CTEMP=' ' J=ISTART DO K=JPOS,1,-1 WRITE(CTEMP(J:J),'(A)')AL_2_ARRAY(K) J=J+1 ENDDO WRITE(KLONG,'(A)')CTEMP(1:J) IF (SAL_2_ARRAY(1).NE.'U') THEN CTEMP=' ' J=ISTART DO K=JPOS,1,-1 WRITE(CTEMP(J:J),'(A)')SAL_2_ARRAY(K) J=J+1 ENDDO WRITE(KLONG,'(A)')CTEMP(1:J) CTEMP=' ' J=ISTART DO K=JPOS,1,-1 WRITE(CTEMP(J:J),'(A)')SAL_AGREE(K) J=J+1 ENDDO WRITE(KLONG,'(A)')CTEMP(1:J) ENDIF WRITE(KLONG,*)' ' J=ISTART DO K=JPOS,1,-1 WRITE(KLONG,'(I6,A,2X,I6,A)') + ITEMP_NO1(K),CTEMP_CHAIN1(K), + ITEMP_NO2(K),CTEMP_CHAIN2(K) C J=J+1 ENDDO c jpos=m ; ipos=m-100+1 ; linelen=100 ; iblocklen=11 c istart=1 c do while( jpos .ge. 1) c ipos=max(ipos,1) ; ctemp=' ' ; j=istart c do k=jpos,ipos,-1 ; if (mod(j,iblocklen) .eq. 0)j=j+1 c write(ctemp(j:j),'(a)')al_agree(k) ; j=j+1 c ENDDO; write(klong,'(a)')ctemp(1:j) c ctemp=' ' ; j=istart c do k=jpos,ipos,-1 ; if (mod(j,iblocklen) .eq. 0)j=j+1 c write(ctemp(j:j),'(a)')al_1_array(k) ; j=j+1 c ENDDO; write(klong,'(a)')ctemp(1:j) c if (sal_1_array(1) .ne. 'U') then ; ctemp=' ';j=istart c do k=jpos,ipos,-1 ; if (mod(j,iblocklen) .eq. 0)j=j+1 c write(ctemp(j:j),'(a)')sal_1_array(k) ; j=j+1 c ENDDO; write(klong,'(a)')ctemp(1:j) c endif c ctemp=' ' ; j=istart c do k=jpos,ipos,-1 ; if (mod(j,iblocklen) .eq. 0)j=j+1 c write(ctemp(j:j),'(a)')al_2_array(k) ; j=j+1 c ENDDO; write(klong,'(a)')ctemp(1:j) c if (sal_2_array(1).ne.'U') then ; ctemp=' ' ; j=istart c do k=jpos,ipos,-1 ; if (mod(j,iblocklen).eq.0)j=j+1 c write(ctemp(j:j),'(a)')sal_2_array(k) ; j=j+1 c ENDDO; write(klong,'(a)')ctemp(1:j) c ctemp=' ' ; j=istart c do k=jpos,ipos,-1 ; if (mod(j,iblocklen) .eq. 0)j=j+1 c write(ctemp(j:j),'(a)')sal_agree(k) ; j=j+1 c ENDDO; write(klong,'(a)')ctemp(1:j) c endif c write(klong,*)' ' c jpos=jpos-linelen ; ipos=ipos-linelen c enddo ENDIF C===================================================================== C output to PLOT file TRACE.X C===================================================================== IF (LTRACEOUT) THEN CALL OPEN_FILE(KPLOT,PLOTFILE,'UNKNOWN,APPEND',LERROR) CALL PUTHEADER(KPLOT,CSQ_1,CSQ_2,STRUC_1,STRUC_2,ND1-1, + ND2-1,NAME_1,NAME_2) WRITE(KPLOT,'(1X,I3,A)')NTEST,' TRACE' C if lall linethickness is value/residue; else its the distance from the C chosen threshold IF (LALL) THEN LINETHICK=NINT(PER) ELSE LINETHICK=NINT(DISTANCE) ENDIF WRITE(KPLOT,'(3(I4))')ITRACE(1),JTRACE(1),LINETHICK C output only straight line end points C so, plot beginning, end, and kink points C kink if there is a discontinuity in either I or J increments DO K=2,M-1 LKINK=ABS(ITRACE(K)-ITRACE(K+1)) .NE. + ABS(ITRACE(K)-ITRACE(K-1)) .OR. + ABS(JTRACE(K)-JTRACE(K+1)) .NE. + ABS(JTRACE(K)-JTRACE(K-1)) IF (LKINK) THEN WRITE(KPLOT,'(3(I4))')ITRACE(K),JTRACE(K),LINETHICK ENDIF ENDDO WRITE(KPLOT,'(3(I4))')ITRACE(M),JTRACE(M),LINETHICK C DEFINES END OF TRACE IN TRACE-HOMOLOGY ITRACE(M+1)=0 JTRACE(M+1)=0 WRITE(KPLOT,'(3(I4))')ITRACE(M+1),JTRACE(M+1),LINETHICK CLOSE(KPLOT) ENDIF ENDIF C end if lconsider C======================================================================= RETURN END C END TRACE C...................................................................... C...................................................................... C SUB WRITE_ALB SUBROUTINE WRITE_ALB(KOUT,OUTFILE,SEQ,NBLOCKS,HEADERLINE, 1 NAMELABEL,SEQSTART,SEQSTOP,CHBPOS,NBREAKS,ERROR) IMPLICIT NONE C IMPORT INTEGER KOUT INTEGER NBLOCKS INTEGER SEQSTART,SEQSTOP INTEGER NBREAKS,CHBPOS(*) CHARACTER*(*) SEQ CHARACTER*(*) HEADERLINE,NAMELABEL,OUTFILE C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE PARAMETER (BLOCKSIZE= 10) INTEGER ISTART, ISTOP, FIRSTPOS, LASTPOS, ISEQPOS INTEGER BEGIN, END, LENGTH, ICHAIN CHARACTER*(250) OUTLINE, ALBHEADLINE LOGICAL NOCHAINBREAKS ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KOUT,OUTFILE,'new,recl=250',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN NOCHAINBREAKS = .TRUE. LENGTH = SEQSTOP-SEQSTART+1 C make up standard alb headerline C 1GD1 .... C 1336 C ^ length in col 200 WRITE(ALBHEADLINE,'(I4)') LENGTH - NBREAKS CALL RIGHTADJUST(ALBHEADLINE(1:200),1,200) CALL STRPOS(NAMELABEL,ISTART,ISTOP) ALBHEADLINE(1:ISTOP-ISTART+2) = ' ' // 1 NAMELABEL(MAX(ISTART,1):MAX(1,ISTOP)) C WRITE SEQUENCE ISEQPOS = 0 DO ICHAIN = 1,NBREAKS+1 IF ( ICHAIN .EQ. 1 ) THEN FIRSTPOS = SEQSTART ELSE FIRSTPOS = CHBPOS(ICHAIN-1) + 1 WRITE(KOUT,'(A)') '=' ENDIF IF ( ICHAIN .EQ. NBREAKS+1 ) THEN LASTPOS = SEQSTOP ELSE LASTPOS = CHBPOS(ICHAIN) - 1 ENDIF CALL STRPOS(ALBHEADLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') ALBHEADLINE(1:MAX(1,ISTOP)) WRITE(OUTLINE,'(2I4)') LASTPOS-FIRSTPOS+1, ISEQPOS CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) C BEGIN = FIRSTPOS BEGIN = SEQSTART C "REPEAT UNTIL" 1 CONTINUE C WRITESEQLINE RETURNS END CALL WRITESEQLINE(SEQ,BEGIN,BLOCKSIZE,NBLOCKS,SEQSTOP, 1 NOCHAINBREAKS,OUTLINE,END,ERROR) C CALL WRITESEQLINE(SEQ,BEGIN,BLOCKSIZE,NBLOCKS,LASTPOS, C 1 NOCHAINBREAKS,OUTLINE,END,ERROR) IF ( ERROR ) STOP CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) BEGIN = END + 1 C IF ( BEGIN .LE. LASTPOS ) GOTO 1 IF ( BEGIN .LE. SEQSTOP ) GOTO 1 C END "REPEAT UNTIL" CALL STRPOS(HEADERLINE,ISTART,ISTOP) WRITE(KOUT,'(A,A)') ' ',HEADERLINE(1:MAX(1,ISTOP)) WRITE(KOUT,'(A)') '=' ISEQPOS = ISEQPOS + LASTPOS-FIRSTPOS+1 ENDDO CLOSE(KOUT) RETURN END C END WRITE_ALB C...................................................................... C...................................................................... C SUB WRITE_EMBL SUBROUTINE WRITE_EMBL(KOUT,SEQ,NBLOCKS,INFILE,OUTFILE, 1 HEADERLINE,SEQSTART,SEQSTOP,ERROR) IMPLICIT NONE C IMPORT INTEGER KOUT INTEGER NBLOCKS INTEGER SEQSTART,SEQSTOP CHARACTER*(*) SEQ CHARACTER*(*) HEADERLINE,INFILE,OUTFILE C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE PARAMETER (BLOCKSIZE= 10) INTEGER ISTART, ISTOP INTEGER BEGIN, END, LENGTH CHARACTER*(250) OUTLINE LOGICAL NOCHAINBREAKS *----------------------------------------------------------------------* ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KOUT,OUTFILE,'new',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN NOCHAINBREAKS = .FALSE. LENGTH = SEQSTOP-SEQSTART+1 OUTLINE = 'ID X' CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(MAX(ISTART,1):MAX(1,ISTOP)) C BEGIN AND END CALL STRPOS(INFILE,ISTART,ISTOP) WRITE(OUTLINE,'(A,A,1X,A,I4,1X,A,I4)') 1 'DE ',infile(max(istart,1):max(1,istop)),'from: ', 2 seqstart,'to: ',seqstop CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(MAX(ISTART,1):MAX(1,ISTOP)) C copy passed headerline ( mark it with "DE" - not necessary (?)) CALL STRPOS(HEADERLINE,ISTART,ISTOP) OUTLINE = 'DE ' // HEADERLINE(1:MAX(1,ISTOP)) CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(MAX(ISTART,1):MAX(1,ISTOP)) C make up standard embl headerline C SQ SEQUENCE 344 AA; write(outline,'(a,i6,a)') 'SQ SEQUENCE',length,' AA;' CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) C write sequence BEGIN = SEQSTART C "repeat until" 1 CONTINUE C writeseqline returns end CALL WRITESEQLINE(SEQ,BEGIN,BLOCKSIZE,NBLOCKS,SEQSTOP, 1 NOCHAINBREAKS,OUTLINE,END,ERROR) IF ( ERROR ) STOP CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) BEGIN = END + 1 IF ( BEGIN .LE. SEQSTOP ) GOTO 1 C end "repeat until" C standard end marker WRITE(KOUT,'(A)') '//' CLOSE(KOUT) RETURN END C END WRITE_EMBL C...................................................................... C...................................................................... C SUB WRITE_GCG SUBROUTINE WRITE_GCG(KOUT,SEQ,NBLOCKS,NBREAKS,INFILE,OUTFILE, 1 HEADERLINE,SEQSTART,SEQSTOP,ERROR) IMPLICIT NONE C IMPORT INTEGER KOUT INTEGER NBLOCKS, NBREAKS INTEGER SEQSTART,SEQSTOP CHARACTER*(*) SEQ CHARACTER*(*) HEADERLINE,INFILE, OUTFILE C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE PARAMETER (BLOCKSIZE= 10) INTEGER ISTART, ISTOP INTEGER BEGIN, END INTEGER CHECK, LENGTH CHARACTER*8 CTMP CHARACTER*9 DATESTRING CHARACTER*(250) OUTLINE, SEQLINE LOGICAL NOCHAINBREAKS ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KOUT,OUTFILE,'new',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN NOCHAINBREAKS = .TRUE. LENGTH = SEQSTOP-SEQSTART+1 C BEGIN AND END CALL STRPOS(INFILE,ISTART,ISTOP) WRITE(OUTLINE,'(1X,A,1X,A,I4,1X,A,I4)') 1 infile(max(istart,1):max(1,istop)),'from: ', 2 seqstart,'to: ',seqstop CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(MAX(ISTART,1):MAX(1,ISTOP)) WRITE(KOUT,'(A)') ' ' C COPY PASSED HEADERLINE CALL STRPOS(HEADERLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') HEADERLINE(1:MAX(ISTOP,1)) C MAKE UP STANDARD GCG HEADERLINE CALL STRPOS(OUTFILE,ISTART,ISTOP) CALL GETDATE(DATESTRING) CALL CHECKSEQ(SEQ,SEQSTART,SEQSTOP,CHECK) WRITE(OUTLINE,'(1X,A,10X,A,I5,3X,A,2X,A,I5,1X,A)') 1 outfile(max(istart,1):max(1,istop)),'Length:', 2 length-nbreaks,datestring,'Check:',check,'..' CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(ISTOP,1)) WRITE(KOUT,'(A)') ' ' C write sequence C 1 RPDFCLEPPY TGPCKARIIR YFYNAKAGLC QTFVYGGCRA KRNNFKSAED BEGIN = SEQSTART C "repeat until" 1 CONTINUE WRITE(CTMP,'(I8)') BEGIN-SEQSTART+1 C writeseqline returns end CALL WRITESEQLINE(SEQ,BEGIN,BLOCKSIZE,NBLOCKS,SEQSTOP, 1 NOCHAINBREAKS,SEQLINE,END,ERROR) IF ( ERROR ) STOP CALL STRPOS(SEQLINE,ISTART,ISTOP) C gcg sequence lines are preceeded by a number (first pos. of line ) OUTLINE = CTMP // ' ' // 1 SEQLINE(MAX(ISTART,1):MAX(ISTOP,1)) CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(ISTOP,1)) WRITE(KOUT,'(A)') ' ' BEGIN = END + 1 IF ( BEGIN .LE. SEQSTOP ) GOTO 1 C END "REPEAT UNTIL" CLOSE(KOUT) RETURN END C END WRITE_GCG C...................................................................... C...................................................................... C SUB WRITE_HSSP SUBROUTINE WRITE_HSSP(KOUT,MAXRES,NALIGN,NRES,EMBLID,STRID, + ACCESSION,IDE,SIM,IFIR,ILAS,JFIR,JLAS, + LALI,NGAP,LGAP,LENSEQ,PROTNAME,ALIPOINTER, + ALISEQ,PDBNO,CHAINID,PDBSEQ,SECSTR,COLS, + BP1,BP2,SHEETLABEL,ACC,NOCC,VAR,SEQPROF, + NDEL,NINS,ENTROPY,RELENT,CONSWEIGHT, + INSNUMBER,INSALI,INSPOINTER,INSLEN, + INSBEG_1,INSBEG_2,INSBUFFER,ISOLEN, + ISOIDE,NSTEP,LFORMULA,LALL,ISAFE, + EXCLUDEFLAG,LCONSERV,LHSSP_LONG_ID) IMPLICIT NONE C---- import INTEGER KOUT,MAXRES,NALIGN,NRES, + IFIR(*),ILAS(*),JFIR(*),JLAS(*),LALI(*), + NGAP(*),LGAP(*),LENSEQ(*),ALIPOINTER(*),PDBNO(*), + BP1(*),BP2(*), + ACC(*),NOCC(*),VAR(*),SEQPROF(MAXRES,*),NDEL(*), + NINS(*),RELENT(*), + ISOLEN(*),NSTEP,ISAFE, + INSNUMBER,INSALI(*),INSPOINTER(*), + INSLEN(*),INSBEG_1(*),INSBEG_2(*) CHARACTER*(*) EMBLID(*),STRID(*),ACCESSION(*),PROTNAME(*), + ALISEQ(*), + CHAINID(*),PDBSEQ(*),SECSTR(*), + EXCLUDEFLAG(*) CHARACTER*7 COLS(*) CHARACTER*1 SHEETLABEL(*),INSBUFFER(*) REAL IDE(*),SIM(*),ENTROPY(*),CONSWEIGHT(*),ISOIDE(*) LOGICAL LCONSERV,LFORMULA,LALL,LHSSP_LONG_ID C---- internal parameter INTEGER NBLOCKSIZE,NBLOCKINS, + MAXAA,MAXALIGNS_LOC PARAMETER (NBLOCKSIZE= 70) PARAMETER (NBLOCKINS= 100) C maximal number of symbols PARAMETER (MAXAA= 20) C maximal number of alignments PARAMETER (MAXALIGNS_LOC= 8765) C PARAMETER (MAXALIGNS_LOC= 12345) C PARAMETER (MAXALIGNS_LOC= 21987) C---- internal veriable INTEGER NALIGN_FILTER, + I,J,ILEN,LENLINE,K, + NBLOCK,IALIGN,JUMP,ISTART,ISTOP,IRUL,IBLOCK, + LPOS,JPOS,IPOS, + IAL,IBEG,IEND,IINS, + INS_NEW,INS_ORDER(MAXALIGNS_LOC),NRES2 LOGICAL LINSERTION,LCONSIDER CHARACTER PROFILESEQ*(MAXAA), + CRULER*(NBLOCKSIZE), + CTEMP*(NBLOCKSIZE),CTEMPINS*(NBLOCKINS), + LINE*512 REAL DISTANCE C---- ------------------------------------------------------------------ C---- C---- ------------------------------------------------------------------ C order of amino acid symbols in the HSSP sequence profile block PROFILESEQ='VLIMFWYGAPSTCHRKQEND' C---- C---- check local array dimension C---- IF (NALIGN .GT. MAXALIGNS_LOC) THEN WRITE(6,*)'*** ERROR WRITE_HSSP: MAXALIGNS_LOC overflow' WRITE(6,*)'*-> increase dimension !' STOP ENDIF C---- C---- 99.01 br changed C---- CC C get number of alignments after filtering CC NALIGN_FILTER=0 CC DO I=1,NALIGN CC INS_ORDER(I)=0 CC CALL CHECKHSSPCUT(LALI(I),IDE(I)*100,ISOLEN, CC + ISOIDE,NSTEP,LFORMULA,LALL,ISAFE,LCONSIDER,DISTANCE) CC IF ( LCONSIDER ) THEN CC IF ( EXCLUDEFLAG(I) .EQ. ' ') THEN CC NALIGN_FILTER=NALIGN_FILTER+1 CC INS_ORDER(I)=NALIGN_FILTER CC ENDIF CC ELSE CC EXCLUDEFLAG(I)='*' CC ENDIF CC ENDDO C---- C---- 99.01 br: new version C---- C get number of alignments after filtering NALIGN_FILTER=0 DO I=1,NALIGN INS_ORDER(I)=0 IF ( EXCLUDEFLAG(I) .NE. '*') THEN NALIGN_FILTER=NALIGN_FILTER+1 INS_ORDER(I)=NALIGN_FILTER ENDIF ENDDO C---- no alignment -> write last line ('//') IF (NALIGN_FILTER .EQ. 0) THEN WRITE(6,*)'-*- WARNING WRITE_HSSP file empty (no ali found)!' WRITE(KOUT,'(A)')'//' CLOSE(KOUT) RETURN ENDIF C======================================================================= C write the PROTEINS-block C======================================================================= C## PROTEINS : EMBL/SWISSPROT identifier and alignment statistics C NR. ID STRID %IDE %WSIM IFIR ILAS JFIR JLAS LALI NGAP LGAP LSEQ2 P C 1 : IATR$BOVIN 0.43 12345 1 56 1 56 56 0 0 123 A C1234AAA123456789012AAAAAAX1234512345X1234X1234X1234X1234X1234X1234X1234X1234XX1 C C NR. ID STRID %IDE %WSIM IFIR ILAS JFIR JLAS LALI NGAP LGAP LSEQ2 C 1 : IATR$BOVIN.............................. 0.43 12345 1 56 1 56 56 0 0 123 A C1234AAA1234567890123456789012345678901234567890AAAAAAX1234512345X1234X1234X1234X1234X1234X1234X1234X1234XX WRITE(KOUT,'(A)')'## PROTEINS : EMBL/SWISSPROT identifier '// + 'and alignment statistics' IF (LCONSERV) THEN IF ( LHSSP_LONG_ID ) THEN WRITE(KOUT,'(A)') + ' NR. ID '// + ' STRID %IDE %WSIM IFIR ILAS'// + ' JFIR JLAS LALI NGAP LGAP LSEQ2 ACCESSION'// + ' PROTEIN' ELSE WRITE(KOUT,'(A)') + ' NR. ID STRID %IDE %WSIM'// + ' IFIR ILAS JFIR JLAS LALI NGAP LGAP LSEQ2'// + ' ACCESSION PROTEIN' ENDIF ELSE IF ( LHSSP_LONG_ID ) THEN WRITE(KOUT,'(A)') + ' NR. ID '// + ' STRID %IDE %SIM IFIR ILAS'// + ' JFIR JLAS LALI NGAP LGAP LSEQ2 ACCESSION'// + ' PROTEIN' ELSE WRITE(KOUT,'(A)') + ' NR. ID STRID %IDE %SIM'// + ' IFIR ILAS JFIR JLAS LALI NGAP LGAP LSEQ2'// + ' ACCESSION PROTEIN' ENDIF ENDIF J=0 DO I=1,NALIGN IF ( EXCLUDEFLAG(I).EQ.' ') THEN C -------------------------------------------------- C terrible hack br 99-11: shorten too long proteins NRES2=LENSEQ(I) IF (NRES2.GT.9999) NRES2=9999 C end of terrible hack C -------------------------------------------------- J=J+1 IF (LHSSP_LONG_ID ) THEN WRITE(LINE,50)J,' : ',EMBLID(I),STRID(I),IDE(I),SIM(I), + IFIR(I),ILAS(I),JFIR(I),JLAS(I),LALI(I),NGAP(I), + LGAP(I),NRES2,ACCESSION(I),PROTNAME(I)(1:41) CALL STRPOS(LINE,ILEN,LENLINE) WRITE(KOUT,'(A)')LINE(1:LENLINE) ELSE WRITE(LINE,100)J,' : ',EMBLID(I),STRID(I),IDE(I),SIM(I), + IFIR(I),ILAS(I),JFIR(I),JLAS(I),LALI(I),NGAP(I), + LGAP(I),NRES2,ACCESSION(I),PROTNAME(I)(1:41) CALL STRPOS(LINE,ILEN,LENLINE) WRITE(KOUT,'(A)')LINE(1:LENLINE) ENDIF ENDIF ENDDO 50 FORMAT (1X,I4,A,A40,A6,1X,F5.2,1X,F5.2,8(1X,I4),2X,A10,1X,A) 100 FORMAT (1X,I4,A,A12,A6,1X,F5.2,1X,F5.2,8(1X,I4),2X,A10,1X,A) C number of ALIGNMENTS-blocks IF (MOD(FLOAT(NALIGN_FILTER),FLOAT(NBLOCKSIZE)).EQ. 0.0) THEN NBLOCK=NALIGN_FILTER/NBLOCKSIZE ELSE NBLOCK=NALIGN_FILTER/NBLOCKSIZE+1 ENDIF IALIGN=0 JUMP=0 ISTOP=IALIGN+NBLOCKSIZE IF (ISTOP.GT.NALIGN_FILTER) ISTOP=NALIGN_FILTER IRUL=1 C======================================================================= C loop over ALIGNMENTS-blocks C======================================================================= DO IBLOCK=1,NBLOCK C make ruler LPOS=1 DO K=1,(NBLOCKSIZE/10) IF (IRUL.EQ.10) IRUL=0 WRITE(CRULER(LPOS:LPOS+9),'(A9,I1)')'....:....',IRUL LPOS=LPOS+10 IRUL=IRUL+1 ENDDO WRITE(KOUT,'(2(A,I4))')'## ALIGNMENTS ', + IALIGN+1-JUMP,' - ',ISTOP WRITE(KOUT,'(A)')' SeqNo PDBNo AA STRUCTURE '// + 'BP1 BP2 ACC NOCC VAR '//cruler C======================================================================= C rearange alignment in vertical order C======================================================================= DO I=1,NRES CTEMP=' ' JPOS=1 IPOS=1 JUMP=0 CCCC parsytec bug c stupid parsytec has problems here c do while(ipos .le. nblocksize .and. C + (ialign+jpos) .le. nalign) DO WHILE(IPOS .LE. NBLOCKSIZE ) IF ( (IALIGN+JPOS) .GT. NALIGN) THEN GOTO 10 ENDIF IAL=IALIGN+JPOS JPOS=JPOS+1 IF ( EXCLUDEFLAG(IAL) .EQ. ' ' ) THEN IF (I .GE. IFIR(IAL) .AND. I .LE. ILAS(IAL)) THEN J=ALIPOINTER(IAL)+(I-IFIR(IAL)) CTEMP(IPOS:IPOS)=ALISEQ(J) IPOS=IPOS+1 ELSE CTEMP(IPOS:IPOS)=' ' IPOS=IPOS+1 ENDIF ELSE JUMP=JUMP+1 ENDIF ENDDO 10 LINE=' ' C======================================================================= C write ALIGNMENTS-block C======================================================================= WRITE(LINE,200)I,PDBNO(I),CHAINID(I),PDBSEQ(I),SECSTR(I), + COLS(I),BP1(I),BP2(I),SHEETLABEL(I),ACC(I), + NOCC(I),VAR(I),CTEMP IF (PDBNO(I).EQ.0) LINE(7:11)=' ' CALL STRPOS(LINE,IBEG,IEND) WRITE(KOUT,'(A)')LINE(1:IEND) ENDDO IALIGN=IALIGN+NBLOCKSIZE+JUMP ISTOP=IALIGN+NBLOCKSIZE IF (ISTOP.GT.NALIGN_FILTER) ISTOP=NALIGN_FILTER IF (IBLOCK.EQ.NBLOCK) THEN WRITE(KOUT,'(A)')'## SEQUENCE PROFILE AND ENTROPY' WRITE(KOUT,'(1X,A,20(3X,A1),A,A,A,A,A,A)')'SeqNo PDBNo', + (profileseq(I:I),I=1,maxaa),' NOCC',' NDEL', + ' NINS',' ENTROPY',' RELENT',' WEIGHT' ENDIF ENDDO 200 FORMAT(2X,2(I4,1X),A1,1X,A1,2X,A1,1X,A7,2(I4),A1,2(I4,1X),I4,2X,A) C======================================================================= C write SEQUENCE PROFILE-block C======================================================================= DO I=1,NRES LINE=' ' WRITE(LINE,300)I,PDBNO(I),CHAINID(I), + (SEQPROF(I,K),K=1,MAXAA),NOCC(I),NDEL(I),NINS(I), + ENTROPY(I),RELENT(I),CONSWEIGHT(I) IF (PDBNO(I).EQ.0) LINE(7:11)=' ' CALL STRPOS(LINE,IBEG,IEND) WRITE(KOUT,'(A)')LINE(1:IEND) ENDDO 300 FORMAT (2(1X,I4),1X,A1,20(I4),1X,3(1X,I4),1X,F7.3,3X,I4,2X,F4.2) C======================================================================= C write insertion block C======================================================================= LINSERTION=.FALSE. IINS=1 C WRITE(6,*)'IINS = ',IINS,' INSNUMBER = ',INSNUMBER DO WHILE (.NOT. LINSERTION .AND. IINS .LE. INSNUMBER) IF ( INSALI(IINS) .EQ. 0 ) WRITE(6,*)'IINS = ',IINS IF ( EXCLUDEFLAG (INSALI(IINS)) .EQ.' ')LINSERTION=.TRUE. IINS=IINS+1 ENDDO IF ( LINSERTION ) THEN WRITE(KOUT,'(A)')'## INSERTION LIST' WRITE(KOUT,'(A)')' AliNo IPOS JPOS Len Sequence' CTEMPINS=' ' DO IINS=1,INSNUMBER IF ( EXCLUDEFLAG (INSALI(IINS)) .EQ.' ') THEN JPOS=INSPOINTER(IINS) INS_NEW = INS_ORDER( INSALI(IINS) ) IF (INSLEN(IINS)+2 .LE. NBLOCKINS) THEN DO IPOS=1,INSLEN(IINS)+2 CTEMPINS(IPOS:IPOS)=INSBUFFER(JPOS) JPOS=JPOS+1 ENDDO WRITE(KOUT,'(4(I6),1X,A)')INS_NEW,INSBEG_1(IINS), + INSBEG_2(IINS),INSLEN(IINS),CTEMPINS(1:INSLEN(IINS)+2) ELSE DO IPOS=1,NBLOCKINS CTEMPINS(IPOS:IPOS)=INSBUFFER(JPOS) JPOS=JPOS+1 ENDDO WRITE(KOUT,'(4(I6),1X,A)')INS_NEW,INSBEG_1(IINS), + INSBEG_2(IINS),INSLEN(IINS),CTEMPINS(1:NBLOCKINS) IBEG=NBLOCKINS+1 DO WHILE (IBEG .LE. INSLEN(IINS)+2 ) IEND=MIN(IBEG+NBLOCKINS-1,INSLEN(IINS)+2 ) IPOS=0 DO J=IBEG,IEND IPOS=IPOS+1 CTEMPINS(IPOS:IPOS)=INSBUFFER(JPOS) JPOS=JPOS+1 ENDDO WRITE(KOUT,'(A,19X,A)')' +',CTEMPINS(1:IPOS) IBEG=IBEG+NBLOCKINS ENDDO ENDIF ENDIF ENDDO ENDIF C write last line ('//') WRITE(KOUT,'(A)')'//' CLOSE(KOUT) RETURN END C END WRITE_HSSP C...................................................................... C...................................................................... C SUB WRITE_KLEIN SUBROUTINE WRITE_KLEIN(KOUT,SEQ,NBLOCKS,NAME,INFILE,OUTFILE, 1 HEADERLINE,SEQSTART,SEQSTOP,ERROR) IMPLICIT NONE C IMPORT INTEGER KOUT INTEGER NBLOCKS INTEGER SEQSTART,SEQSTOP CHARACTER*(*) SEQ CHARACTER*(*) HEADERLINE, NAME, INFILE,OUTFILE C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE PARAMETER (BLOCKSIZE= 10) INTEGER ISTART, ISTOP INTEGER BEGIN, END, LENGTH CHARACTER*(250) OUTLINE LOGICAL NOCHAINBREAKS ERROR = .FALSE. C TRY TO OPEN OUTFILE; RETURN IF UNSUCCESSFUL CALL OPEN_FILE(KOUT,OUTFILE,'new',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN NOCHAINBREAKS = .FALSE. LENGTH = SEQSTOP-SEQSTART+1 C BEGIN AND END CALL STRPOS(INFILE,ISTART,ISTOP) WRITE(OUTLINE,'(A,A,1X,A,I4,1X,A,I4)') 1 '; ',INFILE(MAX(ISTART,1):MAX(1,ISTOP)), 2 'from: ',seqstart,'to: ',seqstop CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(MAX(ISTART,1):MAX(1,ISTOP)) C headerline is a comment line: marked by ';' CALL STRPOS(HEADERLINE,ISTART,ISTOP) OUTLINE = '; ' // HEADERLINE(MAX(ISTART,1):MAX(1,ISTOP)) WRITE(KOUT,'(A)') OUTLINE(1:ISTOP+2) OUTLINE = ' ' C make up standard klein headerline C 1GD1 1339 CALL STRPOS(NAME,ISTART,ISTOP) WRITE(OUTLINE,'(1X,A,10X,I4)') 1 NAME(MAX(ISTART,1):MIN(MAX(ISTOP,1),6)),LENGTH CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) C write sequence BEGIN = SEQSTART C "repeat until" 1 CONTINUE C writeseqline returns end CALL WRITESEQLINE(SEQ,BEGIN,BLOCKSIZE,NBLOCKS,SEQSTOP, 1 NOCHAINBREAKS,OUTLINE,END,ERROR) IF ( ERROR ) STOP CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) BEGIN = END + 1 IF ( BEGIN .LE. SEQSTOP ) GOTO 1 C END "REPEAT UNTIL" CLOSE(KOUT) RETURN END C END WRITE_KLEIN C...................................................................... C...................................................................... C SUB WRITE_MSF SUBROUTINE WRITE_MSF(KUNIT,INFILE,OUTFILE,MAXALIGNS,MAXRES, 1 MAXCORE,MAXINS,MAXINSBUF,BEGIN,END,NBLOCKS,ALISEQ, 2 ALIPOINTER,IFIR,ILAS,TYPE,SEQNAMES,WEIGHT,SEQCHECK, 3 MSFCHECK,ALILEN,NSEQ,INSNUMBER,INSALI,INSPOINTER, 4 INSLEN,INSBEG_1,INSBUFFER,LDOEXP,ERROR) IMPLICIT NONE C 3.6.93 insertion lists C 4.11.93 C Import INTEGER MAXALIGNS, MAXRES, MAXCORE, MAXINS, MAXINSBUF INTEGER KUNIT, BEGIN, END, NBLOCKS, NSEQ INTEGER ALIPOINTER(MAXALIGNS) INTEGER ALILEN INTEGER IFIR(MAXALIGNS), ILAS(MAXALIGNS) INTEGER INSNUMBER,INSALI(MAXINS),INSPOINTER(MAXINS) INTEGER INSLEN(MAXINS),INSBEG_1(MAXINS) CHARACTER*(*) INFILE, OUTFILE C 'P' = PROTEIN SEQUENCES, 'N' = NUCLEOTIDE SEQ CHARACTER*1 TYPE CHARACTER*(*) SEQNAMES(MAXALIGNS) CHARACTER ALISEQ(MAXCORE) CHARACTER INSBUFFER(MAXINSBUF) REAL WEIGHT(MAXALIGNS) LOGICAL LDOEXP C EXPORT INTEGER MSFCHECK INTEGER SEQCHECK(MAXALIGNS) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE INTEGER CODELEN INTEGER MAXALIGNS_LOC,MAXRES_LOC INTEGER LINELEN PARAMETER (BLOCKSIZE= 10) PARAMETER (MAXALIGNS_LOC= 8765) C PARAMETER (MAXALIGNS_LOC= 12345) C PARAMETER (MAXALIGNS_LOC= 21987) PARAMETER (MAXRES_LOC= 9999) C PARAMETER (MAXRES_LOC= 10000) C PARAMETER (MAXRES_LOC= 19876) PARAMETER (LINELEN= 200) INTEGER*2 MAXLEN(MAXRES_LOC) INTEGER*2 INSLIST_POINTER(MAXRES_LOC) INTEGER*2 TOTALINSLEN(MAXRES_LOC) INTEGER POS1, POS2 INTEGER I, J, K, KK, IPOS, JPOS, ISEQ, IINS INTEGER LASTPOS, IAP INTEGER ISTART, ISTOP, I1START, I1STOP, I2START, I2STOP INTEGER ILEN, THISWIDTH INTEGER EFFECTIVE_BEGIN,EFFECTIVE_END INTEGER LENGTH(MAXALIGNS_LOC), IOUTPOS, NSPACES INTEGER LAST_INSERTION(MAXALIGNS_LOC) INTEGER NTRANS_INS(MAXALIGNS_LOC) C INTEGER INSPOS(MAXALIGNS_LOC) INTEGER LASTLEN(MAXALIGNS_LOC) INTEGER*2 IAPS(MAXRES_LOC) CHARACTER*1 C CHARACTER*1 CGAPCHAR CHARACTER*8 TIMESTRING CHARACTER*9 DATESTRING CHARACTER*64 DATE_TIME CHARACTER*(LINELEN) LINE CHARACTER*(MAXRES_LOC) STRAND LOGICAL NOCHAINBREAKS, NO_INS_HERE LOGICAL PARTIAL_INSERTION(MAXALIGNS_LOC) C REFORMAT OF: *.FRAG C C Nfi.Msf MSF: 594 Type: P February 17, 1992 14:37 Check: 1709 .. C C Name: Cnfi02 Len: 594 Check: 7754 Weight: 1.00 C Name: Cnfi03 Len: 594 Check: 4932 Weight: 1.00 C C// C C 1 50 CCnfi02 MMYSPICLTQ DEFHPFIEAL LPHVRAIAYT WFNLQARKRK YFKKHEKRMS CCnfi03 MMYSPICLTQ DEFHPFIEAL LPHVRAIAYT WFNLQARKRK YFKKHEKRMS ERROR = .FALSE. IINS=0 C try to open outfile; return if unsuccessful CALL OPEN_FILE(KUNIT,OUTFILE,'new,recl=200',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN IF ( NSEQ .GT. MAXALIGNS .OR. 1 NSEQ .GT. MAXALIGNS_LOC ) THEN WRITE(6,'(1X,A)') + 'ERROR: MAXALIGNS overflow in write_msf !' ERROR = .TRUE. RETURN ENDIF IF ( ALILEN .GT. MAXRES .OR. 1 ALILEN .GT. MAXRES_LOC ) THEN WRITE(6,'(1X,A)') + 'ERROR: MAXRES overflow in write_msf !' ERROR = .TRUE. RETURN ENDIF CGAPCHAR = '.' NOCHAINBREAKS = .FALSE. CODELEN = 1 DO I=1,NSEQ CALL STRPOS(SEQNAMES(I),ISTART,ISTOP) IF (ISTOP .GT. CODELEN)CODELEN=ISTOP+2 ENDDO IF (CODELEN .GT. LEN(SEQNAMES(1)) )CODELEN=LEN(SEQNAMES(1)) IF ( LDOEXP ) THEN CALL PREPARE_INSERTIONS(MAXRES,MAXALIGNS, 1 ALILEN,NSEQ, 2 IFIR,ILAS,INSNUMBER,INSALI,INSLEN, 3 INSBEG_1,MAXLEN,INSLIST_POINTER, 4 TOTALINSLEN,ERROR) ELSE CALL INIT_INT2_ARRAY(1,ALILEN,MAXLEN,0) CALL INIT_INT2_ARRAY(1,ALILEN,TOTALINSLEN,0) CALL INIT_INT2_ARRAY(1,NSEQ,INSLIST_POINTER,0) ENDIF EFFECTIVE_BEGIN = BEGIN + TOTALINSLEN(BEGIN) EFFECTIVE_END = END + TOTALINSLEN(END)-TOTALINSLEN(BEGIN) CALL STRPOS(INFILE,ISTART,ISTOP) WRITE(LINE,'(1X,A,A,1X,A,I4,1X,A,I4)') 'MSF of: ', 1 INFILE(ISTART:ISTOP),'from: ',begin, 2 'to: ',effective_end CALL STRPOS(LINE,ISTART,ISTOP) WRITE(KUNIT,'(A)') LINE(ISTART:ISTOP) C calculate single sequence checksums DO ISEQ = 1, NSEQ C j counts positions in "strand"; i counts alignment positions J = 0 DO I = 1,IFIR(ISEQ)-1 C .. no need to check whether this insertion belongs to the current seq C .. - this is impossible inside the region of the n-terminal gap DO K = 1,MAXLEN(I) J = J + 1 STRAND(J:J) = CGAPCHAR ENDDO J = J + 1 STRAND(J:J) = CGAPCHAR ENDDO IPOS = ALIPOINTER(ISEQ) DO I = IFIR(ISEQ),ILAS(ISEQ) J = J + 1 C = ALISEQ(IPOS) IF ( LDOEXP ) CALL LOWTOUP(C,1) STRAND(J:J) = C IF ( ( MAXLEN(I) .GT. 0 ) .AND. 1 ( INSLIST_POINTER(ISEQ) .NE. 0 ) ) THEN IINS = INSLIST_POINTER(ISEQ) DO WHILE ( INSBEG_1(IINS) .NE. I .AND. 1 INSALI(IINS) .EQ. ISEQ ) IINS = IINS + 1 ENDDO IF ( INSALI(IINS) .NE. ISEQ ) THEN NO_INS_HERE = .TRUE. ELSE NO_INS_HERE = .FALSE. ENDIF ELSE NO_INS_HERE = .TRUE. ENDIF IF ( .NOT. NO_INS_HERE ) THEN C WRITE(6,'(1x,3(i4,1x))') iseq, insali(iins), iins C .. this insertion belongs to current sequence - copy missing symbols C .. from INSBUFFER KK = INSPOINTER(IINS) C .. insertions are stored as lowercaseINSERTIONlowercase; with C .. INSPOINTER pointing to the leading "lowercase". this symbol C .. ( also in lowercase ) preceeds the insertion in aliseq; C .. this can be used as a check. C = INSBUFFER(KK) IF ( C .NE. ALISEQ(IPOS) ) THEN IF ( ALISEQ(IPOS) .NE. CGAPCHAR ) THEN ERROR = .TRUE. STOP 'MIST' ENDIF ENDIF KK = KK + 1 DO K = 1,INSLEN(IINS) J = J + 1 STRAND(J:J) = INSBUFFER(KK) KK = KK + 1 ENDDO DO K = INSLEN(IINS)+1,MAXLEN(I) J = J + 1 STRAND(J:J) = CGAPCHAR ENDDO ELSE C .. this insertion does not belong to current sequence - fill with C gap symbols DO K = 1,MAXLEN(I) J = J + 1 STRAND(J:J) = CGAPCHAR ENDDO ENDIF IPOS = IPOS + 1 ENDDO DO I = ILAS(ISEQ)+1,ALILEN C .. no need to check whether this insertion belongs to the current seq C .. - this is impossible inside the region of the n-terminal gap DO K = 1,MAXLEN(I) J = J + 1 STRAND(J:J) = CGAPCHAR ENDDO J = J + 1 STRAND(J:J) = CGAPCHAR ENDDO CALL CHECKSEQ(STRAND,1,J, 1 SEQCHECK(ISEQ)) ENDDO C calculate total checksum CALL MSFCHECKSEQ(SEQCHECK,NSEQ,MSFCHECK) C Write MSF identification line C get current date CALL GETDATE(DATESTRING) C get current time CALL GETTIME(TIMESTRING) C date + time DATE_TIME = DATESTRING // ' ' // TIMESTRING CALL STRPOS(OUTFILE,I1START,I1STOP) CALL STRPOS(DATE_TIME,I2START,I2STOP) WRITE(KUNIT,'(1X,A,2X,A,1X,I4,2X,A,A,1X,A,2X,A,I5,2X,A)') 1 OUTFILE(I1START:I1STOP),'MSF:', 2 EFFECTIVE_END-EFFECTIVE_BEGIN+1, 3 'Type: ',type,date_time(i2start:i2stop), 4 'Check:',msfcheck,'..' WRITE(KUNIT,'(A)') ' ' WRITE(KUNIT,'(A)') ' ' C Write sequence identification section DO ISEQ = 1,NSEQ WRITE(KUNIT,'(A,A,2X,A,I5,2X,A,I4,2X,A,F5.2)') 1 ' Name: ',seqnames(iseq)(1:codelen),'Len: ', 2 effective_end-effective_begin+1,'Check: ', 3 seqcheck(iseq), 'Weight: ', weight(iseq) C divider ENDDO WRITE(KUNIT,'(A)') ' ' WRITE(KUNIT,'(A)') '//' WRITE(KUNIT,'(A)') ' ' WRITE(KUNIT,'(A)') ' ' C WRITE ALIGNMENT DO ISEQ = 1,NSEQ LENGTH(ISEQ) = EFFECTIVE_BEGIN IAPS(ISEQ) = BEGIN-1 NTRANS_INS(ISEQ) = 0 LASTLEN(ISEQ) = 0 LAST_INSERTION(ISEQ) = 0 PARTIAL_INSERTION(ISEQ) = .FALSE. ENDDO ILEN = 0 DO WHILE ( ILEN .LT. EFFECTIVE_END-EFFECTIVE_BEGIN+1 ) C new block LASTPOS = 1 MIN(ILEN+NBLOCKS*BLOCKSIZE,EFFECTIVE_END-BEGIN+1) THISWIDTH = MIN(NBLOCKS*BLOCKSIZE,LASTPOS-ILEN) C write scale line IF ( MOD(THISWIDTH,BLOCKSIZE) .EQ. 0 ) THEN NSPACES = THISWIDTH / BLOCKSIZE - 1 ELSE NSPACES = THISWIDTH / BLOCKSIZE ENDIF CALL WRITESCALELINE(CODELEN+1,CODELEN+THISWIDTH+NSPACES, 1 ILEN+1,LASTPOS,LINE) WRITE(KUNIT,'(A)') LINE(1:CODELEN+THISWIDTH+NBLOCKS+1) C provide as many symbols in "strand" as writescalline will need to C transfer to next output line C .. steps : C ... - find alignment position x which is greater or equal the C ..... desired end point, INCLUDING INSERTIONS C ... - in case of "greater", there is an insertion crossing the C ..... boundary of the line to be output. C ..... SPLIT this insertion DO ISEQ = 1,NSEQ IOUTPOS = 1 IAP = IAPS(ISEQ) IPOS = LENGTH(ISEQ) IF ( PARTIAL_INSERTION(ISEQ) ) THEN IF ( LAST_INSERTION(ISEQ) .NE. 0 ) THEN JPOS = 1 INSPOINTER(LAST_INSERTION(ISEQ))+NTRANS_INS(ISEQ)+1 DO WHILE ( 1 IPOS .LE. EFFECTIVE_END 2 .AND. 3 IPOS .LE. LENGTH(ISEQ)+NBLOCKS*BLOCKSIZE-1 4 .AND. 5 JPOS .LE. INSPOINTER(LAST_INSERTION(ISEQ)) + 6 INSLEN(LAST_INSERTION(ISEQ)) 7 ) STRAND(IOUTPOS:IOUTPOS) = INSBUFFER(JPOS) IOUTPOS = IOUTPOS + 1 IPOS = IPOS + 1 NTRANS_INS(ISEQ) = NTRANS_INS(ISEQ) + 1 JPOS = JPOS + 1 ENDDO ENDIF DO WHILE ( 1 IPOS .LE. EFFECTIVE_END 2 .AND. 3 IPOS .LE. LENGTH(ISEQ)+NBLOCKS*BLOCKSIZE-1 4 .AND. 5 NTRANS_INS(ISEQ) .LT. LASTLEN(ISEQ) 6 ) STRAND(IOUTPOS:IOUTPOS) = CGAPCHAR IOUTPOS = IOUTPOS + 1 IPOS = IPOS + 1 NTRANS_INS(ISEQ) = NTRANS_INS(ISEQ) + 1 ENDDO IF ( NTRANS_INS(ISEQ) .EQ. LASTLEN(ISEQ) ) THEN PARTIAL_INSERTION(ISEQ) = .FALSE. NTRANS_INS(ISEQ) = 0 ENDIF ENDIF DO WHILE ( 1 IPOS .LE. EFFECTIVE_END .AND. 2 IPOS .LE. LENGTH(ISEQ)+NBLOCKS*BLOCKSIZE-1 3 ) IAP = IAP + 1 IF ( IAP .LT. IFIR(ISEQ) .OR. 1 IAP .GT. ILAS(ISEQ) ) THEN STRAND(IOUTPOS:IOUTPOS) = CGAPCHAR ELSE C = ALISEQ( 1 ALIPOINTER(ISEQ)+IAP- 2 IFIR(ISEQ) 3 ) IF ( LDOEXP ) CALL LOWTOUP(C,1) STRAND(IOUTPOS:IOUTPOS) = C ENDIF IOUTPOS = IOUTPOS + 1 IPOS = IPOS + 1 IF ( ( MAXLEN(IAP) .GT. 0 ) .AND. 1 ( INSLIST_POINTER(ISEQ) .NE. 0 ) ) THEN IINS = INSLIST_POINTER(ISEQ) DO WHILE ( INSBEG_1(IINS) .NE. IAP .AND. 1 INSALI(IINS) .EQ. ISEQ ) IINS = IINS + 1 ENDDO IF ( INSALI(IINS) .NE. ISEQ ) THEN NO_INS_HERE = .TRUE. ELSE NO_INS_HERE = .FALSE. ENDIF ELSE NO_INS_HERE = .TRUE. ENDIF IF ( .NOT. NO_INS_HERE ) THEN JPOS = INSPOINTER(IINS)+1 DO WHILE ( 1 IPOS .LE. EFFECTIVE_END 2 .AND. 3 IPOS .LE. (LENGTH(ISEQ)+NBLOCKS*BLOCKSIZE -1) 4 .AND. 5 JPOS .LE. (INSPOINTER(IINS)+INSLEN(IINS)) 6 ) STRAND(IOUTPOS:IOUTPOS) = INSBUFFER(JPOS) IOUTPOS = IOUTPOS + 1 NTRANS_INS(ISEQ) = NTRANS_INS(ISEQ) + 1 IPOS = IPOS + 1 JPOS = JPOS + 1 ENDDO DO WHILE ( 1 IPOS .LE. EFFECTIVE_END 2 .AND. 3 IPOS .LE. (LENGTH(ISEQ)+NBLOCKS*BLOCKSIZE-1) 4 .AND. 5 NTRANS_INS(ISEQ) .LT. MAXLEN(IAP) 6 ) STRAND(IOUTPOS:IOUTPOS) = CGAPCHAR IPOS = IPOS + 1 IOUTPOS = IOUTPOS + 1 NTRANS_INS(ISEQ) = NTRANS_INS(ISEQ) + 1 ENDDO IF ( NTRANS_INS(ISEQ) .EQ. MAXLEN(IAP) ) THEN PARTIAL_INSERTION(ISEQ) = .FALSE. NTRANS_INS(ISEQ) = 0 ELSE PARTIAL_INSERTION(ISEQ) = .TRUE. LAST_INSERTION(ISEQ) = IINS LASTLEN(ISEQ) = MAXLEN(IAP) ENDIF ELSE DO WHILE ( 1 IPOS .LE. EFFECTIVE_END 2 .AND. 3 IPOS .LE. LENGTH(ISEQ)+NBLOCKS*BLOCKSIZE-1 4 .AND. 5 NTRANS_INS(ISEQ) .LT. MAXLEN(IAP) 6 ) STRAND(IOUTPOS:IOUTPOS) = CGAPCHAR IOUTPOS = IOUTPOS + 1 NTRANS_INS(ISEQ) = NTRANS_INS(ISEQ) + 1 IPOS = IPOS + 1 ENDDO IF ( NTRANS_INS(ISEQ) .EQ. MAXLEN(IAP) ) THEN PARTIAL_INSERTION(ISEQ) = .FALSE. NTRANS_INS(ISEQ) = 0 ELSE PARTIAL_INSERTION(ISEQ) = .TRUE. LAST_INSERTION(ISEQ) = 0 LASTLEN(ISEQ) = MAXLEN(IAP) ENDIF ENDIF ENDDO IOUTPOS = IOUTPOS - 1 POS1 = 1 C writeseqline returns pos2 ( position of last transferred symbol ) CALL WRITESEQLINE(STRAND,POS1,BLOCKSIZE,NBLOCKS,IOUTPOS, 1 NOCHAINBREAKS,LINE,POS2,ERROR) IF ( ERROR ) STOP CALL STRPOS(LINE,ISTART,ISTOP) LENGTH(ISEQ) = LENGTH(ISEQ) + POS2 IAPS(ISEQ) = IAP LINE = SEQNAMES(ISEQ)(1:CODELEN) // LINE(ISTART:ISTOP) CALL STRPOS(LINE,ISTART,ISTOP) WRITE(KUNIT,'(A)') LINE(ISTART:ISTOP) ENDDO WRITE(KUNIT,'(A)') ' ' ILEN = ILEN + NBLOCKS*BLOCKSIZE ENDDO CLOSE(KUNIT) RETURN END C END WRITE_MSF C...................................................................... C...................................................................... C SUB WRITE_PEARSON SUBROUTINE WRITE_PEARSON(KOUT,OUTFILE,SEQ,NBLOCKS,IDENTIFIER, 1 HEADERLINE,SEQSTART,SEQSTOP,ERROR) IMPLICIT NONE C IMPORT INTEGER KOUT INTEGER NBLOCKS INTEGER SEQSTART,SEQSTOP CHARACTER*(*) SEQ CHARACTER*(*) OUTFILE,HEADERLINE,IDENTIFIER C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE PARAMETER (BLOCKSIZE= 10) INTEGER ISTART, ISTOP, JSTART, JSTOP INTEGER BEGIN, END C INTEGER LENGTH CHARACTER*(250) OUTLINE LOGICAL NOCHAINBREAKS ERROR = .FALSE. C TRY TO OPEN OUTFILE; RETURN IF UNSUCCESSFUL CALL OPEN_FILE(KOUT,OUTFILE,'unknown,append',error) C ERROR MESSAGES ARE ALREDY ISSUED BY OPEN_FILE IF ( ERROR ) RETURN NOCHAINBREAKS = .FALSE. C LENGTH = SEQSTOP-SEQSTART+1 C headerline is a comment line: marked by '>' CALL STRPOS(IDENTIFIER,ISTART,ISTOP) CALL STRPOS(HEADERLINE,JSTART,JSTOP) OUTLINE = '>' // IDENTIFIER(MAX(ISTART,1):MAX(1,ISTOP)) // 1 ' , ' // 2 HEADERLINE(MAX(JSTART,1):MAX(1,JSTOP)) CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:ISTOP) C WRITE SEQUENCE BEGIN = SEQSTART C "repeat until" 1 CONTINUE C writeseqline returns end CALL WRITESEQLINE(SEQ,BEGIN,BLOCKSIZE,NBLOCKS,SEQSTOP, 1 NOCHAINBREAKS,OUTLINE,END,ERROR) IF ( ERROR ) STOP IF ( END .EQ. SEQSTOP ) THEN CALL STRPOS(OUTLINE,ISTART,ISTOP) C OUTLINE = OUTLINE(1:MAX(1,ISTOP)) // '*' OUTLINE = OUTLINE(1:MAX(1,ISTOP)) ENDIF CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) BEGIN = END + 1 IF ( BEGIN .LE. SEQSTOP ) GOTO 1 C END "REPEAT UNTIL" CLOSE(KOUT) RETURN END C END WRITE_PEARSON C...................................................................... C...................................................................... C SUB WRITE_PHYLIP SUBROUTINE WRITE_PHYLIP(KOUT,MAXALIGNS,MAXCORE,BEGIN, 1 END,NBLOCKS,ALISEQ,ALIPOINTER, 2 IFIR,ILAS,SEQNAMES,NSEQ,ERROR) IMPLICIT NONE C This routine writes an "sequential" phylip format, i.e. one sequence C from begin to end, then next one C IMPORT INTEGER KOUT INTEGER MAXALIGNS, MAXCORE INTEGER BEGIN, END, NBLOCKS INTEGER NSEQ INTEGER ALIPOINTER(MAXALIGNS) INTEGER IFIR(MAXALIGNS), ILAS(MAXALIGNS) CHARACTER ALISEQ(MAXCORE) CHARACTER*(*) SEQNAMES(MAXALIGNS) C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE INTEGER MAXRES_LOC INTEGER CODELEN_LOC INTEGER LINELEN PARAMETER (BLOCKSIZE= 10) PARAMETER (LINELEN= 250) PARAMETER (MAXRES_LOC= 9999) C PARAMETER (MAXRES_LOC= 10000) C PARAMETER (MAXRES_LOC= 19876) PARAMETER (CODELEN_LOC= 9) INTEGER IPOS,POS1,POS2,ISEQ,IOUTPOS INTEGER ISTART,ISTOP,ACTNBLOCKS C INTEGER I1START,I1STOP CHARACTER CGAPCHAR CHARACTER*(MAXRES_LOC) STRAND CHARACTER*(LINELEN) OUTLINE, TMPSTRING LOGICAL NOCHAINBREAKS CGAPCHAR = '-' NOCHAINBREAKS = .FALSE. IF ( LEN(SEQNAMES(1)) .LT. CODELEN_LOC ) THEN ERROR = .TRUE. WRITE(6,'(A)') ' CODELEN TOO SHORT IN WRITE_PHYLIP !' RETURN ENDIF C PHYLIP headerline: " nseq end-begin+1" WRITE(OUTLINE,'(I4,1X,I4)') NSEQ, END-BEGIN+1 CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:ISTOP) C Write alignment C provide one whole sequence in "strand" for w_scaleline to C transfer it DO ISEQ = 1,NSEQ IOUTPOS = 1 DO IPOS = BEGIN,END IF ( IPOS .LT. IFIR(ISEQ) .OR. 1 IPOS .GT. ILAS(ISEQ) ) THEN STRAND(IOUTPOS:IOUTPOS) = CGAPCHAR ELSE STRAND(IOUTPOS:IOUTPOS) = ALISEQ( 1 ALIPOINTER(ISEQ)+IPOS-IFIR(ISEQ) 2 ) ENDIF IOUTPOS = IOUTPOS + 1 ENDDO DO IPOS=1,IOUTPOS-1 IF ( STRAND(IPOS:IPOS) .EQ. '.') THEN STRAND(IPOS:IPOS)= CGAPCHAR ENDIF ENDDO C CALL CHARARRAYREPL(STRAND,IOUTPOS-1,'.',CGAPCHAR) POS1 = BEGIN C "REPEAT UNTIL" 1 CONTINUE IF ( POS1 .EQ. BEGIN ) THEN ACTNBLOCKS = NBLOCKS - 1 ELSE ACTNBLOCKS = NBLOCKS ENDIF C writeseqline returns pos2 CALL WRITESEQLINE(STRAND,POS1,BLOCKSIZE,ACTNBLOCKS,END, 1 NOCHAINBREAKS,TMPSTRING,POS2,ERROR) IF ( ERROR ) STOP CALL STRPOS(TMPSTRING,ISTART,ISTOP) IF ( POS1 .EQ. BEGIN ) THEN C SEQUENCE NAME APPEARS ONLY ONCE ( IN FIRST LINE ) OUTLINE = SEQNAMES(ISEQ)(1:CODELEN_LOC) // ' ' // 1 TMPSTRING(ISTART:ISTOP) ELSE OUTLINE = TMPSTRING(ISTART:ISTOP) ENDIF CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:ISTOP) POS1 = POS2 + 1 IF ( POS1 .LT. END ) GOTO 1 C END "REPEAT UNTIL" ENDDO RETURN END C END WRITE_PHYLIP C...................................................................... C...................................................................... C SUB WRITE_PIR SUBROUTINE WRITE_PIR(KOUT,SEQ,INFILE,OUTFILE,ACCESSION, 1 IDENTIFIER,NSYMBOLS,SEQSTART,SEQSTOP,ERROR) IMPLICIT NONE C IMPORT INTEGER KOUT INTEGER NSYMBOLS INTEGER SEQSTART, SEQSTOP CHARACTER*(*) SEQ CHARACTER*(*) INFILE, OUTFILE,ACCESSION, IDENTIFIER C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER ISTART, ISTOP INTEGER I1START, I1STOP, I2START, I2STOP, I3START, I3STOP INTEGER BEGIN, END C INTEGER LENGTH INTEGER MAX_LINE_LEN PARAMETER (MAX_LINE_LEN= 1000) CHARACTER*(MAX_LINE_LEN) OUTLINE LOGICAL NOCHAINBREAKS ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KOUT,OUTFILE,'unknown,append',error) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN NOCHAINBREAKS = .FALSE. c length = seqstop-seqstart+1 CALL STRPOS(ACCESSION,ISTART,ISTOP) IF (ISTART.LT.1 .OR. ISTOP.LT.1 .OR. (ISTOP-ISTART).GT.10) THEN ACCESSION=' ' END IF WRITE(OUTLINE,'(A,A)') 1 '>P1; ',ACCESSION(MAX(1,ISTART):MAX(1,ISTOP)) C WRITE(OUTLINE,'(A,A)') C 1 '>',ACCESSION(MAX(1,ISTART):MAX(1,ISTOP)) CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) c call strpos(infile,i1start,i1stop) c call strpos(outfile,i2start,i2stop) c write(outline,'(a,1x,a,1x,a,1x,a,i4,1x,a,i4,1x,a)') c 1 outfile(i2start:i2stop),'(', infile(i1start:i1stop), c 2 'from: ',seqstart,'to: ', seqstop,')' CALL STRPOS(IDENTIFIER,ISTART,ISTOP) ISTOP=MIN(ISTOP,MAX_LINE_LEN) WRITE(OUTLINE,'(A)') 1 IDENTIFIER(MAX(1,ISTART):MAX(1,ISTOP)) CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) BEGIN = SEQSTART C "REPEAT UNTIL" 1 CONTINUE C WRITESEQLINE RETURNS END CALL WRITESEQLINE(SEQ,BEGIN,1,NSYMBOLS,SEQSTOP, 1 NOCHAINBREAKS,OUTLINE,END,ERROR) IF ( ERROR ) STOP IF ( END .EQ. SEQSTOP ) THEN CALL STRPOS(OUTLINE,ISTART,ISTOP) OUTLINE = OUTLINE(1:MAX(1,ISTOP)) // ' *' ENDIF CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) BEGIN = END + 1 IF ( BEGIN .LE. SEQSTOP ) GOTO 1 C END "REPEAT UNTIL" CLOSE(KOUT) RETURN END C END WRITE_PIR C...................................................................... C...................................................................... C SUB WRITE_STAR SUBROUTINE WRITE_STAR(KOUT,SEQ,NBLOCKS,INFILE,OUTFILE, 1 HEADERLINE,SEQSTART,SEQSTOP,ERROR) IMPLICIT NONE C IMPORT INTEGER KOUT INTEGER NBLOCKS INTEGER SEQSTART,SEQSTOP CHARACTER*(*) SEQ CHARACTER*(*) INFILE,OUTFILE,HEADERLINE C EXPORT C ( OUTPUT TO UNIT KOUT ) LOGICAL ERROR C INTERNAL INTEGER BLOCKSIZE PARAMETER (BLOCKSIZE= 10) INTEGER ISTART, ISTOP INTEGER BEGIN, END C INTEGER LENGTH CHARACTER*(250) OUTLINE LOGICAL NOCHAINBREAKS ERROR = .FALSE. C try to open outfile; return if unsuccessful CALL OPEN_FILE(KOUT,OUTFILE,'NEW',ERROR) C error messages are alredy issued by OPEN_FILE IF ( ERROR ) RETURN NOCHAINBREAKS = .FALSE. c length = seqstop-seqstart+1 C begin and end CALL STRPOS(INFILE,ISTART,ISTOP) WRITE(OUTLINE,'(A,A,1X,A,I4,1X,A,I4)') 1 '* ',infile(istart:istop),'from: ', 2 seqstart,'to: ',seqstop CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(MAX(ISTART,1):MAX(1,ISTOP)) C headerline is a comment line: marked by '*' CALL STRPOS(HEADERLINE,ISTART,ISTOP) OUTLINE = '* ' // HEADERLINE(MAX(ISTART,1):MAX(1,ISTOP)) WRITE(KOUT,'(A)') OUTLINE(1:ISTOP+2) C write sequence BEGIN = SEQSTART C "repeat until" 1 CONTINUE C writeseqline returns end CALL WRITESEQLINE(SEQ,BEGIN,BLOCKSIZE,NBLOCKS,SEQSTOP, 1 NOCHAINBREAKS,OUTLINE,END,ERROR) IF ( ERROR ) STOP CALL STRPOS(OUTLINE,ISTART,ISTOP) WRITE(KOUT,'(A)') OUTLINE(1:MAX(1,ISTOP)) BEGIN = END + 1 IF ( BEGIN .LE. SEQSTOP ) GOTO 1 C END "REPEAT UNTIL" CLOSE(KOUT) RETURN END C END WRITE_STAR C...................................................................... C...................................................................... C SUB WRITELINES SUBROUTINE WRITELINES(CSTRING) C if 'cstring' contains '/n' (new line) this routine writes cstring C line by line on screen; called by GETINT,GETREAL..... CHARACTER*(*) CSTRING INTEGER ICUTBEGIN(30),ICUTEND(30) CALL STRPOS(CSTRING,ISTART,ISTOP) ILINE=1 ICUTBEGIN(ILINE)=1 ICUTEND(ILINE)=ISTOP DO I=1,ISTOP-1 IF (CSTRING(I:I+1).EQ.'/n') THEN ILINE=ILINE+1 ICUTBEGIN(ILINE)=I+2 ICUTEND(ILINE-1)=I-1 ICUTEND(ILINE)=ISTOP ENDIF ENDDO DO I=1,ILINE WRITE(6,*)CSTRING(ICUTBEGIN(I):ICUTEND(I)) ENDDO RETURN END C END WRITELINES C...................................................................... C...................................................................... C SUB WRITEPROFILE SUBROUTINE WRITEPROFILE(KPROF,PROFILENAME,MAXRES, + NRES,NCHAIN,HSSPID,HEADER,COMPOUND,SOURCE,AUTHOR, + SMIN,SMAX,MAPLOW,MAPHIGH,METRICFILE, + PDBNO,CHAINID,SEQ,STRUC,ACC,COLS,SHEETLABEL, + BP1,BP2,NOCC,GAPOPEN,GAPELONG,CONSWEIGHT, + PROFILEMETRIC,MAXBOX,NBOX,PROFILEBOX,LDSSP) IMPLICIT NONE INTEGER nacid PARAMETER (nacid=20) INTEGER kprof,maxres,nres,acc(*),bp1(*),bp2(*),nocc(*) INTEGER NCHAIN,pdbno(*) INTEGER MAXBOX,NBOX,PROFILEBOX(MAXBOX,2) REAL profilemetric(maxres,*),gapopen(*),gapelong(*) REAL consweight(*) REAL smin,smax,maplow,maphigh CHARACTER*(*) hsspid,header,compound,source,author,metricfile CHARACTER*(*) profilename,seq(*),struc(*) CHARACTER*(*) chainid(*) character*7 cols(*) character sheetlabel(*) LOGICAL LDSSP C internal CHARACTER*500 line INTEGER ilen,i,j,ibox,istart,istop LOGICAL lerror C====================================================================== CALL OPEN_FILE(KPROF,PROFILENAME,'NEW,RECL=350',LERROR) IF (LDSSP) THEN WRITE(KPROF,'(A)') + '****** MAXHOM-PROFILE WITH SECONDARY-STRUCTURE V1.0 ******' ELSE WRITE(KPROF,'(A)')'****** MAXHOM-PROFILE V1.0 ******' ENDIF WRITE(KPROF,'(A)')'# ' CALL STRPOS(HSSPID,I,J) WRITE(KPROF,'(A,A)') 'ID : ',HSSPID(I:J) CALL STRPOS(HEADER,I,J) IF (I .GT. 0 .AND. J .GT. 0) THEN WRITE(KPROF,'(A,A)') 'HEADER : ',HEADER(I:J) ELSE WRITE(KPROF,'(A)') 'HEADER : ' ENDIF CALL STRPOS(COMPOUND,I,J) IF (I .GT. 0 .AND. J .GT. 0) THEN WRITE(KPROF,'(A,A)') 'COMPOUND : ',COMPOUND(I:J) ELSE WRITE(KPROF,'(A)') 'COMPOUND : ' ENDIF CALL STRPOS(SOURCE,I,J) IF (I .GT. 0 .AND. J .GT. 0) THEN WRITE(KPROF,'(A,A)') 'SOURCE : ',SOURCE(I:J) ELSE WRITE(KPROF,'(A)') 'SOURCE : ' ENDIF CALL STRPOS(AUTHOR,I,J) IF (I .GT. 0 .AND. J .GT. 0) THEN WRITE(KPROF,'(A,A)') 'AUTHOR : ',AUTHOR(I:J) ELSE WRITE(KPROF,'(A)') 'AUTHOR : ' ENDIF WRITE(KPROF,'(A,I4)') 'NRES : ',NRES WRITE(KPROF,'(A,I4)') 'NCHAIN : ',NCHAIN WRITE(KPROF,'(A,F7.2)')'SMIN : ',SMIN WRITE(KPROF,'(A,F7.2)')'SMAX : ',SMAX WRITE(KPROF,'(A,F7.2)')'MAPLOW : ',MAPLOW WRITE(KPROF,'(A,F7.2)')'MAPHIGH : ',MAPHIGH CALL STRPOS(METRICFILE,I,J) IF (I .GT. 0 .AND. J .GT. 0) THEN WRITE(KPROF,'(A,A)')'METRIC : ',METRICFILE(I:J) ELSE WRITE(KPROF,'(A)')'METRIC : ' ENDIF IF (NBOX.GT.1) THEN WRITE(KPROF,'(A,I6)')'NBOX : ',NBOX DO IBOX=1,NBOX WRITE(KPROF,'(A,I4,A,I4,A,I4)')'BOX',IBOX,' : ', + PROFILEBOX(IBOX,1),'-',PROFILEBOX(IBOX,2) ENDDO ENDIF write(kprof,'(a)')'#========================================='// + '======================================================='// + '======================================================='// + '===================================================' CSeqNo PDBNo AA STRUCTURE BP1 BP2 ACC NOCC open elong V L ... C line=' SeqNo PDBNo AA STRUCTURE BP1 BP2 ACC NOCC '// + 'OPEN ELONG WEIGHT '// + 'V L I M F W Y '// + 'G A P S T C H '// + 'R K Q E N D' CALL STRPOS(LINE,ISTART,ISTOP) WRITE(KPROF,'(A)')LINE(:ISTOP) DO I=1,NRES IF (I.GT.MAXRES) THEN WRITE(6,*)' *** ERROR IN WRITEPROFILE: NRES.GT.MAXRES' STOP ENDIF IF (STRUC(I).EQ.'U')STRUC(I)=' ' WRITE(LINE,100)I,PDBNO(I),CHAINID(I),SEQ(I),STRUC(I), + COLS(I),BP1(I),BP2(I),SHEETLABEL(I),ACC(I),NOCC(I) IF (PDBNO(I).EQ.0)LINE(7:11)=' ' CALL STRPOS(LINE,ISTART,ISTOP) WRITE(LINE(ISTOP+1:),'(2(F6.2),F7.2,20(F8.3))') + GAPOPEN(I),GAPELONG(I),CONSWEIGHT(I), + (PROFILEMETRIC(I,J),J=1,NACID) CALL STRPOS(LINE,ISTART,ISTOP) WRITE(KPROF,'(A)')LINE(:ISTOP) 100 FORMAT(2X,2(I4,1X),A1,1X,A1,2X,A1,1X,A7,2(I4),A1,2(I4,1X)) ENDDO WRITE(KPROF,'(A)')'//' CLOSE(KPROF) RETURN END C END WRITEPROFILE C...................................................................... C...................................................................... C SUB WRITESCALELINE SUBROUTINE WRITESCALELINE(ISTART,ISTOP,LABEL1,LABEL2,OUTLINE) IMPLICIT NONE C 4.11.93 C ISTART: POSITION AFTER WHICH TO PLACE LABEL1 C ISTOP : POSITION AT WHICH LABEL2 SHOULD END C IMPORT INTEGER ISTART,ISTOP,LABEL1,LABEL2 C EXPORT CHARACTER*(*) OUTLINE C INTERNAL INTEGER LABELLEN PARAMETER (LABELLEN= 4) CHARACTER*16 FORM CHARACTER*(LABELLEN) CTMP *----------------------------------------------------------------------* C PREPARE LABEL OUTPUT FORMAT CTMP=' ' WRITE(CTMP,'(I2)') LABELLEN CALL LEFTADJUST(CTMP,1,LABELLEN) FORM = '( I4' // ')' C BUILD UP OUTLINE OUTLINE = ' ' WRITE(CTMP,FORM) LABEL1 CALL LEFTADJUST(CTMP,1,LABELLEN) OUTLINE = OUTLINE(1:ISTART-1) // CTMP WRITE(CTMP,FORM) LABEL2 CALL RIGHTADJUST(CTMP,1,LABELLEN) OUTLINE = OUTLINE(1:ISTOP-LABELLEN) // CTMP RETURN END C END WRITESCALELINE C...................................................................... C...................................................................... C SUB WRITESEQLINE SUBROUTINE WRITESEQLINE(SEQ,ISTART,BLOCKSIZE,NBLOCKS,NRES, 1 NOCHAINBREAKS,OUTLINE,ISTOP,ERROR) IMPLICIT NONE C 4.11.93 C CCCCCCCCCC CCCCCCCCCC CCCCCCCCCC CCCCCCCCCC C ^ ^ ^ C istart: blocksize: istop: C first 10 here last C seq.pos. seq.pos. C to be transferred C transferred C C C nblocks: 4 here C line starts with 1 blank C istart is given, istop is returned ( if ( nochainbreaks ) maybe C some symbols are not transferred )) C C IMPORT INTEGER ISTART, ISTOP INTEGER BLOCKSIZE INTEGER NBLOCKS, NRES CHARACTER*(*) SEQ LOGICAL NOCHAINBREAKS C EXPORT CHARACTER*(*) OUTLINE LOGICAL ERROR C INTERNAL INTEGER ISEQPOS, ILINEPOS,IBLOCKPOS,IBLOCK ERROR = .FALSE. OUTLINE = ' ' ILINEPOS = 1 IBLOCKPOS = 0 IBLOCK = 1 ISEQPOS = ISTART - 1 DO WHILE ( ILINEPOS .LT. NBLOCKS*BLOCKSIZE+NBLOCKS .AND. 1 ISEQPOS .LT. NRES ) ISEQPOS = ISEQPOS + 1 IF ( IBLOCK .LT. NBLOCKS .AND. 1 IBLOCKPOS .EQ. BLOCKSIZE ) THEN IBLOCKPOS = 0 IBLOCK = IBLOCK + 1 ILINEPOS = ILINEPOS + 1 OUTLINE(ILINEPOS:ILINEPOS) = ' ' ENDIF IF ( .NOT. NOCHAINBREAKS .OR. 1 ( NOCHAINBREAKS .AND. SEQ(ISEQPOS:ISEQPOS) .NE. '!' ) 2 ) THEN ILINEPOS = ILINEPOS + 1 IBLOCKPOS = IBLOCKPOS + 1 OUTLINE(ILINEPOS:ILINEPOS) = SEQ(ISEQPOS:ISEQPOS) ENDIF ENDDO ISTOP = ISEQPOS RETURN END C END WRITESEQLINE C...................................................................... C...................................................................... C SUB U3B SUBROUTINE U3B(W,X,Y,N,MODE,RMS,U,T,IER) C this version copied July 1986. DO NOT REDISTRIBUTE. C If you want this routine, ask Wolfgang Kabsch C**** CALCULATES A BEST ROTATION & TRANSLATION BETWEEN TWO VECTOR SETS C**** SUCH THAT U*X+T IS THE CLOSEST APPROXIMATION TO Y. C**** THE CALCULATED BEST SUPERPOSITION MAY NOT BE UNIQUE AS INDICATED C**** BY A RESULT VALUE IER=-1. HOWEVER IT IS GARANTIED THAT WITHIN C**** NUMERICAL TOLERANCES NO OTHER SUPERPOSITION EXISTS GIVING A C**** SMALLER VALUE FOR RMS. C**** THIS VERSION OF THE ALGORITHM IS OPTIMIZED FOR THREE-DIMENSIONAL C**** REAL VECTOR SPACE. C**** USE OF THIS ROUTINE IS RESTRICTED TO NON-PROFIT ACADEMIC C**** APPLICATIONS. C**** PLEASE REPORT ERRORS TO C**** PROGRAMMER: W.KABSCH MAX-PLANCK-INSTITUTE FOR MEDICAL RESEARCH C JAHNSTRASSE 29, 6900 HEIDELBERG, FRG. C**** REFERENCES: W.KABSCH ACTA CRYST.(1978).A34,827-828 C W.KABSCH ACTA CRYST.(1976).A32,922-923 C C W - W(M) IS WEIGHT FOR ATOM PAIR # M (GIVEN) C X - X(I,M) ARE COORDINATES OF ATOM # M IN SET X (GIVEN) C Y - Y(I,M) ARE COORDINATES OF ATOM # M IN SET Y (GIVEN) C N - N IS NUMBER OF ATOM PAIRS (GIVEN) C MODE - 0:CALCULATE RMS ONLY (GIVEN) C 1:CALCULATE RMS,U,T (TAKES LONGER) C RMS - SUM OF W*(UX+T-Y)**2 OVER ALL ATOM PAIRS (RESULT) C U - U(I,J) IS ROTATION MATRIX FOR BEST SUPERPOSITION (RESULT) C T - T(I) IS TRANSLATION VECTOR FOR BEST SUPERPOSITION (RESULT) C IER - 0: A UNIQUE OPTIMAL SUPERPOSITION HAS BEEN DETERMINED(RESULT) C -1: SUPERPOSITION IS NOT UNIQUE BUT OPTIMAL C -2: NO RESULT OBTAINED BECAUSE OF NEGATIVE WEIGHTS W C OR ALL WEIGHTS EQUAL TO ZERO. C C----------------------------------------------------------------------- INTEGER IP(9),IP2312(4),I,J,K,L,M1,M,IER,N,MODE REAL W(*),X(3,*),Y(3,*),U(3,*),T(*),RMS,SIGMA c REAL*16 R(3,3),XC(3),YC(3),WC,A(3,3),B(3,3),E0, c 1 E(3),E1,E2,E3,D,H,G,SPUR,DET,COF,CTH,STH,SQRTH,P,TOL, c 2 RR(6),RR1,RR2,RR3,RR4,RR5,RR6,SS(6),SS1,SS2,SS3,SS4,SS5,SS6, c 3 ZERO,ONE,TWO,THREE,SQRT3 C most UNIX machines know only real*8 C on VAX compile it with /G_Floating DOUBLE PRECISION R(3,3),XC(3),YC(3),WC,A(3,3),B(3,3),E0, 1 E(3),E1,E2,E3,D,H,G,SPUR,DET,COF,CTH,STH,SQRTH,P,TOL, 2 RR(6),RR1,RR2,RR3,RR4,RR5,RR6,SS(6),SS1,SS2,SS3,SS4,SS5,SS6, 3 ZERO,ONE,TWO,THREE,SQRT3 EQUIVALENCE (RR1,RR(1)),(RR2,RR(2)),(RR3,RR(3)), 1 (RR4,RR(4)),(RR5,RR(5)),(RR6,RR(6)), 2 (SS1,SS(1)),(SS2,SS(2)),(SS3,SS(3)), 3 (SS4,SS(4)),(SS5,SS(5)),(SS6,SS(6)), 4 (E1,E(1)),(E2,E(2)),(E3,E(3)) DATA SQRT3,TOL/1.73205080756888D+00, 1.0D-2/ DATA ZERO,ONE,TWO,THREE/0.0D+00, 1.0D+00, 2.0D+00, 3.0D+00/ DATA IP/1,2,4, 2,3,5, 4,5,6/ DATA IP2312/2,3,1,2/ WC=ZERO RMS=0.0 E0=ZERO DO 1 I=1,3 XC(I)=ZERO YC(I)=ZERO T(I)=0.0 DO 1 J=1,3 D=ZERO IF (I.EQ.J)D=ONE U(I,J)=real(D) A(I,J)=D 1 R(I,J)=ZERO IER=-1 IF (N.LT.1)RETURN C**** DETERMINE CENTROIDS OF BOTH VECTOR SETS X AND Y IER=-2 DO 2 M=1,N IF (W(M).LT.0.0)RETURN WC=WC+W(M) DO 2 I=1,3 XC(I)=XC(I)+W(M)*X(I,M) 2 YC(I)=YC(I)+W(M)*Y(I,M) IF (WC.LE.ZERO)RETURN DO 3 I=1,3 XC(I)=XC(I)/WC 3 YC(I)=YC(I)/WC C**** DETERMINE CORRELATION MATRIX R BETWEEN VECTOR SETS Y AND X DO 4 M=1,N DO 4 I=1,3 E0=E0+W(M)*((X(I,M)-XC(I))**2+(Y(I,M)-YC(I))**2) D=W(M)*(Y(I,M)-YC(I)) DO 4 J=1,3 4 R(I,J)=R(I,J)+D*(X(J,M)-XC(J)) C**** CALCULATE DETERMINANT OF R(I,J) DET=R(1,1)*(R(2,2)*R(3,3)-R(2,3)*R(3,2)) 1 -R(1,2)*(R(2,1)*R(3,3)-R(2,3)*R(3,1)) 2 +R(1,3)*(R(2,1)*R(3,2)-R(2,2)*R(3,1)) SIGMA=real(DET) C**** FORM UPPER TRIANGLE OF TRANSPOSED(R)*R M=0 DO 5 J=1,3 DO 5 I=1,J M=M+1 5 RR(M)=R(1,I)*R(1,J)+R(2,I)*R(2,J)+R(3,I)*R(3,J) C***************** EIGENVALUES ***************************************** C**** FORM CHARACTERISTIC CUBIC X**3-3*SPUR*X**2+3*COF*X-DET=0 SPUR=(RR1+RR3+RR6)/THREE COF=(RR3*RR6-RR5*RR5+RR1*RR6-RR4*RR4+RR1*RR3-RR2*RR2)/THREE DET=DET*DET DO 6 I=1,3 6 E(I)=SPUR IF (SPUR.LE.ZERO)GOTO 40 C**** REDUCE CUBIC TO STANDARD FORM Y**3-3HY+2G=0 BY PUTTING X=Y+SPUR D=SPUR*SPUR H=D-COF G=(SPUR*COF-DET)/TWO-SPUR*H C**** SOLVE CUBIC. ROOTS ARE E1,E2,E3 IN DECREASING ORDER IF (H.LE.ZERO)GOTO 8 SQRTH=DSQRT(H) c SQRTH=QSQRT(H) D=H*H*H-G*G IF (D.LT.ZERO)D=ZERO D=DATAN2(DSQRT(D),-G)/THREE CTH=SQRTH*DCOS(D) STH=SQRTH*SQRT3*DSIN(D) c D=QATAN2(QSQRT(D),-G)/THREE c CTH=SQRTH*QCOS(D) c STH=SQRTH*SQRT3*QSIN(D) E1=SPUR+CTH+CTH E2=SPUR-CTH+STH E3=SPUR-CTH-STH IF (MODE)10,50,10 C HANDLE SPECIAL CASE OF 3 IDENTICAL ROOTS 8 IF (MODE)30,50,30 C**************** EIGENVECTORS ***************************************** 10 DO 15 L=1,3,2 D=E(L) SS1=(D-RR3)*(D-RR6)-RR5*RR5 SS2=(D-RR6)*RR2+RR4*RR5 SS3=(D-RR1)*(D-RR6)-RR4*RR4 SS4=(D-RR3)*RR4+RR2*RR5 SS5=(D-RR1)*RR5+RR2*RR4 SS6=(D-RR1)*(D-RR3)-RR2*RR2 J=1 IF (DABS(SS1).GE.DABS(SS3))GOTO 12 c IF (QABS(SS1).GE.QABS(SS3))GOTO 12 J=2 IF (DABS(SS3).GE.DABS(SS6))GOTO 13 c IF (QABS(SS3).GE.QABS(SS6))GOTO 13 11 J=3 GOTO 13 12 IF (DABS(SS1).LT.DABS(SS6))GOTO 11 c12 IF (QABS(SS1).LT.QABS(SS6))GOTO 11 13 D=ZERO J=3*(J-1) DO 14 I=1,3 K=IP(I+J) A(I,L)=SS(K) 14 D=D+SS(K)*SS(K) IF (D.GT.ZERO)D=ONE/DSQRT(D) c IF (D.GT.ZERO)D=ONE/QSQRT(D) DO 15 I=1,3 15 A(I,L)=A(I,L)*D D=A(1,1)*A(1,3)+A(2,1)*A(2,3)+A(3,1)*A(3,3) M1=3 M=1 IF ((E1-E2).GT.(E2-E3))GOTO 16 M1=1 M=3 16 P=ZERO DO 17 I=1,3 A(I,M1)=A(I,M1)-D*A(I,M) 17 P=P+A(I,M1)**2 IF (P.LE.TOL)GOTO 19 P=ONE/DSQRT(P) c P=ONE/QSQRT(P) DO 18 I=1,3 18 A(I,M1)=A(I,M1)*P GOTO 21 19 P=ONE DO 20 I=1,3 IF (P.LT.DABS(A(I,M)))GOTO 20 P=DABS(A(I,M)) c IF (P.LT.QABS(A(I,M)))GOTO 20 c P=QABS(A(I,M)) J=I 20 CONTINUE K=IP2312(J) L=IP2312(J+1) P=DSQRT(A(K,M)**2+A(L,M)**2) c P=QSQRT(A(K,M)**2+A(L,M)**2) IF (P.LE.TOL)GOTO 40 A(J,M1)=ZERO A(K,M1)=-A(L,M)/P A(L,M1)=A(K,M)/P 21 A(1,2)=A(2,3)*A(3,1)-A(2,1)*A(3,3) A(2,2)=A(3,3)*A(1,1)-A(3,1)*A(1,3) A(3,2)=A(1,3)*A(2,1)-A(1,1)*A(2,3) C****************** ROTATION MATRIX ************************************ 30 DO 32 L=1,2 D=ZERO DO 31 I=1,3 B(I,L)=R(I,1)*A(1,L)+R(I,2)*A(2,L)+R(I,3)*A(3,L) 31 D=D+B(I,L)**2 IF (D.GT.ZERO)D=ONE/DSQRT(D) c IF (D.GT.ZERO)D=ONE/QSQRT(D) DO 32 I=1,3 32 B(I,L)=B(I,L)*D D=B(1,1)*B(1,2)+B(2,1)*B(2,2)+B(3,1)*B(3,2) P=ZERO DO 33 I=1,3 B(I,2)=B(I,2)-D*B(I,1) 33 P=P+B(I,2)**2 IF (P.LE.TOL)GOTO 35 P=ONE/DSQRT(P) c P=ONE/QSQRT(P) DO 34 I=1,3 34 B(I,2)=B(I,2)*P GOTO 37 35 P=ONE DO 36 I=1,3 IF (P.LT.DABS(B(I,1)))GOTO 36 P=DABS(B(I,1)) c IF (P.LT.QABS(B(I,1)))GOTO 36 c P=QABS(B(I,1)) J=I 36 CONTINUE K=IP2312(J) L=IP2312(J+1) P=DSQRT(B(K,1)**2+B(L,1)**2) c P=QSQRT(B(K,1)**2+B(L,1)**2) IF (P.LE.TOL)GOTO 40 B(J,2)=ZERO B(K,2)=-B(L,1)/P B(L,2)= B(K,1)/P 37 B(1,3)=B(2,1)*B(3,2)-B(2,2)*B(3,1) B(2,3)=B(3,1)*B(1,2)-B(3,2)*B(1,1) B(3,3)=B(1,1)*B(2,2)-B(1,2)*B(2,1) DO 39 I=1,3 DO 39 J=1,3 39 U(I,J)=real( B(I,1)*A(J,1)+B(I,2)*A(J,2)+B(I,3)*A(J,3) ) C****************** TRANSLATION VECTOR ********************************* 40 DO 41 I=1,3 41 T(I)=real ( YC(I)-U(I,1)*XC(1)-U(I,2)*XC(2)-U(I,3)*XC(3) ) C********************** RMS ERROR ************************************** 50 DO 51 I=1,3 IF (E(I).LT.ZERO)E(I)=ZERO 51 E(I)=DSQRT(E(I)) c51 E(I)=QSQRT(E(I)) IER=0 IF (E2.LE.(E1*1.0D-05))IER=-1 D=E3 IF (SIGMA.GE.0.0)GOTO 52 D=-D IF ((E2-E3).LE.(E1*1.0D-05))IER=-1 52 D=D+E2+E1 RMS=real( E0-D-D ) IF (RMS.LT.0.0)RMS=0.0 C next line added June 1989 by Georg Tuparev RMS=SQRT(RMS/N) RETURN END C END U3B C...................................................................... C...................................................................... C SUB UNTAB SUBROUTINE UNTAB(STRING) C removes 'tabs' from a string PARAMETER (LINESIZE= 300) CHARACTER STRING*(*) CHARACTER TEMPLINE*(LINESIZE) INTEGER LENGTH,I,J,TABSIZE *----------------------------------------------------------------------* TABSIZE=8 J=0 I=1 LENGTH=LEN(STRING) IF (LENGTH .GT. LINESIZE) THEN WRITE(6,*)'*** UNTAB: string truncated' LENGTH=LINESIZE ENDIF DO WHILE(I .LE. LENGTH) J=J+1 IF (J .LE. LINESIZE) THEN IF (STRING(I:I) .NE. CHAR(9) ) THEN TEMPLINE(J:J)=STRING(I:I) ELSE TEMPLINE(J:J)=' ' DO WHILE( MOD(J,TABSIZE) .NE. 0) J=J+1 IF (J .LE. LINESIZE)TEMPLINE(J:J)=' ' ENDDO ENDIF ENDIF I=I+1 ENDDO STRING(1:LENGTH)=TEMPLINE(1:LENGTH) RETURN END C END UNTAB C...................................................................... C...................................................................... C SUB UPTOLOW SUBROUTINE UPTOLOW(STRING,LENGTH) CHARACTER*(*) STRING INTEGER LENGTH cx CHARACTER UPPER*26, LOWER*26, STRING*(*) cx CHARACTER UPPER*26, LOWER*26, STRING*(*) cx DATA UPPER/'ABCDEFGHIJKLMNOPQRSTUVWXYZ'/ cx DATA LOWER/'abcdefghijklmnopqrstuvwxyz'/ DO I=1,LENGTH IF (STRING(I:I).GE.'A' .AND. STRING(I:I).LE.'Z') THEN STRING(I:I)=CHAR( ICHAR(STRING(I:I))+32 ) C X K=INDEX(UPPER,STRING(I:I)) C X IF (K.NE.0) STRING(I:I)=LOWER(K:K) ENDIF ENDDO RETURN END C END UPTOLOW C...................................................................... C...................................................................... C SUB MP_INIT_FARM C init a farmer worker model C VAX/VMS dummy version ; does nothing ; just init the stuff SUBROUTINE MP_INIT_FARM() IMPLICIT NONE C import INTEGER MAXPROC CHARACTER*200 HOST_FILE,HOST_NAME,NODE_NAME C export INTEGER IDPROC,NWORKER,NP,NWORKSET, + IDTOP,LINK(1:100),ID_HOST, + LINK_HOST,LINK_NODE_SENDER,LINK_NODE_RECEIVER, + SENDER_NODE(1:100),RECEIVER_NODE(1:100), + WORKSETSIZE(1:100),WORKSETBEG(1:100),WORKSETEND(1:100) CHARACTER*20 MP_MODEL LOGICAL LMIXED_ARCH C init MP_MODEL='NIX' ID_HOST=0 IDPROC=0 IDTOP=0 LINK_HOST=0 LINK_NODE_SENDER=0 LINK_NODE_RECEIVER=0 NWORKER=0 NWORKSET=0 LINK(1) = 0 LMIXED_ARCH=.FALSE. RETURN END C END MP_INIT_FARM C C...................................................................... C SUB MP_INIT_NODE SUBROUTINE MP_INIT_NODE(NODE_NAME,IDPROC) CHARACTER*(*) NODE_NAME INTEGER IDPROC RETURN END C end mp_init_node C...................................................................... C...................................................................... C sub mp_getmyid C get ID of process C VAX/VMS dummy version ; return id=0 SUBROUTINE MP_GETMYID(ID) INTEGER ID ID=0 RETURN END C END MP_GETMYID C...................................................................... C...................................................................... C SUB MP_NPROCS C get number of processors C VAX/VMS dummy version ; nprocessor=1 SUBROUTINE MP_NPROCS(NPROCESSOR) INTEGER NPROCESSOR NPROCESSOR=1 RETURN END C END MP_NPROCS C...................................................................... C...................................................................... C SUB MP_SELECT C is there somewhere a message for me ? C VAX/VMS dummy version ; SUBROUTINE MP_SELECT(MSGTYPE,WORKSETBEG,WORKSETEND,LINK,IFLAG) C import INTEGER MSGTYPE,WORKSETBEG,WORKSETEND,LINK(*) C export INTEGER IFLAG IFLAG=0 RETURN END C END MP_SELECT C...................................................................... C...................................................................... C SUB MP_SELECT_SUBSET C is there somewhere a message for me ? C VAX/VMS dummy version ; SUBROUTINE MP_SELECT_SUBSET(MSGTYPE,NWORKSET,SENDER_NODE, + LINK,IFLAG) C import INTEGER MSGTYPE,NWORKSET,SENDER_NODE(*),LINK(*) INTEGER IFLAG IFLAG=0 RETURN END C end mp_select_subset C...................................................................... C...................................................................... C sub mp_init_send C dummy version SUBROUTINE MP_INIT_SEND() RETURN END C end mp_init_send C...................................................................... C...................................................................... C sub mp_send_data C dummy version SUBROUTINE MP_SEND_DATA(MSGTYPE,RECEIVER_NAME) C input INTEGER MSGTYPE,LINK CHARACTER*(*) RECEIVER_NAME RETURN END C end mp_send_data C...................................................................... C...................................................................... C sub mp_init_receive C dummy version SUBROUTINE MP_INIT_RECEIVE(MSGTYPE) INTEGER MSGTYPE RETURN END C end mp_init_receive C...................................................................... C...................................................................... C sub mp_receive_data C dummy version SUBROUTINE MP_RECEIVE_DATA(MSGTYPE,LINK) C input INTEGER MSGTYPE C output INTEGER LINK RETURN END C end mp_receive_data C...................................................................... C...................................................................... C sub mp_put_int4 C VAX/VMS dummy version ; does nothing SUBROUTINE MP_PUT_INT4(IDTOP,LINK,DATA,NBYTE) INTEGER IDTOP,LINK,DATA,NBYTE RETURN END C end mp_put_int4 C...................................................................... C...................................................................... C sub mp_get_int4 C VAX/VMS dummy version ; does nothing SUBROUTINE MP_GET_INT4(IDTOP,LINK,DATA,NBYTE) INTEGER IDTOP,LINK,NBYTE,DATA RETURN END C end mp_get_int4 C...................................................................... C...................................................................... C sub mp_put_real4 C VAX/VMS dummy version ; does nothing SUBROUTINE MP_PUT_REAL4(IDTOP,LINK,DATA,NBYTE) INTEGER IDTOP,LINK,NBYTE REAL DATA RETURN END C end mp_put_real4 C...................................................................... C...................................................................... C sub mp_get_real4 C VAX/VMS dummy version ; does nothing SUBROUTINE MP_GET_REAL4(IDTOP,LINK,DATA,NBYTE) INTEGER IDTOP,LINK,NBYTE REAL DATA RETURN END C end mp_get_real4 C...................................................................... C...................................................................... C sub mp_put_string_array C dummy version SUBROUTINE MP_PUT_STRING_ARRAY(IDTOP,LINK,DATA,NDIM) INTEGER IDTOP,LINK,NDIM,INFO CHARACTER*(*) DATA(NDIM) RETURN END C end mp_put_string_array C...................................................................... C...................................................................... C sub mp_put_string C dummy version SUBROUTINE MP_PUT_STRING(IDTOP,LINK,DATA,ILEN) INTEGER IDTOP,LINK,ILEN,INFO CHARACTER*(*) DATA RETURN END C end mp_put_string C...................................................................... C...................................................................... C sub mp_get_string_array C dummy version SUBROUTINE MP_GET_STRING_ARRAY(IDTOP,LINK,DATA,NDIM) INTEGER IDTOP,LINK,NDIM,INFO CHARACTER*(*) DATA(NDIM) RETURN END C end mp_get_string_array C...................................................................... C...................................................................... C sub mp_get_string C dummy version SUBROUTINE MP_GET_STRING(IDTOP,LINK,DATA,ILEN) INTEGER IDTOP,LINK,ILEN,INFO CHARACTER*(*) DATA RETURN END C end mp_get_string C...................................................................... C...................................................................... C SUB MP_LEAVE SUBROUTINE MP_LEAVE() RETURN END C END MP_LEAVE C...................................................................... C...................................................................... C sub mp_probe C is there somewhere a message for me ? C if not, return C PVM version SUBROUTINE MP_PROBE(MSGTYPE,IFLAG) C import INTEGER MSGTYPE C export INTEGER IFLAG IFLAG=0 RETURN END C end mp_probe C...................................................................... C...................................................................... C sub mp_get_int4 C PVM version SUBROUTINE MP_GET_INT4_ARRAY(IDTOP,LINK,DATA,NDATA) INTEGER IDTOP,LINK,NDATA,DATA(*) INTEGER INFO RETURN END C end mp_get_int4 C...................................................................... C...................................................................... C sub mp_put_int4 C PVM version SUBROUTINE MP_PUT_INT4_ARRAY(IDTOP,LINK,DATA,NDATA) INTEGER IDTOP,LINK,DATA(*),NDATA INTEGER INFO RETURN END C end mp_put_int4 C...................................................................... C...................................................................... C sub mp_get_real4 C PVM version SUBROUTINE MP_GET_REAL4_ARRAY(IDTOP,LINK,DATA,NDATA) INTEGER IDTOP,LINK,NDATA REAL DATA(*) INTEGER INFO RETURN END C end mp_get_real4 C...................................................................... C...................................................................... C sub mp_put_real4 C PVM version SUBROUTINE MP_PUT_REAL4_ARRAY(IDTOP,LINK,DATA,NDATA) INTEGER IDTOP,LINK,NDATA REAL DATA(*) INTEGER INFO RETURN END C end mp_put_real4 C...................................................................... C...................................................................... C sub mp_cast C PVM version SUBROUTINE MP_CAST(NTASKS,MSGTYPE,LINK) C input INTEGER NTASKS,MSGTYPE,LINK(*) C internal INTEGER INFO RETURN END C end mp_cast C...................................................................... C vim:et:ts=2: profphd-utils-1.0.10/lib-metr-sys.f0000644015075101507510000001266712012371465016402 0ustar lkajanlkajanC...................................................................... C SUB OPEN_FILE SUBROUTINE OPEN_FILE(IUNIT,INNAME,CSTRING,LERROR) C IMPLICIT NONE C INPUT C CSTATUS: 'OLD' OR 'NEW' OR 'UNKNOWN' C CACCESS: 'APPEND' 'DIRECT' C FORM: 'FORMATTED' OR 'UNFORMATTED' C IRECLEN: RECORD LENGTH C NOTE: AFTER OPENING A "OLD" OR "UNKNOWN" FILE (NO DIRECT ACESS): C REWIND THE FILE, BECAUSE SOME STRANGE COMPILERS PUT THE FILE C POINTER AT THE END ! C CHARACTER*(*) INNAME,CSTRING INTEGER IUNIT,IRECLEN c output: lerror is true if open error LOGICAL LERROR c internal CHARACTER*200 TEMPSTRING,CTEMP,FILENAME CHARACTER*10 CNUMBER LOGICAL LNEW,LAPPEND,LUNKNOWN LOGICAL LUNFORMATTED,LDIRECT LOGICAL LOPENDONE,LSILENT INTEGER LENGTH,I,J,K c init TEMPSTRING=' ' FILENAME=' ' LNEW=.FALSE. LAPPEND=.FALSE. LERROR=.FALSE. LUNKNOWN=.FALSE. LUNFORMATTED=.FALSE. LDIRECT=.FALSE. LOPENDONE=.FALSE. LSILENT=.FALSE. IRECLEN=137 TEMPSTRING(1:)=CSTRING(1:) CNUMBER='0123456789' c LENGTH=LEN(TEMPSTRING) CALL LOWTOUP(TEMPSTRING,LENGTH) IF (INDEX(TEMPSTRING,'NEW').NE.0) THEN LNEW=.TRUE. ENDIF IF (INDEX(TEMPSTRING,'UNKNOWN').NE.0) THEN LUNKNOWN=.TRUE. ENDIF IF (INDEX(TEMPSTRING,'UNFORMATTED').NE.0) THEN LUNFORMATTED=.TRUE. ENDIF IF (INDEX(TEMPSTRING,'DIRECT').NE.0) THEN LDIRECT=.TRUE. ENDIF IF (INDEX(TEMPSTRING,'APPEND').NE.0) THEN LAPPEND=.TRUE. ENDIF IF (INDEX(TEMPSTRING,'SILENT').NE.0) THEN LSILENT=.TRUE. ENDIF IF (INDEX(TEMPSTRING,'RECL=').NE.0) THEN CTEMP=' ' K=INDEX(TEMPSTRING,'RECL=')+5 CTEMP(1:)=TEMPSTRING(K:) CALL STRPOS(CTEMP,I,J) J=I DO WHILE (INDEX(CNUMBER,CTEMP(J:J)).NE.0 ) J=J+1 ENDDO J=J-1 CALL READ_INT_FROM_STRING(CTEMP(I:J),IRECLEN) ENDIF CALL STRPOS(INNAME,IBEG,IEND) FILENAME(1:)=INNAME(IBEG:IEND) IF (LNEW) THEN CALL DEL_OLDFILE(IUNIT,FILENAME) ENDIF IF (LNEW .AND. LUNFORMATTED .AND. LDIRECT) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='NEW',FORM='UNFORMATTED', + ACCESS='DIRECT',RECL=IRECLEN,ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='NEW',FORM='UNFORMATTED', C + ACCESS='DIRECT',RECL=IRECLEN) REWIND(IUNIT) LOPENDONE=.TRUE. ELSEIF (LNEW .AND. LUNFORMATTED ) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='NEW',FORM='UNFORMATTED', + ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='NEW',FORM='UNFORMATTED') REWIND(IUNIT) LOPENDONE=.TRUE. ELSEIF (LNEW .AND. (.NOT. LUNFORMATTED) .AND. LDIRECT ) THEN OPEN(IUNIT,FILE=FILENAME,ACCESS='DIRECT',STATUS='NEW', + FORM='FORMATTED',RECL=IRECLEN,ERR=999) C OPEN(IUNIT,FILE=FILENAME,ACCESS='DIRECT',STATUS='NEW', C + FORM='FORMATTED',RECL=IRECLEN) LOPENDONE=.TRUE. ELSEIF (.NOT. LNEW .AND. LUNFORMATTED .AND. LDIRECT) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',FORM='UNFORMATTED', + ACCESS='DIRECT',RECL=IRECLEN,ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',FORM='UNFORMATTED', C + ACCESS='DIRECT',RECL=IRECLEN) LOPENDONE=.TRUE. ELSEIF (.NOT. LNEW .AND. .NOT. LUNFORMATTED .AND. LDIRECT) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',FORM='FORMATTED', + ACCESS='DIRECT',RECL=IRECLEN,ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',FORM='FORMATTED', C + ACCESS='DIRECT',RECL=IRECLEN) LOPENDONE=.TRUE. ELSEIF (.NOT. LNEW .AND. LUNFORMATTED ) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',FORM='UNFORMATTED', + ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',FORM='UNFORMATTED') REWIND(IUNIT) LOPENDONE=.TRUE. ELSEIF (LNEW) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='NEW',ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='NEW') REWIND(IUNIT) LOPENDONE=.TRUE. ELSEIF (LUNKNOWN .AND. LAPPEND) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='UNKNOWN',ACCESS='APPEND', + ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='UNKNOWN',ACCESS='APPEND') C OPEN(IUNIT,FILE=FILENAME,STATUS='UNKNOWN') LOPENDONE=.TRUE. ELSEIF (.NOT. LNEW .AND. LAPPEND) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',ACCESS='APPEND',ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',ACCESS='APPEND') LOPENDONE=.TRUE. ELSEIF (.NOT. LNEW) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='OLD') REWIND(IUNIT) LOPENDONE=.TRUE. ELSE OPEN(IUNIT,FILE=FILENAME,STATUS='UNKNOWN',ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='UNKNOWN') REWIND(IUNIT) LOPENDONE=.TRUE. ENDIF IF (.NOT. LOPENDONE) THEN WRITE(*,*)' ERROR IN OPEN_FILE: FILE NOT OPENED' WRITE(*,*)' unknown specifier combination !' STOP ENDIF RETURN 999 IF (.NOT. LSILENT) THEN WRITE(*,*)' ERROR: open file error for file: ' WRITE(*,*)'name: ',FILENAME WRITE(*,*)'unit: ',iunit ENDIF LERROR=.TRUE. RETURN END C END OPEN_FILE C...................................................................... profphd-utils-1.0.10/lib-metr.f0000644015075101507510000005155512012371465015565 0ustar lkajanlkajanC...................................................................... C SUB CONCAT_STRINGS SUBROUTINE CONCAT_STRINGS(STRING1,STRING2,RESULT) C concatenate "string1" and "string2" into "result" CHARACTER*(*) STRING1,STRING2,RESULT INTEGER IBEG,IEND,JBEG,JEND,ILEN RESULT=' ' CALL STRPOS(STRING1,IBEG,IEND) CALL STRPOS(STRING2,JBEG,JEND) ILEN= (IEND-IBEG+1) + (JEND-JBEG+1) IF (ILEN .GT. LEN(RESULT) ) THEN ILEN=LEN(RESULT) write(*,*)' WARNING: in concat_strings: length overflow' write(*,*)' cut string at: ',ilen ENDIF RESULT(1:ILEN)=STRING1(IBEG:IEND)//STRING2(JBEG:JEND) RETURN END C END CONCAT_STRINGS C...................................................................... C...................................................................... C SUB CONCAT_INT_STRING SUBROUTINE CONCAT_INT_STRING(INUMBER,STRING,RESULT) C concatenate "inumber" and "string2" into "result" C import/export CHARACTER*(*) STRING,RESULT INTEGER INUMBER C internal CHARACTER TEMP*64,CFORMAT*100 INTEGER IBEG,IEND,JBEG,JEND,ILEN,ILOG C init TEMP=' ' RESULT=' ' ILOG=1 C get size of number C CAUTION can produce wrong results with very high opt-levels c xnumber=float( inumber ) c if (xnumber .gt. 0.0) then c ilog = nint( log10(xnumber) + 0.5 ) c else if (xnumber .lt. 0.0) then c ilog = nint( log10( abs(xnumber) ) + 1.5 ) c endif IF (INUMBER .GT. 0) THEN IF (INUMBER .LT. 10) THEN ILOG=1 ELSE IF (INUMBER .LT. 100) THEN ILOG=2 ELSE IF (INUMBER .LT. 1000) THEN ILOG=3 ELSE IF (INUMBER .LT. 10000) THEN ILOG=4 ELSE IF (INUMBER .LT. 100000) THEN ILOG=5 ELSE IF (INUMBER .LT. 1000000) THEN ILOG=6 ELSE IF (INUMBER .LT. 10000000) THEN ILOG=7 C too big for INT4 ? c else if (inumber .lt. 100000000) then c ilog=8 ELSE write(*,*)' ERROR in CONCAT_INT_STRING: update plus' CALL FLUSH_UNIT(6) ENDIF ELSE IF (INUMBER .LT. 0) THEN IF (INUMBER .GT. -10) THEN ILOG=2 ELSE IF (INUMBER .GT. -100) THEN ILOG=3 ELSE IF (INUMBER .GT. -1000) THEN ILOG=4 ELSE IF (INUMBER .GT. -10000) THEN ILOG=5 ELSE IF (INUMBER .GT. -100000) THEN ILOG=6 ELSE IF (INUMBER .GT. -1000000) THEN ILOG=7 c else if (inumber .gt. -10000000) then c ilog=8 c else if (inumber .gt. -100000000) then c ilog=9 ELSE write(*,*)' ERROR in CONCAT_INT_STRING: update minus' CALL FLUSH_UNIT(6) ENDIF ENDIF CALL CONCAT_STRING_INT('(I',ILOG,TEMP) CALL CONCAT_STRINGS(TEMP,')',CFORMAT) TEMP=' ' WRITE(TEMP(1:),CFORMAT)INUMBER CALL STRPOS(TEMP,IBEG,IEND) CALL STRPOS(STRING,JBEG,JEND) IEND=IBEG+ILOG-1 ILEN= (IEND-IBEG+1) + (JEND-JBEG+1) IF (ILEN .GT. LEN(RESULT) ) THEN ILEN=LEN(RESULT) write(*,*)' WARNING: in concat_int_string: length overflow' write(*,*)' cut string at: ',ilen ENDIF RESULT(1:ILEN)=TEMP(IBEG:IEND)//STRING(JBEG:JEND) RETURN END C END CONCAT_INT_STRING C...................................................................... C...................................................................... C SUB CONCAT_STRING_INT SUBROUTINE CONCAT_STRING_INT(STRING,INUMBER,RESULT) C concatenate "inumber" and "string2" into "result" C import/export CHARACTER*(*) STRING,RESULT INTEGER INUMBER C internal CHARACTER TEMP*64,CFORMAT*100 INTEGER IBEG,IEND,JBEG,JEND,ILEN,ILOG C init TEMP=' ' RESULT=' ' ILOG=1 C get size of number c with some agressive optimizations, this can go wrong c xnumber=float( inumber ) c if (xnumber .gt. 0.0) then c ilog = nint( log10(xnumber) ) + 1 c else if (xnumber .lt. 0.0) then c ilog = nint( log10( abs(xnumber) ) ) + 2 c endif IF (INUMBER .GT. 0) THEN IF (INUMBER .LT. 10) THEN ILOG=1 ELSE IF (INUMBER .LT. 100) THEN ILOG=2 ELSE IF (INUMBER .LT. 1000) THEN ILOG=3 ELSE IF (INUMBER .LT. 10000) THEN ILOG=4 ELSE IF (INUMBER .LT. 100000) THEN ILOG=5 ELSE IF (INUMBER .LT. 1000000) THEN ILOG=6 ELSE IF (INUMBER .LT. 10000000) THEN ILOG=7 C too big for INT4 ? c else if (inumber .lt. 100000000) then c ilog=8 c else if (inumber .lt. 1000000000) then c ilog=9 c else if (inumber .lt. 10000000000) then c ilog=10 ELSE write(*,*)' ERROR in CONCAT_STRING_INT: update plus' CALL FLUSH_UNIT(6) ENDIF ELSE IF (INUMBER .LT. 0) THEN IF (INUMBER .GT. -10) THEN ILOG=2 ELSE IF (INUMBER .GT. -100) THEN ILOG=3 ELSE IF (INUMBER .GT. -1000) THEN ILOG=4 ELSE IF (INUMBER .GT. -10000) THEN ILOG=5 ELSE IF (INUMBER .GT. -100000) THEN ILOG=6 ELSE IF (INUMBER .GT. -1000000) THEN ILOG=7 c else if (inumber .gt. -10000000) then c ilog=8 c else if (inumber .gt. -100000000) then c ilog=9 c else if (inumber .gt. -1000000000) then c ilog=10 ELSE write(*,*)' ERROR in CONCAT_STRING_INT: update minus' CALL FLUSH_UNIT(6) ENDIF ENDIF CALL MAKE_FORMAT_INT(ILOG,CFORMAT) WRITE(TEMP(1:),CFORMAT)INUMBER CALL STRPOS(TEMP,IBEG,IEND) CALL STRPOS(STRING,JBEG,JEND) IEND=IBEG+ILOG-1 ILEN= (IEND-IBEG+1) + (JEND-JBEG+1) IF (ILEN .GT. LEN(RESULT) ) THEN ILEN=LEN(RESULT) write(*,*)' WARNING: in concat_int_string: length overflow' write(*,*)' cut string at: ',ilen ENDIF RESULT(1:ILEN)=STRING(JBEG:JEND)//TEMP(IBEG:IEND) RETURN END C END CONCAT_STRING_INT C...................................................................... C...................................................................... C SUBR DEL_OLDFILE SUBROUTINE DEL_OLDFILE(IUNIT,FILENAME) CHARACTER*(*) FILENAME INTEGER IUNIT LOGICAL LEXIST,LOPEN INTEGER IBEG,IEND CHARACTER*100 TEMPNAME TEMPNAME=' ' CALL STRPOS(FILENAME,IBEG,IEND) TEMPNAME(1:)=FILENAME(IBEG:IEND) INQUIRE(FILE=TEMPNAME,OPENED=LOPEN) IF (LOPEN) THEN CLOSE(IUNIT) ENDIF INQUIRE(FILE=TEMPNAME,EXIST=LEXIST) IF (LEXIST) THEN OPEN(IUNIT,FILE=TEMPNAME,STATUS='OLD') CLOSE(IUNIT,STATUS='DELETE') ENDIF RETURN END C END DEL_OLDFILE C...................................................................... C...................................................................... C SUB FLUSH_unit SUBROUTINE FLUSH_UNIT(IUNIT) INTEGER IUNIT CALL FLUSH(IUNIT) RETURN END C END FLUSH C...................................................................... C...................................................................... C SUB GET_ARG_NUMBER C returns number of arguments C UNIX version SUBROUTINE GET_ARG_NUMBER(INUMBER) INTEGER INUMBER INUMBER=0 INUMBER=IARGC() RETURN END C END GET_ARG_NUMBER C...................................................................... C...................................................................... C SUB GET_ARGUMENT C returns the content of x-th argument C UNIX version SUBROUTINE GET_ARGUMENT(INUMBER,ARGUMENT) CHARACTER*(*) ARGUMENT INTEGER INUMBER CALL GETARG(INUMBER,ARGUMENT) RETURN END C END GET_ARGUMENT C...................................................................... C...................................................................... C SUB GETSIMMETRIC SUBROUTINE GETSIMMETRIC(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2,NIOSTATES_2, + CSTRSTATES,CIOSTATES, + IORANGE,KSIM,SIMFILE,SIMMETRIC) IMPLICIT NONE C import INTEGER NTRANS CHARACTER*(*) TRANS INTEGER MAXSTRSTATES,MAXIOSTATES INTEGER KSIM CHARACTER*(*) SIMFILE c export INTEGER NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2,NIOSTATES_2 REAL IORANGE(MAXSTRSTATES,MAXIOSTATES) CHARACTER*(*) CSTRSTATES,CIOSTATES REAL SIMMETRIC(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) c internal INTEGER I,J,K,L,I1,I2,J1,J2,ITRANS,IBEG,IEND INTEGER NSTR,NIO,ISTR1,IO1,ISTR2,IO2 INTEGER MATRIXPOS CHARACTER CSTR,CIO,LINE*250 CHARACTER*250 TESTSTRING CHARACTER*30 CTRANS LOGICAL LERROR C====================================================================== C init C====================================================================== MATRIXPOS=22 I= (NTRANS * NTRANS) * (MAXSTRSTATES * MAXSTRSTATES) * + (MAXIOSTATES * MAXIOSTATES) CALL INIT_REAL_ARRAY(1,I,SIMMETRIC,0.0) C accessibility cut to 200% = take all I= MAXSTRSTATES * MAXIOSTATES CALL INIT_REAL_ARRAY(1,I,IORANGE,200.0) TESTSTRING=' ' LINE=' ' CSTRSTATES=' ' CIOSTATES=' ' ITRANS=0 NSTRSTATES_1=1 NIOSTATES_1=1 NSTRSTATES_2=1 NIOSTATES_2=1 NSTR=0 NIO=0 c----------------------------------------------------------------------- TESTSTRING='AA STR I/O V L I M '// + 'F W Y G A P S T C '// + 'H R K Q E N D B Z' write(*,'(a,a)')' GETSIMMATRIX open metric: ',simfile(1:50) CALL OPEN_FILE(KSIM,SIMFILE,'READONLY,OLD',LERROR) IF (LERROR) GOTO 99 C---------------------------------------------------------------------- DO WHILE (INDEX(LINE,TESTSTRING).EQ.0) READ(KSIM,'(A)',END=99)LINE IF (INDEX(LINE,'STRUCTURE-STATES_1:') .NE. 0) THEN I=INDEX(LINE,':')+1 CALL STRPOS(LINE,IBEG,IEND) CALL READ_INT_FROM_STRING(LINE(I:IEND),NSTRSTATES_1) ELSEIF (INDEX(LINE,'STRUCTURE-STATES_2:') .NE. 0) THEN I=INDEX(LINE,':')+1 CALL STRPOS(LINE,IBEG,IEND) CALL READ_INT_FROM_STRING(LINE(I:IEND),NSTRSTATES_2) ELSEIF (INDEX(LINE,'I/O-STATES_1:') .NE. 0) THEN I=INDEX(LINE,':')+1 CALL STRPOS(LINE,IBEG,IEND) CALL READ_INT_FROM_STRING(LINE(I:IEND),NIOSTATES_1) ELSEIF (INDEX(LINE,'I/O-STATES_2:') .NE. 0) THEN I=INDEX(LINE,':')+1 CALL STRPOS(LINE,IBEG,IEND) CALL READ_INT_FROM_STRING(LINE(I:IEND),NIOSTATES_2) ELSEIF (INDEX(LINE,'DSSP-STRUCTURE') .NE. 0) THEN DO I=1,NSTRSTATES_1 DO J=1,NIOSTATES_1 READ(KSIM,'(A)')LINE READ(LINE,'(4X,A1,13X,A1)')CSTR,CIO K=INDEX(CSTRSTATES,CSTR) IF (K.EQ.0) THEN NSTR=NSTR+1 K=NSTR IF (NSTR .GT. MAXSTRSTATES) THEN write(*,*)'*** ERROR: struct-states overflow' STOP ENDIF CALL STRPOS(CSTRSTATES,IBEG,IEND) IF (IEND+1 .GT. LEN(CSTRSTATES)) THEN write(*,*) + '*** ERROR: CSTRSTATES string too short' STOP ENDIF WRITE(CSTRSTATES(IEND+1:IEND+1),'(A1)')CSTR ENDIF L=INDEX(CIOSTATES,CIO) IF (L.EQ.0) THEN NIO=NIO+1 L=NIO IF (NIO .GT. MAXIOSTATES) THEN write(*,*)'*** ERROR: I/O-states overflow' STOP ENDIF CALL STRPOS(CIOSTATES,IBEG,IEND) IF (IEND+1 .GT. LEN(CSTRSTATES)) THEN write(*,*) + '*** ERROR: CIOSTATES string too short' STOP ENDIF WRITE(CIOSTATES(IEND+1:IEND+1),'(A1)')CIO ENDIF READ(LINE,'(26X,F3.0)')IORANGE(K,L) ENDDO ENDDO ENDIF ENDDO C---------------------------------------------------------------------- write(*,*)' STRUCTURE-STATES_1: ',cstrstates,nstrstates_1 write(*,*)' I/O-STATES_1 : ',ciostates,niostates_1 write(*,*)' STRUCTURE-STATES_2: ',cstrstates,nstrstates_2 write(*,*)' I/O-STATES_2 : ',ciostates,niostates_2 IF (NSTRSTATES_1 .EQ. 1)NSTR=1 IF (NIOSTATES_1 .EQ. 1)NIO=1 IF (NSTR .NE. NSTRSTATES_1 .OR. NIO .NE. NIOSTATES_1 ) THEN write(*,*)'*** ERROR: number of structure-states .ne. NSTR' write(*,*)' OR number of I/O-states .ne. NIO' STOP ENDIF C---------------------------------------------------------------------- DO WHILE (.TRUE.) ITRANS=ITRANS+1 DO ISTR1=1,NSTRSTATES_1 DO IO1=1,NIOSTATES_1 DO ISTR2=1,NSTRSTATES_2 DO IO2=1,NIOSTATES_2 READ(KSIM,'(A)',END=11)LINE I1=INDEX(CSTRSTATES,LINE(5:5)) J1=INDEX(CIOSTATES,LINE(8:8)) I2=INDEX(CSTRSTATES,LINE(6:6)) J2=INDEX(CIOSTATES,LINE(9:9)) IF (I1.EQ.0.OR.I2.EQ.0.OR.J1.EQ.0.OR.J2.EQ.0) THEN IF (I1.EQ.0)I1=1 IF (J1.EQ.0)J1=1 IF (I2.EQ.0)I2=1 IF (J2.EQ.0)J2=1 ENDIF READ(LINE,'(1X,A1,7X,22(1X,F5.2))') + CTRANS(ITRANS:ITRANS), + (SIMMETRIC(ITRANS,K,I1,J1,I2,J2), + K=1,MATRIXPOS) ENDDO ENDDO ENDDO ENDDO ENDDO 11 CLOSE(KSIM) ITRANS=ITRANS-1 C======================================================================= C reset value for chain breaks etc... C add 'X' ITRANS=ITRANS+1 CTRANS(ITRANS:ITRANS)='X' I=INDEX(TRANS,'X') DO J=1,NTRANS DO ISTR1=1,NSTRSTATES_1 DO IO1=1,NIOSTATES_1 DO ISTR2=1,NSTRSTATES_2 DO IO2=1,NIOSTATES_2 SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO c add '!' ITRANS=ITRANS+1 CTRANS(ITRANS:ITRANS)='!' I=INDEX(TRANS,'!') DO J=1,NTRANS DO ISTR1=1,NSTRSTATES_1 DO IO1=1,NIOSTATES_1 DO ISTR2=1,NSTRSTATES_2 DO IO2=1,NIOSTATES_2 SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO c add '-' ITRANS=ITRANS+1 CTRANS(ITRANS:ITRANS)='-' I=INDEX(TRANS,'-') DO J=1,NTRANS DO ISTR1=1,NSTRSTATES_1 DO IO1=1,NIOSTATES_1 DO ISTR2=1,NSTRSTATES_2 DO IO2=1,NIOSTATES_2 SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO c add '.' ITRANS=ITRANS+1 CTRANS(ITRANS:ITRANS)='.' I=INDEX(TRANS,'.') DO J=1,NTRANS DO ISTR1=1,NSTRSTATES_1 DO IO1=1,NIOSTATES_1 DO ISTR2=1,NSTRSTATES_2 DO IO2=1,NIOSTATES_2 SIMMETRIC(I,J,ISTR1,IO1,ISTR2,IO2)=0.0 SIMMETRIC(J,I,ISTR1,IO1,ISTR2,IO2)=0.0 ENDDO ENDDO ENDDO ENDDO ENDDO C---------------------------------------------------------------------- C check input order of amino acids C======================================================================= IF (TRANS(1:NTRANS) .NE. CTRANS(1:ITRANS)) THEN write(*,*)' *** ERROR: CTRANS from metric-file and TRANS'// + ' are not the same' write(*,*)'GETSIMMATRIX: ',ctrans,itrans write(*,*)'GETSIMMATRIX: ',trans,ntrans STOP ENDIF C======================================================================= C debug C======================================================================= c do istr1=1,nstrstates_1 c do io1=1,niostates_1 c do istr2=1,nstrstates_2 c do io2=1,niostates_2 c write(*,*)(simmetric(1,j,istr1,io1,istr2,io2),j=1,26) c enddo c enddo c enddo c enddo C======================================================================= RETURN C======================================================================= C unknown metric or read error C======================================================================= 99 CLOSE(KSIM) write(*,'(a)') + '** ERROR reading metric in GETSIMMATRIX **' STOP END ***** end of GETSIMMETRIC C END GETSIMMETRIC C...................................................................... C...................................................................... C SUB INIT_REAL_ARRAY SUBROUTINE INIT_REAL_ARRAY(IBEG,IEND,ARRAY,VALUE) IMPLICIT NONE INTEGER I,IBEG,IEND REAL ARRAY DIMENSION ARRAY(IBEG:IEND) REAL VALUE DO I=IBEG,IEND ARRAY(I)=VALUE ENDDO RETURN END ***** end of INIT_REAL_ARRAY C END INIT_REAL_ARRAY C...................................................................... C...................................................................... C SUB LOWTOUP SUBROUTINE LOWTOUP(STRING,LENGTH) C LOWTOUP.......CONVERTS STRING......CHRIS SANDER JULY 1983 C changed by RS (speed up) CHARACTER*(*) STRING INTEGER LENGTH CX CHARACTER UPPER*26, LOWER*26, STRING*(*) CX DATA UPPER/'ABCDEFGHIJKLMNOPQRSTUVWXYZ'/ CX DATA LOWER/'abcdefghijklmnopqrstuvwxyz'/ C DO I=1,LENGTH IF (STRING(I:I) .GE. 'a' .AND. STRING(I:I) .LE. 'z') THEN STRING(I:I)=CHAR( ICHAR(STRING(I:I))-32 ) CX K=INDEX(LOWER,STRING(I:I)) CX IF (K.NE.0) STRING(I:I)=UPPER(K:K) ENDIF ENDDO RETURN END C END LOWTOUP C...................................................................... C...................................................................... C SUB MAKE_FORMAT_INT SUBROUTINE MAKE_FORMAT_INT(ILEN,CFORMAT) INTEGER ILEN CHARACTER*(*) CFORMAT CHARACTER*20 CTEMP,CINT CFORMAT=' ' CINT=' ' WRITE(CINT,'(I20)')ILEN CALL CONCAT_STRINGS('(I',CINT,CTEMP) CALL CONCAT_STRINGS(CTEMP,')',CFORMAT) RETURN END C END MAKE_FORMAT_INT C...................................................................... C...................................................................... C SUB READ_INT_FROM_STRING SUBROUTINE READ_INT_FROM_STRING(CSTRING,INUMBER) C import CHARACTER*(*) CSTRING C export INTEGER INUMBER C internal CHARACTER*100 CFORMAT,CTEMP CHARACTER*12 CNUMBER CNUMBER='-=0123456789' CFORMAT=' ' INUMBER=0 CALL STRPOS(CSTRING,ISTART,ISTOP) ITOTAL=ISTOP-ISTART+1 DO I=ISTART,ISTOP J=INDEX(CNUMBER,CSTRING(I:I)) IF ( J .LE. 0) THEN ITOTAL=I-ISTART WRITE(*,*)' *** NOT AN INTEGER:',CSTRING(ISTART:ISTOP) ENDIF ENDDO CALL CONCAT_STRING_INT('(I',ITOTAL,CTEMP) CALL CONCAT_STRINGS(CTEMP,')',CFORMAT) READ(CSTRING(ISTART:ISTOP),CFORMAT)INUMBER RETURN END ***** end of READ_INT_FROM_STRING C END READ_INT_FROM_STRING C...................................................................... C...................................................................... C SUB STRPOS SUBROUTINE STRPOS(SOURCE,ISTART,ISTOP) C StrPos(Source,IStart,IStop): Finds the positions of the first and C last non-blank/non-TAB in Source. IStart=IStop=0 for empty Source CHARACTER*(*) SOURCE INTEGER ISTART,ISTOP ISTART=0 ISTOP=0 DO J=1,LEN(SOURCE) IF (SOURCE(J:J).NE.' ') THEN ISTART=J GOTO 20 ENDIF ENDDO RETURN 20 DO J=LEN(SOURCE),1,-1 IF (SOURCE(J:J).NE.' ') THEN ISTOP=J RETURN ENDIF ENDDO ISTART=0 ISTOP=0 RETURN END ***** end of STRPOS C END STRPOS C...................................................................... profphd-utils-1.0.10/lib-sys-LINUX.f0000755015075101507510000003422612012371465016330 0ustar lkajanlkajanC---- ------------------------------------------------------------------ C---- contains now all previously needed libraries for Schneider stuff C---- ------------------------------------------------------------------ C this library contains subroutines which are system specific things, C like get the actual date, time, open a file etc. C ===> have one system-lib.for for VMS, UNIX is nix (SGI/PARIX).... machines C and link them. C====================================================================== C...................................................................... C SUB GETDATE C returns date in a string of implied length C UNIX version SUBROUTINE GETDATE(DATE) CHARACTER*(*) DATE CHARACTER CTEMP*24 CHARACTER DAY*2, MONTH*3, YEAR*2 DATE=' ' CTEMP=' ' CALL FDATE(CTEMP) MONTH = CTEMP(5:7) DAY = CTEMP(9:10) YEAR = CTEMP(23:24) DATE = (((DAY // '-') // MONTH) // '-') // YEAR RETURN END C END GETDATE C...................................................................... C...................................................................... C SUB GETTIME(CTIME) SUBROUTINE GETTIME(CTIME) c returns time in a string of implied length CHARACTER CTIME*(*) CHARACTER*24 CTEMP CTEMP = ' ' CALL FDATE(CTEMP) CTIME(1:)=CTEMP(11:22) RETURN END C END GETTIME(CTIME) C...................................................................... C...................................................................... C SUB GET_CURRENT_DIR C returns path name of the current directory C SGI/UNIX version SUBROUTINE GET_CURRENT_DIR(DIR_NAME) CHARACTER*(*) DIR_NAME INTEGER ILEN CHARACTER*128 TEMP_NAME TEMP_NAME=' ' C CALL GETCWD(TEMP_NAME) DIR_NAME=' ' ILEN=LEN(DIR_NAME) IEND=MIN(128,ILEN) DIR_NAME=TEMP_NAME(1:IEND) RETURN END C END GET_CURRENT_DIR C...................................................................... C...................................................................... C SUB GET_ENVIROMENT_VARIBALE C RETURNS VALUE OF ENVIROMENT VARIABLE C UNIX version SUBROUTINE GET_ENVIROMENT_VARIABLE(ENV_VAR,VARIABLE) CHARACTER*(*) ENV_VAR,VARIABLE VARIABLE=' ' CALL STRPOS(ENV_VAR,IBEG,IEND) CALL GETENV(ENV_VAR(IBEG:IEND),VARIABLE) RETURN END C END GET_ENVIROMENT_VARIABLE C...................................................................... C...................................................................... C SUB GET_MACHINE_NAME c returns current machine name c UNIX version SUBROUTINE GET_MACHINE_NAME(MACHINE_NAME) CHARACTER*(*) MACHINE_NAME INTEGER HOSTNM,ITEST ITEST=0 MACHINE_NAME=' ' ITEST=HOSTNM(MACHINE_NAME) IF (ITEST .NE. 0) THEN WRITE(*,*)' hostnm failed !!' WRITE(*,*)' error code is: ',itest ENDIF RETURN END C END GET_MACHINE_NAME C...................................................................... C...................................................................... C SUB GET_ARG_NUMBER C returns number of arguments C UNIX version SUBROUTINE GET_ARG_NUMBER(INUMBER) INTEGER INUMBER INUMBER=0 INUMBER=IARGC() RETURN END C END GET_ARG_NUMBER C...................................................................... C...................................................................... C SUB GET_ARGUMENT C returns the content of x-th argument C UNIX version SUBROUTINE GET_ARGUMENT(INUMBER,ARGUMENT) CHARACTER*(*) ARGUMENT INTEGER INUMBER CALL GETARG(INUMBER,ARGUMENT) RETURN END C END GET_ARGUMENT C...................................................................... C...................................................................... C lkajan 20110618: chmod_ is not present in earlier libgfortran so this fails to link, however this sub is not used anywhere - commenting all out CC SUB CHANGE_MODE C SUBROUTINE CHANGE_MODE(FILENAME,MODE,IRETURN_VAL) Cc changes the mode of a file via the integer function "chmod" Cc import C CHARACTER*(*) FILENAME,MODE Cc export C INTEGER IRETURN_VAL Cc internal C INTEGER CHMOD C C CALL STRPOS(FILENAME,IBEG,IEND) C CALL STRPOS(MODE,JBEG,JEND) C IRETURN_VAL=CHMOD(FILENAME(IBEG:IEND),MODE(JBEG:JEND)) C IF (IRETURN_VAL .NE. 0) THEN C WRITE(*,*)' *** ERROR IN CHANGE_MODE:***' C WRITE(*,*)' MODE CHANGE NOT PERFORMED !' C WRITE(*,*)' MODE: ',MODE(JBEG:JEND) C WRITE(*,*)' FILE: ',FILENAME(IBEG:IEND) C ENDIF C RETURN C END CC END CHANGE_MODE C...................................................................... C...................................................................... C SUB GET_CPU_TIME C get elapsed CPU time between two calls of GET_CPU_TIME C total_time = elapsed CPU time C user_time = used user CPU time C system_time = used system CPU time SUBROUTINE GET_CPU_TIME(CSTRING,IDPROC,ITIME_OLD, + ITIME_NEW,TOTAL_TIME,LOGSTRING) C import export CHARACTER CSTRING*(*) CHARACTER LOGSTRING*(*) INTEGER IDPROC INTEGER ITIME_OLD(*),ITIME_NEW(*) REAL TOTAL_TIME C internal INTEGER IBEG,IEND REAL XTIME(2),TOTAL XTIME(1)=0.0 XTIME(2)=0.0 TOTAL = DTIME(XTIME) TOTAL_TIME = TOTAL_TIME + TOTAL CALL STRPOS(CSTRING,IBEG,IEND) WRITE(LOGSTRING,'(A,I6,4(F10.2))')CSTRING(IBEG:IEND), + IDPROC,TOTAL,XTIME(1),XTIME(2),TOTAL_TIME RETURN END C END GET_CPU_TIME C...................................................................... C...................................................................... C SUB INIT_CPU_TIME SUBROUTINE INIT_CPU_TIME(ITIME_OLD) C import export INTEGER ITIME_OLD(*) C internal REAL XTIME(2),TOTAL XTIME(1)=0.0 XTIME(2)=0.0 TOTAL = DTIME(XTIME) RETURN END C END INIT_CPU_TIME C...................................................................... C...................................................................... C SUBROUTINE GET_CPU_TIME C get elapsed CPU time between two calls of GET_CPU_TIME C total_time = elapsed CPU time C user_time = used user CPU time C system_time = used system CPU time c subroutine get_cpu_time(total_time,user_time,system_time) C export c real total_time, user_time,system_time C internal c real xtime(2) c xtime(1)=0.0 c xtime(2)=0.0 c total_time = dtime(xtime) c user_time = xtime(1) c system_time = xtime(2) c write(*,*)' total user system' c write(*,'(3(1x,f7.2))')total_time,user_time,system_time c return c end C END GET_CPU_TIME C...................................................................... C...................................................................... C SUB OPEN_FILE SUBROUTINE OPEN_FILE(IUNIT,INNAME,CSTRING,LERROR) C IMPLICIT NONE C INPUT C CSTATUS: 'OLD' OR 'NEW' OR 'UNKNOWN' C CACCESS: 'APPEND' 'DIRECT' C FORM: 'FORMATTED' OR 'UNFORMATTED' C IRECLEN: RECORD LENGTH C NOTE: AFTER OPENING A "OLD" OR "UNKNOWN" FILE (NO DIRECT ACESS): C REWIND THE FILE, BECAUSE SOME STRANGE COMPILERS PUT THE FILE C POINTER AT THE END ! C CHARACTER*(*) INNAME,CSTRING INTEGER IUNIT,IRECLEN c output: lerror is true if open error LOGICAL LERROR c internal CHARACTER*200 TEMPSTRING,CTEMP,FILENAME CHARACTER*10 CNUMBER LOGICAL LNEW,LAPPEND,LUNKNOWN LOGICAL LUNFORMATTED,LDIRECT LOGICAL LOPENDONE,LSILENT INTEGER LENGTH,I,J,K c init TEMPSTRING=' ' FILENAME=' ' LNEW=.FALSE. LAPPEND=.FALSE. LERROR=.FALSE. LUNKNOWN=.FALSE. LUNFORMATTED=.FALSE. LDIRECT=.FALSE. LOPENDONE=.FALSE. LSILENT=.FALSE. IRECLEN=137 TEMPSTRING(1:)=CSTRING(1:) CNUMBER='0123456789' c LENGTH=LEN(TEMPSTRING) CALL LOWTOUP(TEMPSTRING,LENGTH) IF (INDEX(TEMPSTRING,'NEW').NE.0) THEN LNEW=.TRUE. ENDIF IF (INDEX(TEMPSTRING,'UNKNOWN').NE.0) THEN LUNKNOWN=.TRUE. ENDIF IF (INDEX(TEMPSTRING,'UNFORMATTED').NE.0) THEN LUNFORMATTED=.TRUE. ENDIF IF (INDEX(TEMPSTRING,'DIRECT').NE.0) THEN LDIRECT=.TRUE. ENDIF IF (INDEX(TEMPSTRING,'APPEND').NE.0) THEN LAPPEND=.TRUE. ENDIF IF (INDEX(TEMPSTRING,'SILENT').NE.0) THEN LSILENT=.TRUE. ENDIF IF (INDEX(TEMPSTRING,'RECL=').NE.0) THEN CTEMP=' ' K=INDEX(TEMPSTRING,'RECL=')+5 CTEMP(1:)=TEMPSTRING(K:) CALL STRPOS(CTEMP,I,J) J=I DO WHILE (INDEX(CNUMBER,CTEMP(J:J)).NE.0 ) J=J+1 ENDDO J=J-1 CALL READ_INT_FROM_STRING(CTEMP(I:J),IRECLEN) ENDIF CALL STRPOS(INNAME,IBEG,IEND) FILENAME(1:)=INNAME(IBEG:IEND) IF (LNEW) THEN CALL DEL_OLDFILE(IUNIT,FILENAME) ENDIF IF (LNEW .AND. LUNFORMATTED .AND. LDIRECT) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='NEW',FORM='UNFORMATTED', + ACCESS='DIRECT',RECL=IRECLEN,ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='NEW',FORM='UNFORMATTED', C + ACCESS='DIRECT',RECL=IRECLEN) REWIND(IUNIT) LOPENDONE=.TRUE. ELSEIF (LNEW .AND. LUNFORMATTED ) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='NEW',FORM='UNFORMATTED', + ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='NEW',FORM='UNFORMATTED') REWIND(IUNIT) LOPENDONE=.TRUE. ELSEIF (LNEW .AND. (.NOT. LUNFORMATTED) .AND. LDIRECT ) THEN OPEN(IUNIT,FILE=FILENAME,ACCESS='DIRECT',STATUS='NEW', + FORM='FORMATTED',RECL=IRECLEN,ERR=999) C OPEN(IUNIT,FILE=FILENAME,ACCESS='DIRECT',STATUS='NEW', C + FORM='FORMATTED',RECL=IRECLEN) LOPENDONE=.TRUE. ELSEIF (.NOT. LNEW .AND. LUNFORMATTED .AND. LDIRECT) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',FORM='UNFORMATTED', + ACCESS='DIRECT',RECL=IRECLEN,ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',FORM='UNFORMATTED', C + ACCESS='DIRECT',RECL=IRECLEN) LOPENDONE=.TRUE. ELSEIF (.NOT. LNEW .AND. .NOT. LUNFORMATTED .AND. LDIRECT) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',FORM='FORMATTED', + ACCESS='DIRECT',RECL=IRECLEN,ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',FORM='FORMATTED', C + ACCESS='DIRECT',RECL=IRECLEN) LOPENDONE=.TRUE. ELSEIF (.NOT. LNEW .AND. LUNFORMATTED ) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',FORM='UNFORMATTED', + ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',FORM='UNFORMATTED') REWIND(IUNIT) LOPENDONE=.TRUE. ELSEIF (LNEW) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='NEW',ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='NEW') REWIND(IUNIT) LOPENDONE=.TRUE. ELSEIF (LUNKNOWN .AND. LAPPEND) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='UNKNOWN',ACCESS='APPEND', + ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='UNKNOWN',ACCESS='APPEND') C OPEN(IUNIT,FILE=FILENAME,STATUS='UNKNOWN') LOPENDONE=.TRUE. ELSEIF (.NOT. LNEW .AND. LAPPEND) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',ACCESS='APPEND',ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',ACCESS='APPEND') LOPENDONE=.TRUE. ELSEIF (.NOT. LNEW) THEN OPEN(IUNIT,FILE=FILENAME,STATUS='OLD',ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='OLD') REWIND(IUNIT) LOPENDONE=.TRUE. ELSE OPEN(IUNIT,FILE=FILENAME,STATUS='UNKNOWN',ERR=999) C OPEN(IUNIT,FILE=FILENAME,STATUS='UNKNOWN') REWIND(IUNIT) LOPENDONE=.TRUE. ENDIF IF (.NOT. LOPENDONE) THEN WRITE(*,*)' ERROR IN OPEN_FILE: FILE NOT OPENED' WRITE(*,*)' unknown specifier combination !' STOP ENDIF RETURN 999 IF (.NOT. LSILENT) THEN WRITE(*,*)' ERROR: open file error for file: ' WRITE(*,*)'name: ',FILENAME WRITE(*,*)'unit: ',iunit ENDIF LERROR=.TRUE. RETURN END C END OPEN_FILE C...................................................................... C...................................................................... C SUB CLOSE_FILE SUBROUTINE CLOSE_FILE(IUNIT,FILENAME) INTEGER IUNIT CHARACTER*(*) FILENAME LOGICAL LOPEN INQUIRE(FILE=FILENAME,OPENED=LOPEN) IF (LOPEN) THEN CLOSE(IUNIT) ENDIF RETURN END C END CLOSE_FILE C...................................................................... C...................................................................... C SUBR DEL_OLDFILE SUBROUTINE DEL_OLDFILE(IUNIT,FILENAME) CHARACTER*(*) FILENAME INTEGER IUNIT LOGICAL LEXIST,LOPEN INTEGER IBEG,IEND CHARACTER*100 TEMPNAME TEMPNAME=' ' CALL STRPOS(FILENAME,IBEG,IEND) TEMPNAME(1:)=FILENAME(IBEG:IEND) INQUIRE(FILE=TEMPNAME,OPENED=LOPEN) IF (LOPEN) THEN CLOSE(IUNIT) ENDIF INQUIRE(FILE=TEMPNAME,EXIST=LEXIST) IF (LEXIST) THEN OPEN(IUNIT,FILE=TEMPNAME,STATUS='OLD') CLOSE(IUNIT,STATUS='DELETE') ENDIF RETURN END C END DEL_OLDFILE C...................................................................... C...................................................................... C SUB FLUSH SUBROUTINE FLUSH_UNIT(IUNIT) INTEGER IUNIT CALL FLUSH(IUNIT) RETURN END C END FLUSH C...................................................................... C...................................................................... C SUB C_PARIOINIT SUBROUTINE C_PARIOINIT(INODE,ILINK,HOSTNAME,ILEN) INTEGER INODE,ILEN,ILINK(*) CHARACTER*(*) HOSTNAME HOSTNAME=' ' RETURN END C END C_PARIOINIT C...................................................................... profphd-utils-1.0.10/long.msf0000644015075101507510000024141312012371464015342 0ustar lkajanlkajanMSF of: COPF-tmp9813264.msf_tmp from: 1 to: 26926 COPF-tmp9813264.msf_tmpSelf MSF: 26926 Type: P Jul-11-1961 14:00 Check: 1933 .. Name: Q10466 Len: 26926 Check: 2222 Weight: 1.00 Name: Q1046x Len: 26926 Check: 2222 Weight: 1.00 // Q10466 MTTQAPTFTQ PLQSVVVLEG STATFEAHIS GFPVPEVSWF RDGQVISTST Q1046x MTTQAPTFTQ PLQSVVVLEG STATFEAHIS GFPVPEVSWF RDGQVISTST Q10466 LPGVQISFSD GRAKLTIPAV TKANSGRYSL KATNGSGQAT STAELLVKAE Q1046x LPGVQISFSD GRAKLTIPAV TKANSGRYSL KATNGSGQAT STAELLVKAE Q10466 TAPPNFVQRL QSMTVRQGSQ VRLQVRVTGI PNPVVKFYRD GAEIQSSLDF Q1046x TAPPNFVQRL QSMTVRQGSQ VRLQVRVTGI PNPVVKFYRD GAEIQSSLDF Q10466 QISQEGDLYS LLIAEAYPED SGTYSVNATN SVGRATSTAE LLVQGEEEVP Q1046x QISQEGDLYS LLIAEAYPED SGTYSVNATN SVGRATSTAE LLVQGEEEVP Q10466 AKKTKTIVST AQISESRQTR IEKKIEAHFD ARSIATVEMV IDGAAGQQLP Q1046x AKKTKTIVST AQISESRQTR IEKKIEAHFD ARSIATVEMV IDGAAGQQLP Q10466 HKTPPRIPPK PKSRSPTPPS IAAKAQLARQ QSPSPIRHSP SPVRHVRAPT Q1046x HKTPPRIPPK PKSRSPTPPS IAAKAQLARQ QSPSPIRHSP SPVRHVRAPT Q10466 PSPVRSVSPA ARISTSPIRS VRSPLLMRKT QASTVATGPE VPPPWKQEGY Q1046x PSPVRSVSPA ARISTSPIRS VRSPLLMRKT QASTVATGPE VPPPWKQEGY Q10466 VASSSEAEMR ETTLTTSTQI RTEERWEGRY GVQEQVTISG AAGAAASVSA Q1046x VASSSEAEMR ETTLTTSTQI RTEERWEGRY GVQEQVTISG AAGAAASVSA Q10466 SASYAAEAVA TGAKEVKQDA DKSAAVATVV AAVDMARVRE PVISAVEQTA Q1046x SASYAAEAVA TGAKEVKQDA DKSAAVATVV AAVDMARVRE PVISAVEQTA Q10466 QRTTTTAVHI QPAQEQVRKE AEKTAVTKVV VAADKAKEQE LKSRTKEIIT Q1046x QRTTTTAVHI QPAQEQVRKE AEKTAVTKVV VAADKAKEQE LKSRTKEIIT Q10466 TKQEQMHVTH EQIRKETEKT FVPKVVISAA KAKEQETRIS EEITKKQKQV Q1046x TKQEQMHVTH EQIRKETEKT FVPKVVISAA KAKEQETRIS EEITKKQKQV Q10466 TQEAIMKETR KTVVPKVIVA TPKVKEQDLV SRGREGITTK REQVQITQEK Q1046x TQEAIMKETR KTVVPKVIVA TPKVKEQDLV SRGREGITTK REQVQITQEK Q10466 MRKEAEKTAL STIAVATAKA KEQETILRTR ETMATRQEQI QVTHGKVDVG Q1046x MRKEAEKTAL STIAVATAKA KEQETILRTR ETMATRQEQI QVTHGKVDVG Q10466 KKAEAVATVV AAVDQARVRE PREPGHLEES YAQQTTLEYG YKERISAAKV Q1046x KKAEAVATVV AAVDQARVRE PREPGHLEES YAQQTTLEYG YKERISAAKV Q10466 AEPPQRPASE PHVVPKAVKP RVIQAPSETH IKTTDQKGMH ISSQIKKTTD Q1046x AEPPQRPASE PHVVPKAVKP RVIQAPSETH IKTTDQKGMH ISSQIKKTTD Q10466 LTTERLVHVD KRPRTASPHF TVSKISVPKT EHGYEASIAG SAIATLQKEL Q1046x LTTERLVHVD KRPRTASPHF TVSKISVPKT EHGYEASIAG SAIATLQKEL Q10466 SATSSAQKIT KSVKAPTVKP SETRVRAEPT PLPQFPFADT PDTYKSEAGV Q1046x SATSSAQKIT KSVKAPTVKP SETRVRAEPT PLPQFPFADT PDTYKSEAGV Q10466 EVKKEVGVSI TGTTVREERF EVLHGREAKV TETARVPAPV EIPVTPPTLV Q1046x EVKKEVGVSI TGTTVREERF EVLHGREAKV TETARVPAPV EIPVTPPTLV Q10466 SGLKNVTVIE GESVTLECHI SGYPSPTVTW YREDYQIESS IDFQITFQSG Q1046x SGLKNVTVIE GESVTLECHI SGYPSPTVTW YREDYQIESS IDFQITFQSG Q10466 IARLMIREAF AEDSGRFTCS AVNEAGTVST SCYLAVQVSE EFEKETTAVT Q1046x IARLMIREAF AEDSGRFTCS AVNEAGTVST SCYLAVQVSE EFEKETTAVT Q10466 EKFTTEEKRF VESRDVVMTD TSLTEEQAGP GEPAAPYFIT KPVVQKLVEG Q1046x EKFTTEEKRF VESRDVVMTD TSLTEEQAGP GEPAAPYFIT KPVVQKLVEG Q10466 GSVVFGCQVG GNPKPHVYWK KSGVPLTTGY RYKVSYNKQT GECKLVISMT Q1046x GSVVFGCQVG GNPKPHVYWK KSGVPLTTGY RYKVSYNKQT GECKLVISMT Q10466 FADDAGEYTI VVRNKHGETS ASASLLEEAD YELLMKSQQE MLYQTQVTAF Q1046x FADDAGEYTI VVRNKHGETS ASASLLEEAD YELLMKSQQE MLYQTQVTAF Q10466 VQEPEVGETA PGFVYSEYEK EYEKEQALIR KKMAKDTVVV RTYVEDQEFH Q1046x VQEPEVGETA PGFVYSEYEK EYEKEQALIR KKMAKDTVVV RTYVEDQEFH Q10466 ISSFEERLIK EIEYRIIKTT LEELLEEDGE EKMAVDISES EAVESGFDLR Q1046x ISSFEERLIK EIEYRIIKTT LEELLEEDGE EKMAVDISES EAVESGFDLR Q10466 IKNYRILEGM GVTFHCKMSG YPLPKIAWYK DGKRIKHGER YQMDFLQDGR Q1046x IKNYRILEGM GVTFHCKMSG YPLPKIAWYK DGKRIKHGER YQMDFLQDGR Q10466 ASLRIPVVLP EDEGIYTAFA SNIKGNAICS GKLYVEPAAP LGAPTYIPTL Q1046x ASLRIPVVLP EDEGIYTAFA SNIKGNAICS GKLYVEPAAP LGAPTYIPTL Q10466 EPVSRIRSLS PRSVSRSPIR MSPARMSPAR MSPARMSPAR MSPGRRLEET Q1046x EPVSRIRSLS PRSVSRSPIR MSPARMSPAR MSPARMSPAR MSPGRRLEET Q10466 DESQLERLYK PVFVLKPVSF KCLEGANCRF DLKVVGRPMP ETFWFHDGQQ Q1046x DESQLERLYK PVFVLKPVSF KCLEGANCRF DLKVVGRPMP ETFWFHDGQQ Q10466 IVNDYTHKVV IKEDGTQSLI IVPATPSDSG EWTVVAQNRA GRSSISVILT Q1046x IVNDYTHKVV IKEDGTQSLI IVPATPSDSG EWTVVAQNRA GRSSISVILT Q10466 VEAVEHQVKP MFVEKLKNVN IKEGSRLEMK VRATGNPNPD IVWLKNSDII Q1046x VEAVEHQVKP MFVEKLKNVN IKEGSRLEMK VRATGNPNPD IVWLKNSDII Q10466 VPHKYPKIRI EGTKGEAALK IDSTVSQDSA WYTATAINKA GRDTTRCKVN Q1046x VPHKYPKIRI EGTKGEAALK IDSTVSQDSA WYTATAINKA GRDTTRCKVN Q10466 VEVEFAEPEP ERKLIIPRGT YRAKEIAAPE LEPLHLRYGQ EQWEEGDLYD Q1046x VEVEFAEPEP ERKLIIPRGT YRAKEIAAPE LEPLHLRYGQ EQWEEGDLYD Q10466 KEKQQKPFFK KKLTSLRLKR FGPAHFECRL TPISDPTMVV EWLHDGKPLE Q1046x KEKQQKPFFK KKLTSLRLKR FGPAHFECRL TPISDPTMVV EWLHDGKPLE Q10466 AANRLRMINE FGYCSLDYGV AYSRDSGIIT CRATNKYGTD HTSATLIVKD Q1046x AANRLRMINE FGYCSLDYGV AYSRDSGIIT CRATNKYGTD HTSATLIVKD Q10466 EKSLVEESQL PEGRKGLQRI EELERMAHEG ALTGVTTDQK EKQKPDIVLY Q1046x EKSLVEESQL PEGRKGLQRI EELERMAHEG ALTGVTTDQK EKQKPDIVLY Q10466 PEPVRVLEGE TARFRCRVTG YPQPKVNWYL NGQLIRKSKR FRVRYDGIHY Q1046x PEPVRVLEGE TARFRCRVTG YPQPKVNWYL NGQLIRKSKR FRVRYDGIHY Q10466 LDIVDCKSYD TGEVKVTAEN PEGVIEHKVK LEIQQREDFR SVLRRAPEPR Q1046x LDIVDCKSYD TGEVKVTAEN PEGVIEHKVK LEIQQREDFR SVLRRAPEPR Q10466 PEFHVHEPGK LQFEVQKVDR PVDTTETKEV VKLKRAERIT HEKVPEESEE Q1046x PEFHVHEPGK LQFEVQKVDR PVDTTETKEV VKLKRAERIT HEKVPEESEE Q10466 LRSKFKRRTE EGYYEAITAV ELKSRKKDES YEELLRKTKD ELLHWTKELT Q1046x LRSKFKRRTE EGYYEAITAV ELKSRKKDES YEELLRKTKD ELLHWTKELT Q10466 EEEKKALAEE GKITIPTFKP DKIELSPSME APKIFERIQS QTVGQGSDAH Q1046x EEEKKALAEE GKITIPTFKP DKIELSPSME APKIFERIQS QTVGQGSDAH Q10466 FRVRVVGKPD PECEWYKNGV KIERSDRIYW YWPEDNVCEL VIRDVTAEDS Q1046x FRVRVVGKPD PECEWYKNGV KIERSDRIYW YWPEDNVCEL VIRDVTAEDS Q10466 ASIMVKAINI AGETSSHAFL LVQAKQLITF TQELQDVVAK EKDTMATFEC Q1046x ASIMVKAINI AGETSSHAFL LVQAKQLITF TQELQDVVAK EKDTMATFEC Q10466 ETSEPFVKVK WYKDGMEVHE GDKYRMHSDR KVHFLSILTI DTSDAEDYSC Q1046x ETSEPFVKVK WYKDGMEVHE GDKYRMHSDR KVHFLSILTI DTSDAEDYSC Q10466 VLVEDENVKT TAKLIVEGAV VEFVKELQDI EVPESYSGEL ECIVSPENIE Q1046x VLVEDENVKT TAKLIVEGAV VEFVKELQDI EVPESYSGEL ECIVSPENIE Q10466 GKWYHNDVEL KSNGKYTITS RRGRQNLTVK DVTKEDQGEY SFVIDGKKTT Q1046x GKWYHNDVEL KSNGKYTITS RRGRQNLTVK DVTKEDQGEY SFVIDGKKTT Q10466 CKLKMKPRPI AILQGLSDQK VCEGDIVQLE VKVSLESVEG VWMKDGQEVQ Q1046x CKLKMKPRPI AILQGLSDQK VCEGDIVQLE VKVSLESVEG VWMKDGQEVQ Q10466 PSDRVHIVID KQSHMLLIED MTKEDAGNYS FTIPALGLST SGRVSVYSVD Q1046x PSDRVHIVID KQSHMLLIED MTKEDAGNYS FTIPALGLST SGRVSVYSVD Q10466 VITPLKDVNV IEGTKAVLEC KVSVPDVTSV KWYLNDEQIK PDDRVQAIVK Q1046x VITPLKDVNV IEGTKAVLEC KVSVPDVTSV KWYLNDEQIK PDDRVQAIVK Q10466 GTKQRLVINR THASDEGPYK LIVGRVETNC NLSVEKIKII RGLRDLTCTE Q1046x GTKQRLVINR THASDEGPYK LIVGRVETNC NLSVEKIKII RGLRDLTCTE Q10466 TQNVVFEVEL SHSGIDVLWN FKDKEIKPSS KYKIEAHGKI YKLTVLNMMK Q1046x TQNVVFEVEL SHSGIDVLWN FKDKEIKPSS KYKIEAHGKI YKLTVLNMMK Q10466 DDEGKYTFYA GENMTSGKLT VAGGAISKPL TDQTVAESQE AVFECEVANP Q1046x DDEGKYTFYA GENMTSGKLT VAGGAISKPL TDQTVAESQE AVFECEVANP Q10466 DSKGEWLRDG KHLPLTNNIR SESDGHKRRL IIAATKLDDI GEYTYKVATS Q1046x DSKGEWLRDG KHLPLTNNIR SESDGHKRRL IIAATKLDDI GEYTYKVATS Q10466 KTSAKLKVEA VKIKKTLKNL TVTETQDAVF TVELTHPNVK GVQWIKNGVV Q1046x KTSAKLKVEA VKIKKTLKNL TVTETQDAVF TVELTHPNVK GVQWIKNGVV Q10466 LESNEKYAIS VKGTIYSLRI KNCAIVDESV YGFRLGRLGA SARLHVETVK Q1046x LESNEKYAIS VKGTIYSLRI KNCAIVDESV YGFRLGRLGA SARLHVETVK Q10466 IIKKPKDVTA LENATVAFEV SVSHDTVPVK WFHKSVEIKP SDKHRLVSER Q1046x IIKKPKDVTA LENATVAFEV SVSHDTVPVK WFHKSVEIKP SDKHRLVSER Q10466 KVHKLMLQNI SPSDAGEYTA VVGQLECKAK LFVETLHITK TMKNIEVPET Q1046x KVHKLMLQNI SPSDAGEYTA VVGQLECKAK LFVETLHITK TMKNIEVPET Q10466 KTASFECEVS HFNVPSMWLK NGVEIEMSEK FKIVVQGKLH QLIIMNTSTE Q1046x KTASFECEVS HFNVPSMWLK NGVEIEMSEK FKIVVQGKLH QLIIMNTSTE Q10466 DSAEYTFVCG NDQVSATLTV TPIMITSMLK DINAEEKDTI TFEVTVNYEG Q1046x DSAEYTFVCG NDQVSATLTV TPIMITSMLK DINAEEKDTI TFEVTVNYEG Q10466 ISYKWLKNGV EIKSTDKCQM RTKKLTHSLN IRNVHFGDAA DYTFVAGKAT Q1046x ISYKWLKNGV EIKSTDKCQM RTKKLTHSLN IRNVHFGDAA DYTFVAGKAT Q10466 STATLYVEAR HIEFRKHIKD IKVLEKKRAM FECEVSEPDI TVQWMKDDQE Q1046x STATLYVEAR HIEFRKHIKD IKVLEKKRAM FECEVSEPDI TVQWMKDDQE Q10466 LQITDRIKIQ KEKYVHRLLI PSTRMSDAGK YTVVAGGNVS TAKLFVEGRD Q1046x LQITDRIKIQ KEKYVHRLLI PSTRMSDAGK YTVVAGGNVS TAKLFVEGRD Q10466 VRIRSIKKEV QVIEKQRAVV EFEVNEDDVD AHWYKDGIEI NFQVQERHKY Q1046x VRIRSIKKEV QVIEKQRAVV EFEVNEDDVD AHWYKDGIEI NFQVQERHKY Q10466 VVERRIHRMF ISETRQSDAG EYTFVAGRNR SSVTLYVNAP EPPQVLQELQ Q1046x VVERRIHRMF ISETRQSDAG EYTFVAGRNR SSVTLYVNAP EPPQVLQELQ Q10466 PVTVQSGKPA RFCAMISGRP QPKISWYKEE QLLSTGFKCK FLHDGQEYTL Q1046x PVTVQSGKPA RFCAMISGRP QPKISWYKEE QLLSTGFKCK FLHDGQEYTL Q10466 LLIEAFPEDA AVYTCEAKND YGVATTSASL SVEVPEVVSP DQEMPVYPPA Q1046x LLIEAFPEDA AVYTCEAKND YGVATTSASL SVEVPEVVSP DQEMPVYPPA Q10466 IITPLQDTVT SEGQPARFQC RVSGTDLKVS WYSKDKKIKP SRFFRMTQFE Q1046x IITPLQDTVT SEGQPARFQC RVSGTDLKVS WYSKDKKIKP SRFFRMTQFE Q10466 DTYQLEIAEA YPEDEGTYTF VANNAVGQVS STANLSLEAP ESILHERIEQ Q1046x DTYQLEIAEA YPEDEGTYTF VANNAVGQVS STANLSLEAP ESILHERIEQ Q10466 EIEMEMKEFS SSFLSAEEEG LHSAELQLSK INETLELLSE SPVYPTKFDS Q1046x EIEMEMKEFS SSFLSAEEEG LHSAELQLSK INETLELLSE SPVYPTKFDS Q10466 EKEGTGPIFI KEVSNADISM GDVATLSVTV IGIPKPKIQW FFNGVLLTPS Q1046x EKEGTGPIFI KEVSNADISM GDVATLSVTV IGIPKPKIQW FFNGVLLTPS Q10466 ADYKFVFDGD DHSLIILFTK LEDEGEYTCM ASNDYGKTIC SAYLKINSKG Q1046x ADYKFVFDGD DHSLIILFTK LEDEGEYTCM ASNDYGKTIC SAYLKINSKG Q10466 EGHKDTETES AVAKSLEKLG GPCPPHFLKE LKPIRCAQGL PAIFEYTVVG Q1046x EGHKDTETES AVAKSLEKLG GPCPPHFLKE LKPIRCAQGL PAIFEYTVVG Q10466 EPAPTVTWFK ENKQLCTSVY YTIIHNPNGS GTFIVNDPQR EDSGLYICKA Q1046x EPAPTVTWFK ENKQLCTSVY YTIIHNPNGS GTFIVNDPQR EDSGLYICKA Q10466 ENMLGESTCA AELLVLLEDT DMTDTPCKAK STPEAPEDFP QTPLKGPAVE Q1046x ENMLGESTCA AELLVLLEDT DMTDTPCKAK STPEAPEDFP QTPLKGPAVE Q10466 ALDSEQEIAT FVKDTILKAA LITEENQQLS YEHIAKANEL SSQLPLGAQE Q1046x ALDSEQEIAT FVKDTILKAA LITEENQQLS YEHIAKANEL SSQLPLGAQE Q10466 LQSILEQDKL TPESTREFLC INGSIHFQPL KEPSPNLQLQ IVQSQKTFSK Q1046x LQSILEQDKL TPESTREFLC INGSIHFQPL KEPSPNLQLQ IVQSQKTFSK Q10466 EGILMPEEPE TQAVLSDTEK IFPSAMSIEQ INSLTVEPLK TLLAEPEGNY Q1046x EGILMPEEPE TQAVLSDTEK IFPSAMSIEQ INSLTVEPLK TLLAEPEGNY Q10466 PQSSIEPPMH SYLTSVAEEV LSLKEKTVSD TNREQRVTLQ KQEAQSALIL Q1046x PQSSIEPPMH SYLTSVAEEV LSLKEKTVSD TNREQRVTLQ KQEAQSALIL Q10466 SQSLAEGHVE SLQSPDVMIS QVNYEPLVPS EHSCTEGGKI LIESANPLEN Q1046x SQSLAEGHVE SLQSPDVMIS QVNYEPLVPS EHSCTEGGKI LIESANPLEN Q10466 AGQDSAVRIE EGKSLRFPLA LEEKQVLLKE EHSDNVVMPP DQIIESKREP Q1046x AGQDSAVRIE EGKSLRFPLA LEEKQVLLKE EHSDNVVMPP DQIIESKREP Q10466 VAIKKVQEVQ GRDLLSKESL LSGIPEEQRL NLKIQICRAL QAAVASEQPG Q1046x VAIKKVQEVQ GRDLLSKESL LSGIPEEQRL NLKIQICRAL QAAVASEQPG Q10466 LFSEWLRNIE KVEVEAVNIT QEPRHIMCMY LVTSAKSVTE EVTIIIEDVD Q1046x LFSEWLRNIE KVEVEAVNIT QEPRHIMCMY LVTSAKSVTE EVTIIIEDVD Q10466 PQMANLKMEL RDALCAIIYE EIDILTAEGP RIQQGAKTSL QEEMDSFSGS Q1046x PQMANLKMEL RDALCAIIYE EIDILTAEGP RIQQGAKTSL QEEMDSFSGS Q10466 QKVEPITEPE VESKYLISTE EVSYFNVQSR VKYLDATPVT KGVASAVVSD Q1046x QKVEPITEPE VESKYLISTE EVSYFNVQSR VKYLDATPVT KGVASAVVSD Q10466 EKQDESLKPS EEKEESSSES GTEEVATVKI QEAEGGLIKE DGPMIHTPLV Q1046x EKQDESLKPS EEKEESSSES GTEEVATVKI QEAEGGLIKE DGPMIHTPLV Q10466 DTVSEEGDIV HLTTSITNAK EVNWYFENKL VPSDEKFKCL QDQNTYTLVI Q1046x DTVSEEGDIV HLTTSITNAK EVNWYFENKL VPSDEKFKCL QDQNTYTLVI Q10466 DKVNTEDHQG EYVCEALNDS GKTATSAKLT VVKRAAPVIK RKIEPLEVAL Q1046x DKVNTEDHQG EYVCEALNDS GKTATSAKLT VVKRAAPVIK RKIEPLEVAL Q10466 GHLAKFTCEI QSAPNVRFQW FKAGREIYES DKCSIRSSKY ISSLEILRTQ Q1046x GHLAKFTCEI QSAPNVRFQW FKAGREIYES DKCSIRSSKY ISSLEILRTQ Q10466 VVDCGEYTCK ASNEYGSVSC TATLTVTVPG GEKKVRKLLP ERKPEPKEEV Q1046x VVDCGEYTCK ASNEYGSVSC TATLTVTVPG GEKKVRKLLP ERKPEPKEEV Q10466 VLKSVLRKRP EEEEPKVEPK KLEKVKKPAV PEPPPPKPVE EVEVPTVTKR Q1046x VLKSVLRKRP EEEEPKVEPK KLEKVKKPAV PEPPPPKPVE EVEVPTVTKR Q10466 ERKIPEPTKV PEIKPAIPLP APEPKPKPEA EVKTIKPPPV EPEPTPIAAP Q1046x ERKIPEPTKV PEIKPAIPLP APEPKPKPEA EVKTIKPPPV EPEPTPIAAP Q10466 VTVPVVGKKA EAKAPKEEAA KPKGPIKGVP KKTPSPIEAE RRKLRPGSGG Q1046x VTVPVVGKKA EAKAPKEEAA KPKGPIKGVP KKTPSPIEAE RRKLRPGSGG Q10466 EKPPDEAPFT YQLKAVPLKF VKEIKDIILT ESEFVGSSAI FECLVSPSTA Q1046x EKPPDEAPFT YQLKAVPLKF VKEIKDIILT ESEFVGSSAI FECLVSPSTA Q10466 ITTWMKDGSN IRESPKHRFI ADGKDRKLHI IDVQLSDAGE YTCVLRLGNK Q1046x ITTWMKDGSN IRESPKHRFI ADGKDRKLHI IDVQLSDAGE YTCVLRLGNK Q10466 EKTSTAKLVV EELPVRFVKT LEEEVTVVKG QPLYLSCELN KERDVVWRKD Q1046x EKTSTAKLVV EELPVRFVKT LEEEVTVVKG QPLYLSCELN KERDVVWRKD Q10466 GKIVVEKPGR IVPGVIGLMR ALTINDADDT DAGTYTVTVE NANNLECSSC Q1046x GKIVVEKPGR IVPGVIGLMR ALTINDADDT DAGTYTVTVE NANNLECSSC Q10466 VKVVEVIRDW LVKPIRDQHV KPKGTAIFAC DIAKDTPNIK WFKGYDEIPA Q1046x VKVVEVIRDW LVKPIRDQHV KPKGTAIFAC DIAKDTPNIK WFKGYDEIPA Q10466 EPNDKTEILR DGNHLYLKIK NAMPEDIAEY AVEIEGKRYP AKLTLGEREV Q1046x EPNDKTEILR DGNHLYLKIK NAMPEDIAEY AVEIEGKRYP AKLTLGEREV Q10466 ELLKPIEDVT IYEKESASFD AEISEADIPG QWKLKGELLR PSPTCEIKAE Q1046x ELLKPIEDVT IYEKESASFD AEISEADIPG QWKLKGELLR PSPTCEIKAE Q10466 GGKRFLTLHK VKLDQAGEVL YQALNAITTA ILTVKEIELD FAVPLKDVTV Q1046x GGKRFLTLHK VKLDQAGEVL YQALNAITTA ILTVKEIELD FAVPLKDVTV Q10466 PERRQARFEC VLTREANVIW SKGPDIIKSS DKFDIIADGK KHILVINDSQ Q1046x PERRQARFEC VLTREANVIW SKGPDIIKSS DKFDIIADGK KHILVINDSQ Q10466 FDDEGVYTAE VEGKKTSARL FVTGIRLKFM SPLEDQTVKE GETATFVCEL Q1046x FDDEGVYTAE VEGKKTSARL FVTGIRLKFM SPLEDQTVKE GETATFVCEL Q10466 SHEKMHVVWF KNDAKLHTSR TVLISSEGKT HKLEMKEVTL DDISQIKAQV Q1046x SHEKMHVVWF KNDAKLHTSR TVLISSEGKT HKLEMKEVTL DDISQIKAQV Q10466 KELSSTAQLK VLEADPYFTV KLHDKTAVEK DEITLKCEVS KDVPVKWFKD Q1046x KELSSTAQLK VLEADPYFTV KLHDKTAVEK DEITLKCEVS KDVPVKWFKD Q10466 GEEIVPSPKY SIKADGLRRI LKIKKADLKD KGEYVCDCGT DKTKANVTVE Q1046x GEEIVPSPKY SIKADGLRRI LKIKKADLKD KGEYVCDCGT DKTKANVTVE Q10466 ARLIEVEKPL YGVEVFVGET AHFEIELSEP DVHGQWKLKG QPLTASPDCE Q1046x ARLIEVEKPL YGVEVFVGET AHFEIELSEP DVHGQWKLKG QPLTASPDCE Q10466 IIEDGKKHIL ILHNCQLGMT GEVSFQAANA KSAANLKVKE LPLIFITPLS Q1046x IIEDGKKHIL ILHNCQLGMT GEVSFQAANA KSAANLKVKE LPLIFITPLS Q10466 DVKVFEKDEA KFECEVSREP KTFRWLKGTQ EITGDDRFEL IKDGTKHSMV Q1046x DVKVFEKDEA KFECEVSREP KTFRWLKGTQ EITGDDRFEL IKDGTKHSMV Q10466 IKSAAFEDEA KYMFEAEDKH TSGKLIIEGI RLKFLTPLKD VTAKEKESAV Q1046x IKSAAFEDEA KYMFEAEDKH TSGKLIIEGI RLKFLTPLKD VTAKEKESAV Q10466 FTVELSHDNI RVKWFKNDQR LHTTRSVSMQ DEGKTHSITF KDLSIDDTSQ Q1046x FTVELSHDNI RVKWFKNDQR LHTTRSVSMQ DEGKTHSITF KDLSIDDTSQ Q10466 IRVEAMGMSS EAKLTVLEGD PYFTGKLQDY TGVEKDEVIL QCEISKADAP Q1046x IRVEAMGMSS EAKLTVLEGD PYFTGKLQDY TGVEKDEVIL QCEISKADAP Q10466 VKWFKDGKEI KPSKNAVIKT DGKKRMLILK KALKSDIGQY TCDCGTDKTS Q1046x VKWFKDGKEI KPSKNAVIKT DGKKRMLILK KALKSDIGQY TCDCGTDKTS Q10466 GKLDIEDREI KLVRPLHSVE VMETETARFE TEISEDDIHA NWKLKGEALL Q1046x GKLDIEDREI KLVRPLHSVE VMETETARFE TEISEDDIHA NWKLKGEALL Q10466 QTPDCEIKEE GKIHSLVLHN CRLDQTGGVD FQAANVKSSA HLRVKPRVIG Q1046x QTPDCEIKEE GKIHSLVLHN CRLDQTGGVD FQAANVKSSA HLRVKPRVIG Q10466 LLRPLKDVTV TAGETATFDC ELSYEDIPVE WYLKGKKLEP SDKVVPRSEG Q1046x LLRPLKDVTV TAGETATFDC ELSYEDIPVE WYLKGKKLEP SDKVVPRSEG Q10466 KVHTLTLRDV KLEDAGEVQL TAKDFKTHAN LFVKEPPVEF TKPLEDQTVE Q1046x KVHTLTLRDV KLEDAGEVQL TAKDFKTHAN LFVKEPPVEF TKPLEDQTVE Q10466 EGATAVLECE VSRENAKVKW FKNGTEILKS KKYEIVADGR VRKLVIHDCT Q1046x EGATAVLECE VSRENAKVKW FKNGTEILKS KKYEIVADGR VRKLVIHDCT Q10466 PEDIKTYTCD AKDFKTSCNL NVVPPHVEFL RPLTDLQVRE KEMARFECEL Q1046x PEDIKTYTCD AKDFKTSCNL NVVPPHVEFL RPLTDLQVRE KEMARFECEL Q10466 SRENAKVKWF KDGAEIKKGK KYDIISKGAV RILVINKCLL DDEAEYSCEV Q1046x SRENAKVKWF KDGAEIKKGK KYDIISKGAV RILVINKCLL DDEAEYSCEV Q10466 RTARTSGMLT VLEEEAVFTK NLANIEVSET DTIKLVCEVS KPGAEVIWYK Q1046x RTARTSGMLT VLEEEAVFTK NLANIEVSET DTIKLVCEVS KPGAEVIWYK Q10466 GDEEIIETGR YEILTEGRKR ILVIQNAHLE DAGNYNCRLP SSRTDGKVKV Q1046x GDEEIIETGR YEILTEGRKR ILVIQNAHLE DAGNYNCRLP SSRTDGKVKV Q10466 HELAAEFISK PQNLEILEGE KAEFVCSISK ESFPVQWKRD DKTLESGDKY Q1046x HELAAEFISK PQNLEILEGE KAEFVCSISK ESFPVQWKRD DKTLESGDKY Q10466 DVIADGKKRV LVVKDATLQD MGTYVVMVGA ARAAAHLTVI EKLRIVVPLK Q1046x DVIADGKKRV LVVKDATLQD MGTYVVMVGA ARAAAHLTVI EKLRIVVPLK Q10466 DTRVKEQQEV VFNCEVNTEG AKAKWFRNEE AIFDSSKYII LQKDLVYTLR Q1046x DTRVKEQQEV VFNCEVNTEG AKAKWFRNEE AIFDSSKYII LQKDLVYTLR Q10466 IRDAHLDDQA NYNVSLTNHR GENVKSAANL IVEEEDLRIV EPLKDIETME Q1046x IRDAHLDDQA NYNVSLTNHR GENVKSAANL IVEEEDLRIV EPLKDIETME Q10466 KKSVTFWCKV NRLNVTLKWT KNGEEVPFDN RVSYRVDKYK HMLTIKDCGF Q1046x KKSVTFWCKV NRLNVTLKWT KNGEEVPFDN RVSYRVDKYK HMLTIKDCGF Q10466 PDEGEYIVTA GQDKSVAELL IIEAPTEFVE HLEDQTVTEF DDAVFSCQLS Q1046x PDEGEYIVTA GQDKSVAELL IIEAPTEFVE HLEDQTVTEF DDAVFSCQLS Q10466 REKANVKWYR NGREIKEGKK YKFEKDGSIH RLIIKDCRLD DECEYACGVE Q1046x REKANVKWYR NGREIKEGKK YKFEKDGSIH RLIIKDCRLD DECEYACGVE Q10466 DRKSRARLFV EEIPVEIIRP PQDILEAPGA DVVFLAELNK DKVEVQWLRN Q1046x DRKSRARLFV EEIPVEIIRP PQDILEAPGA DVVFLAELNK DKVEVQWLRN Q10466 NMVVVQGDKH QMMSEGKIHR LQICDIKPRD QGEYRFIAKD KEARAKLELA Q1046x NMVVVQGDKH QMMSEGKIHR LQICDIKPRD QGEYRFIAKD KEARAKLELA Q10466 AAPKIKTADQ DLVVDVGKPL TMVVPYDAYP KAEAEWFKEN EPLSTKTIDT Q1046x AAPKIKTADQ DLVVDVGKPL TMVVPYDAYP KAEAEWFKEN EPLSTKTIDT Q10466 TAEQTSFRIL EAKKGDKGRY KIVLQNKHGK AEGFINLKVI DVPGPVRNLE Q1046x TAEQTSFRIL EAKKGDKGRY KIVLQNKHGK AEGFINLKVI DVPGPVRNLE Q10466 VTETFDGEVS LAWEEPLTDG GSKIIGYVVE RRDIKRKTWV LATDRAESCE Q1046x VTETFDGEVS LAWEEPLTDG GSKIIGYVVE RRDIKRKTWV LATDRAESCE Q10466 FTVTGLQKGG VEYLFRVSAR NRVGTGEPVE TDNPVEARSK YDVPGPPLNV Q1046x FTVTGLQKGG VEYLFRVSAR NRVGTGEPVE TDNPVEARSK YDVPGPPLNV Q10466 TITDVNRFGV SLTWEPPEYD GGAEITNYVI ELRDKTSIRW DTAMTVRAED Q1046x TITDVNRFGV SLTWEPPEYD GGAEITNYVI ELRDKTSIRW DTAMTVRAED Q10466 LSATVTDVVE GQEYSFRVRA QNRIGVGKPS AATPFVKVAD PIERPSPPVN Q1046x LSATVTDVVE GQEYSFRVRA QNRIGVGKPS AATPFVKVAD PIERPSPPVN Q10466 LTSSDQTQSS VQLKWEPPLK DGGSPILGYI IERCEEGKDN WIRCNMKLVP Q1046x LTSSDQTQSS VQLKWEPPLK DGGSPILGYI IERCEEGKDN WIRCNMKLVP Q10466 ELTYKVTGLE KGNKYLYRVS AENKAGVSDP SEILGPLTAD DAFVEPTMDL Q1046x ELTYKVTGLE KGNKYLYRVS AENKAGVSDP SEILGPLTAD DAFVEPTMDL Q10466 SAFKDGLEVI VPNPITILVP STGYPRPTAT WCFGDKVLET GDRVKMKTLS Q1046x SAFKDGLEVI VPNPITILVP STGYPRPTAT WCFGDKVLET GDRVKMKTLS Q10466 AYAELVISPS ERSDKGIYTL KLENRVKTIS GEIDVNVIAR PSAPKELKFG Q1046x AYAELVISPS ERSDKGIYTL KLENRVKTIS GEIDVNVIAR PSAPKELKFG Q10466 DITKDSVHLT WEPPDDDGGS PLTGYVVEKR EVSRKTWTKV MDFVTDLEFT Q1046x DITKDSVHLT WEPPDDDGGS PLTGYVVEKR EVSRKTWTKV MDFVTDLEFT Q10466 VPDLVQGKEY LFKVCARNKC GPGEPAYVDE PVNMSTPATV PDPPENVKWR Q1046x VPDLVQGKEY LFKVCARNKC GPGEPAYVDE PVNMSTPATV PDPPENVKWR Q10466 DRTANSIFLT WDPPKNDGGS RIKGYIVERC PRGSDKWVAC GEPVAETKME Q1046x DRTANSIFLT WDPPKNDGGS RIKGYIVERC PRGSDKWVAC GEPVAETKME Q10466 VTGLEEGKWY AYRVKTLNRQ GASKPSRPTE EIQAVDTQEA PEIFLDVKLL Q1046x VTGLEEGKWY AYRVKTLNRQ GASKPSRPTE EIQAVDTQEA PEIFLDVKLL Q10466 AGLTVKAGTK IELPATVTGK PEPKITWTKA DMILKQDKRI TIENVPKKST Q1046x AGLTVKAGTK IELPATVTGK PEPKITWTKA DMILKQDKRI TIENVPKKST Q10466 VTIVDSKRSD TGTYIIEAVN VCGRATAVVE VNVLDKPGPP AAFDITDVTN Q1046x VTIVDSKRSD TGTYIIEAVN VCGRATAVVE VNVLDKPGPP AAFDITDVTN Q10466 ESCLLTWNPP RDDGGSKITN YVVERRATDS EVWHKLSSTV KDTNFKATKL Q1046x ESCLLTWNPP RDDGGSKITN YVVERRATDS EVWHKLSSTV KDTNFKATKL Q10466 IPNKEYIFRV AAENMYGAGE PVQASPITAK YQFDPPGPPT RLEPSDITKD Q1046x IPNKEYIFRV AAENMYGAGE PVQASPITAK YQFDPPGPPT RLEPSDITKD Q10466 AVTLTWCEPD DDGGSPITGY WVERLDPDTD KWVRCNKMPV KDTTYRVKGL Q1046x AVTLTWCEPD DDGGSPITGY WVERLDPDTD KWVRCNKMPV KDTTYRVKGL Q10466 TNKKKYRFRV LAENLAGPGK PSKSTEPILI KDPIDPPWPP GKPTVKDVGK Q1046x TNKKKYRFRV LAENLAGPGK PSKSTEPILI KDPIDPPWPP GKPTVKDVGK Q10466 TSVRLNWTKP EHDGGAKIES YVIEMLKTGT DEWVRVAEGV PTTQHLLPGL Q1046x TSVRLNWTKP EHDGGAKIES YVIEMLKTGT DEWVRVAEGV PTTQHLLPGL Q10466 MEGQEYSFRV RAVNKAGESE PSEPSDPVLC REKLYPPSPP RWLEVINITK Q1046x MEGQEYSFRV RAVNKAGESE PSEPSDPVLC REKLYPPSPP RWLEVINITK Q10466 NTADLKWTVP EKDGGSPITN YIVEKRDVRR KGWQTVDTTV KDTKCTVTPL Q1046x NTADLKWTVP EKDGGSPITN YIVEKRDVRR KGWQTVDTTV KDTKCTVTPL Q10466 TEGSLYVFRV AAENAIGQSD YTEIEDSVLA KDTFTTPGPP YALAVVDVTK Q1046x TEGSLYVFRV AAENAIGQSD YTEIEDSVLA KDTFTTPGPP YALAVVDVTK Q10466 RHVDLKWEPP KNDGGRPIQR YVIEKKERLG TRWVKAGKTA GPDCNFRVTD Q1046x RHVDLKWEPP KNDGGRPIQR YVIEKKERLG TRWVKAGKTA GPDCNFRVTD Q10466 VIEGTEVQFQ VRAENEAGVG HPSEPTEILS IEDPTSPPSP PLDLHVTDAG Q1046x VIEGTEVQFQ VRAENEAGVG HPSEPTEILS IEDPTSPPSP PLDLHVTDAG Q10466 RKHIAIAWKP PEKNGGSPII GYHVEMCPVG TEKWMRVNSR PIKDLKFKVE Q1046x RKHIAIAWKP PEKNGGSPII GYHVEMCPVG TEKWMRVNSR PIKDLKFKVE Q10466 EGVVPDKEYV LRVRAVNAIG VSEPSEISEN VVAKDPDCKP TIDLETHDII Q1046x EGVVPDKEYV LRVRAVNAIG VSEPSEISEN VVAKDPDCKP TIDLETHDII Q10466 VIEGEKLSIP VPFRAVPVPT VSWHKDGKEV KASDRLTMKN DHISAHLEVP Q1046x VIEGEKLSIP VPFRAVPVPT VSWHKDGKEV KASDRLTMKN DHISAHLEVP Q10466 KSVRADAGIY TITLENKLGS ATASINVKVI GLPGPCKDIK ASDITKSSCK Q1046x KSVRADAGIY TITLENKLGS ATASINVKVI GLPGPCKDIK ASDITKSSCK Q10466 LTWEPPEFDG GTPILHYVLE RREAGRRTYI PVMSGENKLS WTVKDLIPNG Q1046x LTWEPPEFDG GTPILHYVLE RREAGRRTYI PVMSGENKLS WTVKDLIPNG Q10466 EYFFRVKAVN KVGGGEYIEL KNPVIAQDPK QPPDPPVDVE VHNPTAEAMT Q1046x EYFFRVKAVN KVGGGEYIEL KNPVIAQDPK QPPDPPVDVE VHNPTAEAMT Q10466 ITWKPPLYDG GSKIMGYIIE KIAKGEERWK RCNEHLVPIL TYTAKGLEEG Q1046x ITWKPPLYDG GSKIMGYIIE KIAKGEERWK RCNEHLVPIL TYTAKGLEEG Q10466 KEYQFRVRAE NAAGISEPSR ATPPTKAVDP IDAPKVILRT SLEVKRGDEI Q1046x KEYQFRVRAE NAAGISEPSR ATPPTKAVDP IDAPKVILRT SLEVKRGDEI Q10466 ALDASISGSP YPTITWIKDE NVIVPEEIKK RAAPLVRRRK GEVQEEEPFV Q1046x ALDASISGSP YPTITWIKDE NVIVPEEIKK RAAPLVRRRK GEVQEEEPFV Q10466 LPLTQRLSID NSKKGESQLR VRDSLRPDHG LYMIKVENDH GIAKAPCTVS Q1046x LPLTQRLSID NSKKGESQLR VRDSLRPDHG LYMIKVENDH GIAKAPCTVS Q10466 VLDTPGPPIN FVFEDIRKTS VLCKWEPPLD DGGSEIINYT LEKKDKTKPD Q1046x VLDTPGPPIN FVFEDIRKTS VLCKWEPPLD DGGSEIINYT LEKKDKTKPD Q10466 SEWIVVTSTL RHCKYSVTKL IEGKEYLFRV RAENRFGPGP PCVSKPLVAK Q1046x SEWIVVTSTL RHCKYSVTKL IEGKEYLFRV RAENRFGPGP PCVSKPLVAK Q10466 DPFGPPDAPD KPIVEDVTSN SMLVKWNEPK DNGSPILGYW LEKREVNSTH Q1046x DPFGPPDAPD KPIVEDVTSN SMLVKWNEPK DNGSPILGYW LEKREVNSTH Q10466 WSRVNKSLLN ALKANVDGLL EGLTYVFRVC AENAAGPGKF SPPSDPKTAH Q1046x WSRVNKSLLN ALKANVDGLL EGLTYVFRVC AENAAGPGKF SPPSDPKTAH Q10466 DPISPPGPPI PRVTDTSSTT IELEWEPPAF NGGGEIVGYF VDKQLVGTNK Q1046x DPISPPGPPI PRVTDTSSTT IELEWEPPAF NGGGEIVGYF VDKQLVGTNK Q10466 WSRCTEKMIK VRQYTVKEIR EGADYKLRVS AVNAAGEGPP GETQPVTVAE Q1046x WSRCTEKMIK VRQYTVKEIR EGADYKLRVS AVNAAGEGPP GETQPVTVAE Q10466 PQEPPAVELD VSVKGGIQIM AGKTLRIPAV VTGRPVPTKV WTKEEGELDK Q1046x PQEPPAVELD VSVKGGIQIM AGKTLRIPAV VTGRPVPTKV WTKEEGELDK Q10466 DRVVIDNVGT KSELIIKDAL RKDHGRYVIT ATNSCGSKFA AARVEVFDVP Q1046x DRVVIDNVGT KSELIIKDAL RKDHGRYVIT ATNSCGSKFA AARVEVFDVP Q10466 GPVLDLKPVV TNRKMCLLNW SDPEDDGGSE ITGFIIERKD AKMHTWRQPI Q1046x GPVLDLKPVV TNRKMCLLNW SDPEDDGGSE ITGFIIERKD AKMHTWRQPI Q10466 ETERSKCDIT GLLEGQEYKF RVIAKNKFGC GPPVEIGPIL AVDPLGPPTS Q1046x ETERSKCDIT GLLEGQEYKF RVIAKNKFGC GPPVEIGPIL AVDPLGPPTS Q10466 PERLTYTERQ RSTITLDWKE PRSNGGSPIQ GYIIEKRRHD KPDFERVNKR Q1046x PERLTYTERQ RSTITLDWKE PRSNGGSPIQ GYIIEKRRHD KPDFERVNKR Q10466 LCPTTSFLVE NLDEHQMYEF RVKAVNEIGE SEPSLPLNVV IQDDEVPPTI Q1046x LCPTTSFLVE NLDEHQMYEF RVKAVNEIGE SEPSLPLNVV IQDDEVPPTI Q10466 KLRLSVRGDT IKVKAGEPVH IPADVTGLPM PKIEWSKNET VIEKPTDALQ Q1046x KLRLSVRGDT IKVKAGEPVH IPADVTGLPM PKIEWSKNET VIEKPTDALQ Q10466 ITKEEVSRSE AKTELSIPKA VREDKGTYTV TASNRLGSVF RNVHVEVYDR Q1046x ITKEEVSRSE AKTELSIPKA VREDKGTYTV TASNRLGSVF RNVHVEVYDR Q10466 PSPPRNLAVT DIKAESCYLT WDAPLDNGGS EITHYVIDKR DASRKKAEWE Q1046x PSPPRNLAVT DIKAESCYLT WDAPLDNGGS EITHYVIDKR DASRKKAEWE Q10466 EVTNTAVEKR YGIWKLIPNG QYEFRVRAVN KYGISDECKS DKVVIQDPYR Q1046x EVTNTAVEKR YGIWKLIPNG QYEFRVRAVN KYGISDECKS DKVVIQDPYR Q10466 LPGPPGKPKV LARTKGSMLV SWTPPLDNGG SPITGYWLEK REEGSPYWSR Q1046x LPGPPGKPKV LARTKGSMLV SWTPPLDNGG SPITGYWLEK REEGSPYWSR Q10466 VSRAPITKVG LKGVEFNVPR LLEGVKYQFR AMAINAAGIG PPSEPSDPEV Q1046x VSRAPITKVG LKGVEFNVPR LLEGVKYQFR AMAINAAGIG PPSEPSDPEV Q10466 AGDPIFPPGP PSCPEVKDKT KSSISLGWKP PAKDGGSPIK GYIVEMQEEG Q1046x AGDPIFPPGP PSCPEVKDKT KSSISLGWKP PAKDGGSPIK GYIVEMQEEG Q10466 TTDWKRVNEP DKLITTCECV VPNLKELRKY RFRVKAVNEA GESEPSDTTG Q1046x TTDWKRVNEP DKLITTCECV VPNLKELRKY RFRVKAVNEA GESEPSDTTG Q10466 EIPATDIQEE PEVFIDIGAQ DCLVCKAGSQ IRIPAVIKGR PTPKSSWEFD Q1046x EIPATDIQEE PEVFIDIGAQ DCLVCKAGSQ IRIPAVIKGR PTPKSSWEFD Q10466 GKAKKAMKDG VHDIPEDAQL ETAENSSVII IPECKRSHTG KYSITAKNKA Q1046x GKAKKAMKDG VHDIPEDAQL ETAENSSVII IPECKRSHTG KYSITAKNKA Q10466 GQKTANCRVK VMDVPGPPKD LKVSDITRGS CRLSWKMPDD DGGDRIKGYV Q1046x GQKTANCRVK VMDVPGPPKD LKVSDITRGS CRLSWKMPDD DGGDRIKGYV Q10466 IEKRTIDGKA WTKVNPDCGS TTFVVPDLLS EQQYFFRVRA ENRFGIGPPV Q1046x IEKRTIDGKA WTKVNPDCGS TTFVVPDLLS EQQYFFRVRA ENRFGIGPPV Q10466 ETIQRTTARD PIYPPDPPIK LKIGLITKNT VHLSWKPPKN DGGSPVTHYI Q1046x ETIQRTTARD PIYPPDPPIK LKIGLITKNT VHLSWKPPKN DGGSPVTHYI Q10466 VECLAWDPTG TKKEAWRQCN KRDVEELQFT VEDLVEGGEY EFRVKAVNAA Q1046x VECLAWDPTG TKKEAWRQCN KRDVEELQFT VEDLVEGGEY EFRVKAVNAA Q10466 GVSKPSATVG PCDCQRPDMP PSIDLKEFME VEEGTNVNIV AKIKGVPFPT Q1046x GVSKPSATVG PCDCQRPDMP PSIDLKEFME VEEGTNVNIV AKIKGVPFPT Q10466 LTWFKAPPKK PDNKEPVLYD THVNKLVVDD TCTLVIPQSR RSDTGLYTIT Q1046x LTWFKAPPKK PDNKEPVLYD THVNKLVVDD TCTLVIPQSR RSDTGLYTIT Q10466 AVNNLGTASK EMRLNVLGRP GPPVGPIKFE SVSADQMTLS WFPPKDDGGS Q1046x AVNNLGTASK EMRLNVLGRP GPPVGPIKFE SVSADQMTLS WFPPKDDGGS Q10466 KITNYVIEKR EANRKTWVHV SSEPKECTYT IPKLLEGHEY VFRIMAQNKY Q1046x KITNYVIEKR EANRKTWVHV SSEPKECTYT IPKLLEGHEY VFRIMAQNKY Q10466 GIGEPLDSEP ETARNLFSVP GAPDKPTVSS VTRNSMTVNW EEPEYDGGSP Q1046x GIGEPLDSEP ETARNLFSVP GAPDKPTVSS VTRNSMTVNW EEPEYDGGSP Q10466 VTGYWLEMKD TTSKRWKRVN RDPIKAMTLG VSYKVTGLIE GSDYQFRVYA Q1046x VTGYWLEMKD TTSKRWKRVN RDPIKAMTLG VSYKVTGLIE GSDYQFRVYA Q10466 INAAGVGPAS LPSDPATARD PIAPPGPPFP KVTDWTKSSA DLEWSPPLKD Q1046x INAAGVGPAS LPSDPATARD PIAPPGPPFP KVTDWTKSSA DLEWSPPLKD Q10466 GGSKVTGYIV EYKEEGKEEW EKGKDKEVRG TKLVVTGLKE GAFYKFRVSA Q1046x GGSKVTGYIV EYKEEGKEEW EKGKDKEVRG TKLVVTGLKE GAFYKFRVSA Q10466 VNIAGIGEPG EVTDVIEMKD RLVSPDLQLD ASVRDRIVVH AGGVIRIIAY Q1046x VNIAGIGEPG EVTDVIEMKD RLVSPDLQLD ASVRDRIVVH AGGVIRIIAY Q10466 VSGKPPPTVT WNMNERTLPQ EATIETTAIS SSMVIKNCQR SHQGVYSLLA Q1046x VSGKPPPTVT WNMNERTLPQ EATIETTAIS SSMVIKNCQR SHQGVYSLLA Q10466 KNEAGERKKT IIVDVLDVPG PVGTPFLAHN LTNESCKLTW FSPEDDGGSP Q1046x KNEAGERKKT IIVDVLDVPG PVGTPFLAHN LTNESCKLTW FSPEDDGGSP Q10466 ITNYVIEKRE SDRRAWTPVT YTVTRQNATV QGLIQGKAYF FRIAAENSIG Q1046x ITNYVIEKRE SDRRAWTPVT YTVTRQNATV QGLIQGKAYF FRIAAENSIG Q10466 MGPFVETSEA LVIREPITVP ERPEDLEVKE VTKNTVTLTW NPPKYDGGSE Q1046x MGPFVETSEA LVIREPITVP ERPEDLEVKE VTKNTVTLTW NPPKYDGGSE Q10466 IINYVLESRL IGTEKFHKVT NDNLLSRKYT VKGLKEGDTY EYRVSAVNIV Q1046x IINYVLESRL IGTEKFHKVT NDNLLSRKYT VKGLKEGDTY EYRVSAVNIV Q10466 GQGKPSFCTK PITCKDELAP PTLHLDFRDK LTIRVGEAFA LTGRYSGKPK Q1046x GQGKPSFCTK PITCKDELAP PTLHLDFRDK LTIRVGEAFA LTGRYSGKPK Q10466 PKVSWFKDEA DVLEDDRTHI KTTPATLALE KIKAKRSDSG KYCVVVENST Q1046x PKVSWFKDEA DVLEDDRTHI KTTPATLALE KIKAKRSDSG KYCVVVENST Q10466 GSRKGFCQVN VVDHPGPPVG PVSFDEVTKD YMVISWKPPL DDGGSKITNY Q1046x GSRKGFCQVN VVDHPGPPVG PVSFDEVTKD YMVISWKPPL DDGGSKITNY Q10466 IIEKKEVGKD VWMPVTSASA KTTCKVSKLL EGKDYIFRIH AENLYGISDP Q1046x IIEKKEVGKD VWMPVTSASA KTTCKVSKLL EGKDYIFRIH AENLYGISDP Q10466 LVSDSMKAKD RFRVPDAPDQ PIVTEVTKDS ALVTWNKPHD GGKPITNYIL Q1046x LVSDSMKAKD RFRVPDAPDQ PIVTEVTKDS ALVTWNKPHD GGKPITNYIL Q10466 EKRETMSKRW ARVTKDPIHP YTKFRVPDLL EGCQYEFRVS AENEIGIGDP Q1046x EKRETMSKRW ARVTKDPIHP YTKFRVPDLL EGCQYEFRVS AENEIGIGDP Q10466 SPPSKPVFAK DPIAKPSPPV NPEAIDTTCN SVDLTWQPPR HDGGSKILGY Q1046x SPPSKPVFAK DPIAKPSPPV NPEAIDTTCN SVDLTWQPPR HDGGSKILGY Q10466 IVEYQKVGDE EWRRANHTPE SCPETKYKVT GLRDGQTYKF RVLAVNAAGE Q1046x IVEYQKVGDE EWRRANHTPE SCPETKYKVT GLRDGQTYKF RVLAVNAAGE Q10466 SDPAHVPEPV LVKDRLEPPE LILDANMARE QHIKVGDTLR LSAIIKGVPF Q1046x SDPAHVPEPV LVKDRLEPPE LILDANMARE QHIKVGDTLR LSAIIKGVPF Q10466 PKVTWKKEDR DAPTKARIDV TPVGSKLEIR NAAHEDGGIY SLTVENPAGS Q1046x PKVTWKKEDR DAPTKARIDV TPVGSKLEIR NAAHEDGGIY SLTVENPAGS Q10466 KTVSVKVLVL DKPGPPRDLE VSEIRKDSCY LTWKEPLDDG GSVITNYVVE Q1046x KTVSVKVLVL DKPGPPRDLE VSEIRKDSCY LTWKEPLDDG GSVITNYVVE Q10466 RRDVASAQWS PLSATSKKKS HFAKHLNEGN QYLFRVAAEN QYGRGPFVET Q1046x RRDVASAQWS PLSATSKKKS HFAKHLNEGN QYLFRVAAEN QYGRGPFVET Q10466 PKPIKALDPL HPPGPPKDLH HVDVDKTEVS LVWNKPDRDG GSPITGYLVE Q1046x PKPIKALDPL HPPGPPKDLH HVDVDKTEVS LVWNKPDRDG GSPITGYLVE Q10466 YQEEGTQDWI KFKTVTNLEC VVTGLQQGKT YRFRVKAENI VGLGLPDTTI Q1046x YQEEGTQDWI KFKTVTNLEC VVTGLQQGKT YRFRVKAENI VGLGLPDTTI Q10466 PIECQEKLVP PSVELDVKLI EGLVVKAGTT VRFPAIIRGV PVPTAKWTTD Q1046x PIECQEKLVP PSVELDVKLI EGLVVKAGTT VRFPAIIRGV PVPTAKWTTD Q10466 GSEIKTDEHY TVETDNFSSV LTIKNCLRRD TGEYQITVSN AAGSKTVAVH Q1046x GSEIKTDEHY TVETDNFSSV LTIKNCLRRD TGEYQITVSN AAGSKTVAVH Q10466 LTVLDVPGPP TGPINILDVT PEHMTISWQP PKDDGGSPVI NYIVEKQDTR Q1046x LTVLDVPGPP TGPINILDVT PEHMTISWQP PKDDGGSPVI NYIVEKQDTR Q10466 KDTWGVVSSG SSKTKLKIPH LQKGCEYVFR VRAENKIGVG PPLDSTPTVA Q1046x KDTWGVVSSG SSKTKLKIPH LQKGCEYVFR VRAENKIGVG PPLDSTPTVA Q10466 KHKFSPPSPP GKPVVTDITE NAATVSWTLP KSDGGSPITG YYMERREVTG Q1046x KHKFSPPSPP GKPVVTDITE NAATVSWTLP KSDGGSPITG YYMERREVTG Q10466 KWVRVNKTPI ADLKFRVTGL YEGNTYEFRV FAENLAGLSK PSPSSDPIKA Q1046x KWVRVNKTPI ADLKFRVTGL YEGNTYEFRV FAENLAGLSK PSPSSDPIKA Q10466 CRPIKPPGPP INPKLKDKSR ETADLVWTKP LSDGGSPILG YVVECQKPGT Q1046x CRPIKPPGPP INPKLKDKSR ETADLVWTKP LSDGGSPILG YVVECQKPGT Q10466 AQWNRINKDE LIRQCAFRVP GLIEGNEYRF RIKAANIVGE GEPRELAESV Q1046x AQWNRINKDE LIRQCAFRVP GLIEGNEYRF RIKAANIVGE GEPRELAESV Q10466 IAKDILHPPE VELDVTCRDV ITVRVGQTIR ILARVKGRPE PDITWTKEGK Q1046x IAKDILHPPE VELDVTCRDV ITVRVGQTIR ILARVKGRPE PDITWTKEGK Q10466 VLVREKRVDL IQDLPRVELQ IKEAVRADHG KYIISAKNSS GHAQGSAIVN Q1046x VLVREKRVDL IQDLPRVELQ IKEAVRADHG KYIISAKNSS GHAQGSAIVN Q10466 VLDRPGPCQN LKVTNVTKEN CTISWENPLD NGGSEITNFI VEYRKPNQKG Q1046x VLDRPGPCQN LKVTNVTKEN CTISWENPLD NGGSEITNFI VEYRKPNQKG Q10466 WSIVASDVTK RLIKANLLAN NEYYFRVCAE NKVGVGPTIE TKTPILAINP Q1046x WSIVASDVTK RLIKANLLAN NEYYFRVCAE NKVGVGPTIE TKTPILAINP Q10466 IDRPGEPENL HIADKGKTFV YLKWRRPDYD GGSPNLSYHV ERRLKGSDDW Q1046x IDRPGEPENL HIADKGKTFV YLKWRRPDYD GGSPNLSYHV ERRLKGSDDW Q10466 ERVHKGSIKE THYMVDRCVE NQIYEFRVQT KNEGGESDWV KTEEVVVKED Q1046x ERVHKGSIKE THYMVDRCVE NQIYEFRVQT KNEGGESDWV KTEEVVVKED Q10466 LQKPVLDLKL SGVLTVKAGD TIRLEAGVRG KPFPEVAWTK DKDATDLTRS Q1046x LQKPVLDLKL SGVLTVKAGD TIRLEAGVRG KPFPEVAWTK DKDATDLTRS Q10466 PRVKIDTRAD SSKFSLTKAK RSDGGKYVVT ATNTAGSFVA YATVNVLDKP Q1046x PRVKIDTRAD SSKFSLTKAK RSDGGKYVVT ATNTAGSFVA YATVNVLDKP Q10466 GPVRNLKIVD VSSDRCTVCW DPPEDDGGCE IQNYILEKCE TKRMVWSTYS Q1046x GPVRNLKIVD VSSDRCTVCW DPPEDDGGCE IQNYILEKCE TKRMVWSTYS Q10466 ATVLTPGTTV TRLIEGNEYI FRVRAENKIG TGPPTESKPV IAKTKYDKPG Q1046x ATVLTPGTTV TRLIEGNEYI FRVRAENKIG TGPPTESKPV IAKTKYDKPG Q10466 RPDPPEVTKV SKEEMTVVWN PPEYDGGKSI TGYFLEKKEK HSTRWVPVNK Q1046x RPDPPEVTKV SKEEMTVVWN PPEYDGGKSI TGYFLEKKEK HSTRWVPVNK Q10466 SAIPERRMKV QNLLPDHEYQ FRVKAENEIG IGEPSLPSRP VVAKDPIEPP Q1046x SAIPERRMKV QNLLPDHEYQ FRVKAENEIG IGEPSLPSRP VVAKDPIEPP Q10466 GPPTNFRVVD TTKHSITLGW GKPVYDGGAP IIGYVVEMRP KIADASPDEG Q1046x GPPTNFRVVD TTKHSITLGW GKPVYDGGAP IIGYVVEMRP KIADASPDEG Q10466 WKRCNAAAQL VRKEFTVTSL DENQEYEFRV CAQNQVGIGR PAELKEAIKP Q1046x WKRCNAAAQL VRKEFTVTSL DENQEYEFRV CAQNQVGIGR PAELKEAIKP Q10466 KEILEPPEID LDASMRKLVI VRAGCPIRLF AIVRGRPAPK VTWRKVGIDN Q1046x KEILEPPEID LDASMRKLVI VRAGCPIRLF AIVRGRPAPK VTWRKVGIDN Q10466 VVRKGQVDLV DTMAFLVIPN STRDDSGKYS LTLVNPAGEK AVFVNVRVLD Q1046x VVRKGQVDLV DTMAFLVIPN STRDDSGKYS LTLVNPAGEK AVFVNVRVLD Q10466 TPGPVSDLKV SDVTKTSCHV SWAPPENDGG SQVTHYIVEK READRKTWST Q1046x TPGPVSDLKV SDVTKTSCHV SWAPPENDGG SQVTHYIVEK READRKTWST Q10466 VTPEVKKTSF HVTNLVPGNE YYFRVTAVNE YGPGVPTDVP KPVLASDPLS Q1046x VTPEVKKTSF HVTNLVPGNE YYFRVTAVNE YGPGVPTDVP KPVLASDPLS Q10466 EPDPPRKLEA TEMTKNSATL AWLPPLRDGG AKIDGYIISY REEEQPADRW Q1046x EPDPPRKLEA TEMTKNSATL AWLPPLRDGG AKIDGYIISY REEEQPADRW Q10466 TEYSVVKDLS LVVTGLKEGK KYKFRVAARN AVGVSLPREA EGVYEAKEQL Q1046x TEYSVVKDLS LVVTGLKEGK KYKFRVAARN AVGVSLPREA EGVYEAKEQL Q10466 LPPKILMPEQ ITIKAGKKLR IEAHVYGKPH PTCKWKKGED EVVTSSHLAV Q1046x LPPKILMPEQ ITIKAGKKLR IEAHVYGKPH PTCKWKKGED EVVTSSHLAV Q10466 HKADSSSILI IKDVTRKDSG YYSLTAENSS GTDTQKIKVV VMDAPGPPQP Q1046x HKADSSSILI IKDVTRKDSG YYSLTAENSS GTDTQKIKVV VMDAPGPPQP Q10466 PFDISDIDAD ACSLSWHIPL EDGGSNITNY IVEKCDVSRG DWVTALASVT Q1046x PFDISDIDAD ACSLSWHIPL EDGGSNITNY IVEKCDVSRG DWVTALASVT Q10466 KTSCRVGKLI PGQEYIFRVR AENRFGISEP LTSPKMVAQF PFGVPSEPKN Q1046x KTSCRVGKLI PGQEYIFRVR AENRFGISEP LTSPKMVAQF PFGVPSEPKN Q10466 ARVTKVNKDC IFVAWDRPDS DGGSPIIGYL IERKERNSLL WVKANDTLVR Q1046x ARVTKVNKDC IFVAWDRPDS DGGSPIIGYL IERKERNSLL WVKANDTLVR Q10466 STEYPCAGLV EGLEYSFRIY ALNKAGSSPP SKPTEYVTAR MPVDPPGKPE Q1046x STEYPCAGLV EGLEYSFRIY ALNKAGSSPP SKPTEYVTAR MPVDPPGKPE Q10466 VIDVTKSTVS LIWARPKHDG GSKIIGYFVE ACKLPGDKWV RCNTAPHQIP Q1046x VIDVTKSTVS LIWARPKHDG GSKIIGYFVE ACKLPGDKWV RCNTAPHQIP Q10466 QEEYTATGLE EKAQYQFRAI ARTAVNISPP SEPSDPVTIL AENVPPRIDL Q1046x QEEYTATGLE EKAQYQFRAI ARTAVNISPP SEPSDPVTIL AENVPPRIDL Q10466 SVAMKSLLTV KAGTNVCLDA TVFGKPMPTV SWKKDGTLLK PAEGIKMAMQ Q1046x SVAMKSLLTV KAGTNVCLDA TVFGKPMPTV SWKKDGTLLK PAEGIKMAMQ Q10466 RNLCTLELFS VNRKDSGDYT ITAENSSGSK SATIKLKVLD KPGPPASVKI Q1046x RNLCTLELFS VNRKDSGDYT ITAENSSGSK SATIKLKVLD KPGPPASVKI Q10466 NKMYSDRAML SWEPPLEDGG SEITNYIVDK RETSRPNWAQ VSATVPITSC Q1046x NKMYSDRAML SWEPPLEDGG SEITNYIVDK RETSRPNWAQ VSATVPITSC Q10466 SVEKLIEGHE YQFRICAENK YGVGDPVFTE PAIAKNPYDP PGRCDPPVIS Q1046x SVEKLIEGHE YQFRICAENK YGVGDPVFTE PAIAKNPYDP PGRCDPPVIS Q10466 NITKDHMTVS WKPPADDGGS PITGYLLEKR ETQAVNWTKV NRKPIIERTL Q1046x NITKDHMTVS WKPPADDGGS PITGYLLEKR ETQAVNWTKV NRKPIIERTL Q10466 KATGLQEGTE YEFRVTAINK AGPGKPSDAS KAAYARDPQY PPAPPAFPKV Q1046x KATGLQEGTE YEFRVTAINK AGPGKPSDAS KAAYARDPQY PPAPPAFPKV Q10466 YDTTRSSVSL SWGKPAYDGG SPIIGYLVEV KRADSDNWVR CNLPQNLQKT Q1046x YDTTRSSVSL SWGKPAYDGG SPIIGYLVEV KRADSDNWVR CNLPQNLQKT Q10466 RFEVTGLMED TQYQFRVYAV NKIGYSDPSD VPDKHYPKDI LIPPEGEHDA Q1046x RFEVTGLMED TQYQFRVYAV NKIGYSDPSD VPDKHYPKDI LIPPEGEHDA Q10466 DLRKTLILRA GVTMRLYVPV KGRPPPKITW SKPNVNLRDR IGLDIKSTDF Q1046x DLRKTLILRA GVTMRLYVPV KGRPPPKITW SKPNVNLRDR IGLDIKSTDF Q10466 DTFLRCENVN KYDAGKYILT LENSCGKKEY TIVVKVLDTP GPPINVTVKE Q1046x DTFLRCENVN KYDAGKYILT LENSCGKKEY TIVVKVLDTP GPPINVTVKE Q10466 ISKDSAYVTW EPPIIDGGSP IINYVVQKRD AERKSWSTVT TECSKTSFRV Q1046x ISKDSAYVTW EPPIIDGGSP IINYVVQKRD AERKSWSTVT TECSKTSFRV Q10466 PNLEEGKSYF FRVFAENEYG IGDPGETRDA VKASQTPGPV VDLKVRSVSK Q1046x PNLEEGKSYF FRVFAENEYG IGDPGETRDA VKASQTPGPV VDLKVRSVSK Q10466 SSCSIGWKKP HSDGGSRIIG YVVDFLTEEN KWQRVMKSLS LQYSAKDLTE Q1046x SSCSIGWKKP HSDGGSRIIG YVVDFLTEEN KWQRVMKSLS LQYSAKDLTE Q10466 GKEYTFRVSA ENENGEGTPS EITVVARDDV VAPDLDLKGL PDLCYLAKEN Q1046x GKEYTFRVSA ENENGEGTPS EITVVARDDV VAPDLDLKGL PDLCYLAKEN Q10466 SNFRLKIPIK GKPAPSVSWK KGEDPLATDT RVSVESSAVN TTLIVYDCQK Q1046x SNFRLKIPIK GKPAPSVSWK KGEDPLATDT RVSVESSAVN TTLIVYDCQK Q10466 SDAGKYTITL KNVAGTKEGT ISIKVVGKPG IPTGPIKFDE VTAEAMTLKW Q1046x SDAGKYTITL KNVAGTKEGT ISIKVVGKPG IPTGPIKFDE VTAEAMTLKW Q10466 APPKDDGGSE ITNYILEKRD SVNNKWVTCA SAVQKTTFRV TRLHEGMEYT Q1046x APPKDDGGSE ITNYILEKRD SVNNKWVTCA SAVQKTTFRV TRLHEGMEYT Q10466 FRVSAENKYG VGEGLKSEPI VARHPFDVPD APPPPNIVDV RHDSVSLTWT Q1046x FRVSAENKYG VGEGLKSEPI VARHPFDVPD APPPPNIVDV RHDSVSLTWT Q10466 DPKKTGGSPI TGYHLEFKER NSLLWKRANK TPIRMRDFKV TGLTEGLEYE Q1046x DPKKTGGSPI TGYHLEFKER NSLLWKRANK TPIRMRDFKV TGLTEGLEYE Q10466 FRVMAINLAG VGKPSLPSEP VVALDPIDPP GKPEVINITR NSVTLIWTEP Q1046x FRVMAINLAG VGKPSLPSEP VVALDPIDPP GKPEVINITR NSVTLIWTEP Q10466 KYDGGHKLTG YIVEKRDLPS KSWMKANHVN VPECAFTVTD LVEGGKYEFR Q1046x KYDGGHKLTG YIVEKRDLPS KSWMKANHVN VPECAFTVTD LVEGGKYEFR Q10466 IRAKNTAGAI SAPSESTETI ICKDEYEAPT IVLDPTIKDG LTIKAGDTIV Q1046x IRAKNTAGAI SAPSESTETI ICKDEYEAPT IVLDPTIKDG LTIKAGDTIV Q10466 LNAISILGKP LPKSSWSKAG KDIRPSDITQ ITSTPTSSML TIKYATRKDA Q1046x LNAISILGKP LPKSSWSKAG KDIRPSDITQ ITSTPTSSML TIKYATRKDA Q10466 GEYTITATNP FGTKVEHVKV TVLDVPGPPG PVEISNVSAE KATLTWTPPL Q1046x GEYTITATNP FGTKVEHVKV TVLDVPGPPG PVEISNVSAE KATLTWTPPL Q10466 EDGGSPIKSY ILEKRETSRL LWTVVSEDIQ SCRHVATKLI QGNEYIFRVS Q1046x EDGGSPIKSY ILEKRETSRL LWTVVSEDIQ SCRHVATKLI QGNEYIFRVS Q10466 AVNHYGKGEP VQSEPVKMVD RFGPPGPPEK PEVSNVTKNT ATVSWKRPVD Q1046x AVNHYGKGEP VQSEPVKMVD RFGPPGPPEK PEVSNVTKNT ATVSWKRPVD Q10466 DGGSEITGYH VERREKKSLR WVRAIKTPVS DLRCKVTGLQ EGSTYEFRVS Q1046x DGGSEITGYH VERREKKSLR WVRAIKTPVS DLRCKVTGLQ EGSTYEFRVS Q10466 AENRAGIGPP SEASDSVLMK DAAYPPGPPS NPHVTDTTKK SASLAWGKPH Q1046x AENRAGIGPP SEASDSVLMK DAAYPPGPPS NPHVTDTTKK SASLAWGKPH Q10466 YDGGLEITGY VVEHQKVGDE AWIKDTTGTA LRITQFVVPD LQTKEKYNFR Q1046x YDGGLEITGY VVEHQKVGDE AWIKDTTGTA LRITQFVVPD LQTKEKYNFR Q10466 ISAINDAGVG EPAVIPDVEI VEREMAPDFE LDAELRRTLV VRAGLSIRIF Q1046x ISAINDAGVG EPAVIPDVEI VEREMAPDFE LDAELRRTLV VRAGLSIRIF Q10466 VPIKGRPAPE VTWTKDNINL KNRANIENTE SFTLLIIPEC NRYDTGKFVM Q1046x VPIKGRPAPE VTWTKDNINL KNRANIENTE SFTLLIIPEC NRYDTGKFVM Q10466 TIENPAGKKS GFVNVRVLDT PGPVLNLRPT DITKDSVTLH WDLPLIDGGS Q1046x TIENPAGKKS GFVNVRVLDT PGPVLNLRPT DITKDSVTLH WDLPLIDGGS Q10466 RITNYIVEKR EATRKSYSTA TTKCHKCTYK VTGLSEGCEY FFRVMAENEY Q1046x RITNYIVEKR EATRKSYSTA TTKCHKCTYK VTGLSEGCEY FFRVMAENEY Q10466 GIGEPTETTE PVKASEAPSP PDSLNIMDIT KSTVSLAWPK PKHDGGSKIT Q1046x GIGEPTETTE PVKASEAPSP PDSLNIMDIT KSTVSLAWPK PKHDGGSKIT Q10466 GYVIEAQRKG SDQWTHITTV KGLECVVRNL TEGEEYTFQV MAVNSAGRSA Q1046x GYVIEAQRKG SDQWTHITTV KGLECVVRNL TEGEEYTFQV MAVNSAGRSA Q10466 PRESRPVIVK EQTMLPELDL RGIYQKLVIA KAGDNIKVEI PVLGRPKPTV Q1046x PRESRPVIVK EQTMLPELDL RGIYQKLVIA KAGDNIKVEI PVLGRPKPTV Q10466 TWKKGDQILK QTQRVNFETT ATSTILNINE CVRSDSGPYP LTARNIVGEV Q1046x TWKKGDQILK QTQRVNFETT ATSTILNINE CVRSDSGPYP LTARNIVGEV Q10466 GDVITIQVHD IPGPPTGPIK FDEVSSDFVT FSWDPPENDG GVPISNYVVE Q1046x GDVITIQVHD IPGPPTGPIK FDEVSSDFVT FSWDPPENDG GVPISNYVVE Q10466 MRQTDSTTWV ELATTVIRTT YKATRLTTGL EYQFRVKAQN RYGVGPGITS Q1046x MRQTDSTTWV ELATTVIRTT YKATRLTTGL EYQFRVKAQN RYGVGPGITS Q10466 AWIVANYPFK VPGPPGTPQV TAVTKDSMTI SWHEPLSDGG SPILGYHVER Q1046x AWIVANYPFK VPGPPGTPQV TAVTKDSMTI SWHEPLSDGG SPILGYHVER Q10466 KERNGILWQT VSKALVPGNI FKSSGLTDGI AYEFRVIAEN MAGKSKPSKP Q1046x KERNGILWQT VSKALVPGNI FKSSGLTDGI AYEFRVIAEN MAGKSKPSKP Q10466 SEPMLALDPI DPPGKPVPLN ITRHTVTLKW AKPEYTGGFK ITSYIVEKRD Q1046x SEPMLALDPI DPPGKPVPLN ITRHTVTLKW AKPEYTGGFK ITSYIVEKRD Q10466 LPNGRWLKAN FSNILENEFT VSGLTEDAAY EFRVIAKNAA GAISPPSEPS Q1046x LPNGRWLKAN FSNILENEFT VSGLTEDAAY EFRVIAKNAA GAISPPSEPS Q10466 DAITCRDDVE APKIKVDVKF KDTVILKAGE AFRLEADVSG RPPPTMEWSK Q1046x DAITCRDDVE APKIKVDVKF KDTVILKAGE AFRLEADVSG RPPPTMEWSK Q10466 DGKELEGTAK LEIKIADFST NLVNKDSTRR DSGAYTLTAT NPGGFAKHIF Q1046x DGKELEGTAK LEIKIADFST NLVNKDSTRR DSGAYTLTAT NPGGFAKHIF Q10466 NVKVLDRPGP PEGPLAVTEV TSEKCVLSWF PPLDDGGAKI DHYIVQKRET Q1046x NVKVLDRPGP PEGPLAVTEV TSEKCVLSWF PPLDDGGAKI DHYIVQKRET Q10466 SRLAWTNVAS EVQVTKLKVT KLLKGNEYIF RVMAVNKYGV GEPLESEPVL Q1046x SRLAWTNVAS EVQVTKLKVT KLLKGNEYIF RVMAVNKYGV GEPLESEPVL Q10466 AVNPYGPPDP PKNPEVTTIT KDSMVVCWGH PDSDGGSEII NYIVERRDKA Q1046x AVNPYGPPDP PKNPEVTTIT KDSMVVCWGH PDSDGGSEII NYIVERRDKA Q10466 GQRWIKCNKK TLTDLRYKVS GLTEGHEYEF RIMAENAAGI SAPSPTSPFY Q1046x GQRWIKCNKK TLTDLRYKVS GLTEGHEYEF RIMAENAAGI SAPSPTSPFY Q10466 KACDTVFKPG PPGNPRVLDT SRSSISIAWN KPIYDGGSEI TGYMVEIALP Q1046x KACDTVFKPG PPGNPRVLDT SRSSISIAWN KPIYDGGSEI TGYMVEIALP Q10466 EEDEWQIVTP PAGLKATSYT ITGLTENQEY KIRIYAMNSE GLGEPALVPG Q1046x EEDEWQIVTP PAGLKATSYT ITGLTENQEY KIRIYAMNSE GLGEPALVPG Q10466 TPKAEDRMLP PEIELDADLR KVVTIRACCT LRLFVPIKGR PDPEVKWARD Q1046x TPKAEDRMLP PEIELDADLR KVVTIRACCT LRLFVPIKGR PDPEVKWARD Q10466 HGESLDKASI ESASSYTLLI VGNVNRFDSG KYILTVENSS GSKSAFVNVR Q1046x HGESLDKASI ESASSYTLLI VGNVNRFDSG KYILTVENSS GSKSAFVNVR Q10466 VLDTPGPPQD LKVKEVTKTS VTLTWDPPLL DGGSKIKNYI VEKRESTRKA Q1046x VLDTPGPPQD LKVKEVTKTS VTLTWDPPLL DGGSKIKNYI VEKRESTRKA Q10466 YSTVATNCHK TSWKVDQLQE GCSYYFRVLA ENEYGIGLPA ETAESVKASE Q1046x YSTVATNCHK TSWKVDQLQE GCSYYFRVLA ENEYGIGLPA ETAESVKASE Q10466 RPLPPGKITL MDVTRNSVSL SWEKPEHDGG SRILGYIVEM QTKGSDKWAT Q1046x RPLPPGKITL MDVTRNSVSL SWEKPEHDGG SRILGYIVEM QTKGSDKWAT Q10466 CATVKVTEAT ITGLIQGEEY SFRVSAQNEK GISDPRQLSV PVIAKDLVIP Q1046x CATVKVTEAT ITGLIQGEEY SFRVSAQNEK GISDPRQLSV PVIAKDLVIP Q10466 PAFKLLFNTF TVLAGEDLKV DVPFIGRPTP AVTWHKDNVP LKQTTRVNAE Q1046x PAFKLLFNTF TVLAGEDLKV DVPFIGRPTP AVTWHKDNVP LKQTTRVNAE Q10466 STENNSLLTI KDACREDVGH YVVKLTNSAG EAIETLNVIV LDKPGPPTGP Q1046x STENNSLLTI KDACREDVGH YVVKLTNSAG EAIETLNVIV LDKPGPPTGP Q10466 VKMDEVTADS ITLSWGPPKY DGGSSINNYI VEKRDTSTTT WQIVSATVAR Q1046x VKMDEVTADS ITLSWGPPKY DGGSSINNYI VEKRDTSTTT WQIVSATVAR Q10466 TTIKACRLKT GCEYQFRIAA ENRYGKSTYL NSEPTVAQYP FKVPGPPGTP Q1046x TTIKACRLKT GCEYQFRIAA ENRYGKSTYL NSEPTVAQYP FKVPGPPGTP Q10466 VVTLSSRDSM EVQWNEPISD GGSRVIGYHL ERKERNSILW VKLNKTPIPQ Q1046x VVTLSSRDSM EVQWNEPISD GGSRVIGYHL ERKERNSILW VKLNKTPIPQ Q10466 TKFKTTGLEE GVEYEFRVSA ENIVGIGKPS KVSECYVARD PCDPPGRPEA Q1046x TKFKTTGLEE GVEYEFRVSA ENIVGIGKPS KVSECYVARD PCDPPGRPEA Q10466 IIVTRNSVTL QWKKPTYDGG SKITGYIVEK KELPEGRWMK ASFTNIIDTH Q1046x IIVTRNSVTL QWKKPTYDGG SKITGYIVEK KELPEGRWMK ASFTNIIDTH Q10466 FEVTGLVEDH RYEFRVIARN AAGVFSEPSE STGAITARDE VDPPRISMDP Q1046x FEVTGLVEDH RYEFRVIARN AAGVFSEPSE STGAITARDE VDPPRISMDP Q10466 KYKDTIVVHA GESFKVDADI YGKPIPTIQW IKGDQELSNT ARLEIKSTDF Q1046x KYKDTIVVHA GESFKVDADI YGKPIPTIQW IKGDQELSNT ARLEIKSTDF Q10466 ATSLSVKDAV RVDSGNYILK AKNVAGERSV TVNVKVLDRP GPPEGPVVIS Q1046x ATSLSVKDAV RVDSGNYILK AKNVAGERSV TVNVKVLDRP GPPEGPVVIS Q10466 GVTAEKCTLA WKPPLQDGGS DIINYIVERR ETSRLVWTVV DANVQTLSCK Q1046x GVTAEKCTLA WKPPLQDGGS DIINYIVERR ETSRLVWTVV DANVQTLSCK Q10466 VTKLLEGNEY TFRIMAVNKY GVGEPLESEP VVAKNPFVVP DAPKAPEVTT Q1046x VTKLLEGNEY TFRIMAVNKY GVGEPLESEP VVAKNPFVVP DAPKAPEVTT Q10466 VTKDSMIVVW ERPASDGGSE ILGYVLEKRD KEGIRWTRCH KRLIGELRLR Q1046x VTKDSMIVVW ERPASDGGSE ILGYVLEKRD KEGIRWTRCH KRLIGELRLR Q10466 VTGLIENHDY EFRVSAENAA GLSEPSPPSA YQKACDPIYK PGPPNNPKVI Q1046x VTGLIENHDY EFRVSAENAA GLSEPSPPSA YQKACDPIYK PGPPNNPKVI Q10466 DITRSSVFLS WSKPIYDGGC EIQGYIVEKC DVNVGEWTMC TPPTGINKTN Q1046x DITRSSVFLS WSKPIYDGGC EIQGYIVEKC DVNVGEWTMC TPPTGINKTN Q10466 IEVEKLLEKH EYNFRICAIN KAGVGEHADV PGPIIVEEKL EAPDIDLDLE Q1046x IEVEKLLEKH EYNFRICAIN KAGVGEHADV PGPIIVEEKL EAPDIDLDLE Q10466 LRKIINIRAG GSLRLFVPIK GRPTPEVKWG KVDGEIRDAA IIDVTSSFTS Q1046x LRKIINIRAG GSLRLFVPIK GRPTPEVKWG KVDGEIRDAA IIDVTSSFTS Q10466 LVLDNVNRYD SGKYTLTLEN SSGTKSAFVT VRVLDTPSPP VNLKVTEITK Q1046x LVLDNVNRYD SGKYTLTLEN SSGTKSAFVT VRVLDTPSPP VNLKVTEITK Q10466 DSVSITWEPP LLDGGSKIKN YIVEKREATR KSYAAVVTNC HKNSWKIDQL Q1046x DSVSITWEPP LLDGGSKIKN YIVEKREATR KSYAAVVTNC HKNSWKIDQL Q10466 QEGCSYYFRV TAENEYGIGL PAQTADPIKV AEVPQPPGKI TVDDVTRNSV Q1046x QEGCSYYFRV TAENEYGIGL PAQTADPIKV AEVPQPPGKI TVDDVTRNSV Q10466 SLSWTKPEHD GGSKIIQYIV EMQAKHSEKW SECARVKSLQ AVITNLTQGE Q1046x SLSWTKPEHD GGSKIIQYIV EMQAKHSEKW SECARVKSLQ AVITNLTQGE Q10466 EYLFRVVAVN EKGRSDPRSL AVPIVAKDLV IEPDVKPAFS SYSVQVGQDL Q1046x EYLFRVVAVN EKGRSDPRSL AVPIVAKDLV IEPDVKPAFS SYSVQVGQDL Q10466 KMEVPISGRP KPTITWTKDG LPLKQTTRIN VTDSLDLTTL SIKETHKDDG Q1046x KMEVPISGRP KPTITWTKDG LPLKQTTRIN VTDSLDLTTL SIKETHKDDG Q10466 GQYGITVANV VGQKTASIEI VTLDKPDPPK GPVKFDDVSA ESITLSWNPP Q1046x GQYGITVANV VGQKTASIEI VTLDKPDPPK GPVKFDDVSA ESITLSWNPP Q10466 LYTGGCQITN YIVQKRDTTT TVWDVVSATV ARTTLKVTKL KTGTEYQFRI Q1046x LYTGGCQITN YIVQKRDTTT TVWDVVSATV ARTTLKVTKL KTGTEYQFRI Q10466 FAENRYGQSF ALESDPIVAQ YPYKEPGPPG TPFATAISKD SMVIQWHEPV Q1046x FAENRYGQSF ALESDPIVAQ YPYKEPGPPG TPFATAISKD SMVIQWHEPV Q10466 NNGGSPVIGY HLERKERNSI LWTKVNKTII HDTQFKAQNL EEGIEYEFRV Q1046x NNGGSPVIGY HLERKERNSI LWTKVNKTII HDTQFKAQNL EEGIEYEFRV Q10466 YAENIVGVGK ASKNSECYVA RDPCDPPGTP EPIMVKRNEI TLQWTKPVYD Q1046x YAENIVGVGK ASKNSECYVA RDPCDPPGTP EPIMVKRNEI TLQWTKPVYD Q10466 GGSMITGYIV EKRDLPDGRW MKASFTNVIE TQFTVSGLTE DQRYEFRVIA Q1046x GGSMITGYIV EKRDLPDGRW MKASFTNVIE TQFTVSGLTE DQRYEFRVIA Q10466 KNAAGAISKP SDSTGPITAK DEVELPRISM DPKFRDTIVV NAGETFRLEA Q1046x KNAAGAISKP SDSTGPITAK DEVELPRISM DPKFRDTIVV NAGETFRLEA Q10466 DVHGKPLPTI EWLRGDKEIE ESARCEIKNT DFKALLIVKD AIRIDGGQYI Q1046x DVHGKPLPTI EWLRGDKEIE ESARCEIKNT DFKALLIVKD AIRIDGGQYI Q10466 LRASNVAGSK SFPVNVKVLD RPGPPEGPVQ VTGVTSEKCS LTWSPPLQDG Q1046x LRASNVAGSK SFPVNVKVLD RPGPPEGPVQ VTGVTSEKCS LTWSPPLQDG Q10466 GSDISHYVVE KRETSRLAWT VVASEVVTNS LKVTKLLEGN EYVFRIMAVN Q1046x GSDISHYVVE KRETSRLAWT VVASEVVTNS LKVTKLLEGN EYVFRIMAVN Q10466 KYGVGEPLES APVLMKNPFV LPGPPKSLEV TNIAKDSMTV CWNRPDSDGG Q1046x KYGVGEPLES APVLMKNPFV LPGPPKSLEV TNIAKDSMTV CWNRPDSDGG Q10466 SEIIGYIVEK RDRSGIRWIK CNKRRITDLR LRVTGLTEDH EYEFRVSAEN Q1046x SEIIGYIVEK RDRSGIRWIK CNKRRITDLR LRVTGLTEDH EYEFRVSAEN Q10466 AAGVGEPSPA TVYYKACDPV FKPGPPTNAH IVDTTKNSIT LAWGKPIYDG Q1046x AAGVGEPSPA TVYYKACDPV FKPGPPTNAH IVDTTKNSIT LAWGKPIYDG Q10466 GSEILGYVVE ICKADEEEWQ IVTPQTGLRV TRFEISKLTE HQEYKIRVCA Q1046x GSEILGYVVE ICKADEEEWQ IVTPQTGLRV TRFEISKLTE HQEYKIRVCA Q10466 LNKVGLGEAT SVPGTVKPED KLEAPELDLD SELRKGIVVR AGGSARIHIP Q1046x LNKVGLGEAT SVPGTVKPED KLEAPELDLD SELRKGIVVR AGGSARIHIP Q10466 FKGRPMPEIT WSREEGEFTD KVQIEKGVNY TQLSIDNCDR NDAGKYILKL Q1046x FKGRPMPEIT WSREEGEFTD KVQIEKGVNY TQLSIDNCDR NDAGKYILKL Q10466 ENSSGSKSAF VTVKVLDTPG PPQNLAVKEV RKDSAFLVWE PPIIDGGAKV Q1046x ENSSGSKSAF VTVKVLDTPG PPQNLAVKEV RKDSAFLVWE PPIIDGGAKV Q10466 KNYVIDKRES TRKAYANVSS KCSKTSFKVE NLTEGAIYYF RVMAENEFGV Q1046x KNYVIDKRES TRKAYANVSS KCSKTSFKVE NLTEGAIYYF RVMAENEFGV Q10466 GVPVETVDAV KAAEPPSPPG KVTLTDVSQT SASLMWEKPE HDGGSRVLGY Q1046x GVPVETVDAV KAAEPPSPPG KVTLTDVSQT SASLMWEKPE HDGGSRVLGY Q10466 VVEMQPKGTE KWSIVAESKV CNAVVTGLSS GQEYQFRVKA YNEKGKSDPR Q1046x VVEMQPKGTE KWSIVAESKV CNAVVTGLSS GQEYQFRVKA YNEKGKSDPR Q10466 VLGVPVIAKD LTIQPSLKLP FNTYSIQAGE DLKIEIPVIG RPRPNISWVK Q1046x VLGVPVIAKD LTIQPSLKLP FNTYSIQAGE DLKIEIPVIG RPRPNISWVK Q10466 DGEPLKQTTR VNVEETATST VLHIKEGNKD DFGKYTVTAT NSAGTATENL Q1046x DGEPLKQTTR VNVEETATST VLHIKEGNKD DFGKYTVTAT NSAGTATENL Q10466 SVIVLEKPGP PVGPVRFDEV SADFVVISWE PPAYTGGCQI SNYIVEKRDT Q1046x SVIVLEKPGP PVGPVRFDEV SADFVVISWE PPAYTGGCQI SNYIVEKRDT Q10466 TTTTWHMVSA TVARTTIKIT KLKTGTEYQF RIFAENRYGK SAPLDSKAVI Q1046x TTTTWHMVSA TVARTTIKIT KLKTGTEYQF RIFAENRYGK SAPLDSKAVI Q10466 VQYPFKEPGP PGTPFVTSIS KDQMLVQWHE PVNDGGTKII GYHLEQKEKN Q1046x VQYPFKEPGP PGTPFVTSIS KDQMLVQWHE PVNDGGTKII GYHLEQKEKN Q10466 SILWVKLNKT PIQDTKFKTT GLDEGLEYEF KVSAENIVGI GKPSKVSECF Q1046x SILWVKLNKT PIQDTKFKTT GLDEGLEYEF KVSAENIVGI GKPSKVSECF Q10466 VARDPCDPPG RPEAIVITRN NVTLKWKKPA YDGGSKITGY IVEKKDLPDG Q1046x VARDPCDPPG RPEAIVITRN NVTLKWKKPA YDGGSKITGY IVEKKDLPDG Q10466 RWMKASFTNV LETEFTVSGL VEDQRYEFRV IARNAAGNFS EPSDSSGAIT Q1046x RWMKASFTNV LETEFTVSGL VEDQRYEFRV IARNAAGNFS EPSDSSGAIT Q10466 ARDEIDAPNA SLDPKYKDVI VVHAGETFVL EADIRGKPIP DVVWSKDGKE Q1046x ARDEIDAPNA SLDPKYKDVI VVHAGETFVL EADIRGKPIP DVVWSKDGKE Q10466 LEETAARMEI KSTIQKTTLV VKDCIRTDGG QYILKLSNVG GTKSIPITVK Q1046x LEETAARMEI KSTIQKTTLV VKDCIRTDGG QYILKLSNVG GTKSIPITVK Q10466 VLDRPGSPEG PLKVTGVTAE KCYLAWNPPL QDGGANISHY IIEKRETSRL Q1046x VLDRPGSPEG PLKVTGVTAE KCYLAWNPPL QDGGANISHY IIEKRETSRL Q10466 SWTQVSTEVQ ALNYKVTKLL PGNEYIFRVM AVNKYGIGEP LESGPVTACN Q1046x SWTQVSTEVQ ALNYKVTKLL PGNEYIFRVM AVNKYGIGEP LESGPVTACN Q10466 PYKPPGPPST PEVSAITKDS MVVTWARPVD DGGTEIEGYI LEKRDKEGVR Q1046x PYKPPGPPST PEVSAITKDS MVVTWARPVD DGGTEIEGYI LEKRDKEGVR Q10466 WTKCNKKTLT DLRLRVTGLT EGHSYEFRVA AENAAGVGEP SEPSVFYRAC Q1046x WTKCNKKTLT DLRLRVTGLT EGHSYEFRVA AENAAGVGEP SEPSVFYRAC Q10466 DALYPPGPPS NPKVTDTSRS SVSLAWSKPI YDGGAPVKGY VVEVKEAAAD Q1046x DALYPPGPPS NPKVTDTSRS SVSLAWSKPI YDGGAPVKGY VVEVKEAAAD Q10466 EWTTCTPPTG LQGKQFTVTK LKENTEYNFR ICAINSEGVG EPATLPGSVV Q1046x EWTTCTPPTG LQGKQFTVTK LKENTEYNFR ICAINSEGVG EPATLPGSVV Q10466 AQERIEPPEI ELDADLRKVV VLRASATLRL FVTIKGRPEP EVKWEKAEGI Q1046x AQERIEPPEI ELDADLRKVV VLRASATLRL FVTIKGRPEP EVKWEKAEGI Q10466 LTDRAQIEVT SSFTMLVIDN VTRFDSGRYN LTLENNSGSK TAFVNVRVLD Q1046x LTDRAQIEVT SSFTMLVIDN VTRFDSGRYN LTLENNSGSK TAFVNVRVLD Q10466 SPSAPVNLTI REVKKDSVTL SWEPPLIDGG AKITNYIVEK RETTRKAYAT Q1046x SPSAPVNLTI REVKKDSVTL SWEPPLIDGG AKITNYIVEK RETTRKAYAT Q10466 ITNNCTKTTF RIENLQEGCS YYFRVLASNE YGIGLPAETT EPVKVSEPPL Q1046x ITNNCTKTTF RIENLQEGCS YYFRVLASNE YGIGLPAETT EPVKVSEPPL Q10466 PPGRVTLVDV TRNTATIKWE KPESDGGSKI TGYVVEMQTK GSEKWSTCTQ Q1046x PPGRVTLVDV TRNTATIKWE KPESDGGSKI TGYVVEMQTK GSEKWSTCTQ Q10466 VKTLEATISG LTAGEEYVFR VAAVNEKGRS DPRQLGVPVI ARDIEIKPSV Q1046x VKTLEATISG LTAGEEYVFR VAAVNEKGRS DPRQLGVPVI ARDIEIKPSV Q10466 ELPFHTFNVK AREQLKIDVP FKGRPQATVN WRKDGQTLKE TTRVNVSSSK Q1046x ELPFHTFNVK AREQLKIDVP FKGRPQATVN WRKDGQTLKE TTRVNVSSSK Q10466 TVTSLSIKEA SKEDVGTYEL CVSNSAGSIT VPITIIVLDR PGPPGPIRID Q1046x TVTSLSIKEA SKEDVGTYEL CVSNSAGSIT VPITIIVLDR PGPPGPIRID Q10466 EVSCDSITIS WNPPEYDGGC QISNYIVEKK ETTSTTWHIV SQAVARTSIK Q1046x EVSCDSITIS WNPPEYDGGC QISNYIVEKK ETTSTTWHIV SQAVARTSIK Q10466 IVRLTTGSEY QFRVCAENRY GKSSYSESSA VVAEYPFSPP GPPGTPKVVH Q1046x IVRLTTGSEY QFRVCAENRY GKSSYSESSA VVAEYPFSPP GPPGTPKVVH Q10466 ATKSTMLVTW QVPVNDGGSR VIGYHLEYKE RSSILWSKAN KILIADTQVK Q1046x ATKSTMLVTW QVPVNDGGSR VIGYHLEYKE RSSILWSKAN KILIADTQVK Q10466 VSGLDEGLMY EYRVYAENIA GIGKCSKSCE PVPARDPCDP PGQPEVTNIT Q1046x VSGLDEGLMY EYRVYAENIA GIGKCSKSCE PVPARDPCDP PGQPEVTNIT Q10466 RKSVSLKWSK PHYDGGAKIT GYIVERRELP DGRWLKCNYT NIQETYFEVT Q1046x RKSVSLKWSK PHYDGGAKIT GYIVERRELP DGRWLKCNYT NIQETYFEVT Q10466 ELTEDQRYEF RVFARNAADS VSEPSESTGP IIVKDDVEPP RVMMDVKFRD Q1046x ELTEDQRYEF RVFARNAADS VSEPSESTGP IIVKDDVEPP RVMMDVKFRD Q10466 VIVVKAGEVL KINADIAGRP LPVISWAKDG IEIEERARTE IISTDNHTLL Q1046x VIVVKAGEVL KINADIAGRP LPVISWAKDG IEIEERARTE IISTDNHTLL Q10466 TVKDCIRRDT GQYVLTLKNV AGTRSVAVNC KVLDKPGPPA GPLEINGLTA Q1046x TVKDCIRRDT GQYVLTLKNV AGTRSVAVNC KVLDKPGPPA GPLEINGLTA Q10466 EKCSLSWGRP QEDGGADIDY YHRKKRETSH LAWTICEGEL QMTSCKVTKL Q1046x EKCSLSWGRP QEDGGADIDY YHRKKRETSH LAWTICEGEL QMTSCKVTKL Q10466 LKGNEYIFRV TGVNKYGVGE PLESVAIKAL DPFTVPSPPT SLEITSVTKE Q1046x LKGNEYIFRV TGVNKYGVGE PLESVAIKAL DPFTVPSPPT SLEITSVTKE Q10466 SMTLCWSRPE SDGGSEISGY IIERREKNSL RWVRVNKKPV YDLRVKSTGL Q1046x SMTLCWSRPE SDGGSEISGY IIERREKNSL RWVRVNKKPV YDLRVKSTGL Q10466 REGCEYEYRV YAENAAGLSL PSETSPLIRA EDPVFLPSPP SKPKIVDSGK Q1046x REGCEYEYRV YAENAAGLSL PSETSPLIRA EDPVFLPSPP SKPKIVDSGK Q10466 TTITIAWVKP LFDGGAPITG YTVEYKKSDD TDWKTSIQSL RGTEYTISGL Q1046x TTITIAWVKP LFDGGAPITG YTVEYKKSDD TDWKTSIQSL RGTEYTISGL Q10466 TTGAEYVFRV KSVNKVGASD PSDSSDPQIA KEREEEPLFD IDSEMRKTLI Q1046x TTGAEYVFRV KSVNKVGASD PSDSSDPQIA KEREEEPLFD IDSEMRKTLI Q10466 VKAGASFTMT VPFRGRPVPN VLWSKPDTDL RTRAYVDTTD SRTSLTIENA Q1046x VKAGASFTMT VPFRGRPVPN VLWSKPDTDL RTRAYVDTTD SRTSLTIENA Q10466 NRNDSGKYTL TIQNVLSAAS LTLVVKVLDT PGPPTNITVQ DVTKESAVLS Q1046x NRNDSGKYTL TIQNVLSAAS LTLVVKVLDT PGPPTNITVQ DVTKESAVLS Q10466 WDVPENDGGA PVKNYHIEKR EASKKAWVSV TNNCNRLSYK VTNLQEGAIY Q1046x WDVPENDGGA PVKNYHIEKR EASKKAWVSV TNNCNRLSYK VTNLQEGAIY Q10466 YFRVSGENEF GVGIPAETKE GVKITEKPSP PEKLGVTSIS KDSVSLTWLK Q1046x YFRVSGENEF GVGIPAETKE GVKITEKPSP PEKLGVTSIS KDSVSLTWLK Q10466 PEHDGGSRIV HYVVEALEKG QKNWVKCAVA KSTHHVVSGL RENSEYFFRV Q1046x PEHDGGSRIV HYVVEALEKG QKNWVKCAVA KSTHHVVSGL RENSEYFFRV Q10466 FAENQAGLSD PRELLLPVLI KEQLEPPEID MKNFPSHTVY VRAGSNLKVD Q1046x FAENQAGLSD PRELLLPVLI KEQLEPPEID MKNFPSHTVY VRAGSNLKVD Q10466 IPISGKPLPK VTLSRDGVPL KATMRFNTEI TAENLTINLK ESVTADAGRY Q1046x IPISGKPLPK VTLSRDGVPL KATMRFNTEI TAENLTINLK ESVTADAGRY Q10466 EITAANSSGT TKAFINIVVL DRPGPPTGPV VISDITEESV TLKWEPPKYD Q1046x EITAANSSGT TKAFINIVVL DRPGPPTGPV VISDITEESV TLKWEPPKYD Q10466 GGSQVTNYIL LKRETSTAVW TEVSATVART MMKVMKLTTG EEYQFRIKAE Q1046x GGSQVTNYIL LKRETSTAVW TEVSATVART MMKVMKLTTG EEYQFRIKAE Q10466 NRFGISDHID SACVTVKLPY TTPGPPSTPW VTNVTRESIT VGWHEPVSNG Q1046x NRFGISDHID SACVTVKLPY TTPGPPSTPW VTNVTRESIT VGWHEPVSNG Q10466 GSAVVGYHLE MKDRNSILWQ KANKLVIRTT HFKVTTISAG LIYEFRVYAE Q1046x GSAVVGYHLE MKDRNSILWQ KANKLVIRTT HFKVTTISAG LIYEFRVYAE Q10466 NAAGVGKPSH PSEPVLAIDA CEPPRNVRIT DISKNSVSLS WQQPAFDGGS Q1046x NAAGVGKPSH PSEPVLAIDA CEPPRNVRIT DISKNSVSLS WQQPAFDGGS Q10466 KITGYIVERR DLPDGRWTKA SFTNVTETQF TISGLTQNSQ YEFRVFARNA Q1046x KITGYIVERR DLPDGRWTKA SFTNVTETQF TISGLTQNSQ YEFRVFARNA Q10466 VGSISNPSEV VGPITCIDSY GGPVIDLPLE YTEVVKYRAG TSVKLRAGIS Q1046x VGSISNPSEV VGPITCIDSY GGPVIDLPLE YTEVVKYRAG TSVKLRAGIS Q10466 GKPAPTIEWY KDDKELQTNA LVCVENTTDL ASILIKDADR LNSGCYELKL Q1046x GKPAPTIEWY KDDKELQTNA LVCVENTTDL ASILIKDADR LNSGCYELKL Q10466 RNAMASASAT IRVQILDKPG PPGGPIEFKT VTAEKITLLW RPPADDGGAK Q1046x RNAMASASAT IRVQILDKPG PPGGPIEFKT VTAEKITLLW RPPADDGGAK Q10466 ITHYIVEKRE TSRVVWSMVS EHLEECIITT TKIIKGNEYI FRVRAVNKYG Q1046x ITHYIVEKRE TSRVVWSMVS EHLEECIITT TKIIKGNEYI FRVRAVNKYG Q10466 IGEPLESDSV VAKNAFVTPG PPGIPEVTKI TKNSMTVVWS RPIADGGSDI Q1046x IGEPLESDSV VAKNAFVTPG PPGIPEVTKI TKNSMTVVWS RPIADGGSDI Q10466 SGYFLEKRDK KSLGWFKVLK ETIRDTRQKV TGLTENSDYQ YRVCAVNAAG Q1046x SGYFLEKRDK KSLGWFKVLK ETIRDTRQKV TGLTENSDYQ YRVCAVNAAG Q10466 QGPFSEPSEF YKAADPIDPP GPPAKIRIAD STKSSITLGW SKPVYDGGSA Q1046x QGPFSEPSEF YKAADPIDPP GPPAKIRIAD STKSSITLGW SKPVYDGGSA Q10466 VTGYVVEIRQ GEEEEWTTVS TKGEVRTTEY VVSNLKPGVN YYFRVSAVNC Q1046x VTGYVVEIRQ GEEEEWTTVS TKGEVRTTEY VVSNLKPGVN YYFRVSAVNC Q10466 AGQGEPIEMN EPVQAKDILE APEIDLDVAL RTSVIAKAGE DVQVLIPFKG Q1046x AGQGEPIEMN EPVQAKDILE APEIDLDVAL RTSVIAKAGE DVQVLIPFKG Q10466 RPPPTVTWRK DEKNLGSDAR YSIENTDSSS LLTIPQVTRN DTGKYILTIE Q1046x RPPPTVTWRK DEKNLGSDAR YSIENTDSSS LLTIPQVTRN DTGKYILTIE Q10466 NGVGEPKSST VSVKVLDTPA ACQKLQVKHV SRGTVTLLWD PPLIDGGSPI Q1046x NGVGEPKSST VSVKVLDTPA ACQKLQVKHV SRGTVTLLWD PPLIDGGSPI Q10466 INYVIEKRDA TKRTWSVVSH KCSSTSFKLI DLSEKTPFFF RVLAENEIGI Q1046x INYVIEKRDA TKRTWSVVSH KCSSTSFKLI DLSEKTPFFF RVLAENEIGI Q10466 GEPCETTEPV KAAEVPAPIR DLSMKDSTKT SVILSWTKPD FDGGSVITEY Q1046x GEPCETTEPV KAAEVPAPIR DLSMKDSTKT SVILSWTKPD FDGGSVITEY Q10466 VVERKGKGEQ TWSHAGISKT CEIEVSQLKE QSVLEFRVFA KNEKGLSDPV Q1046x VVERKGKGEQ TWSHAGISKT CEIEVSQLKE QSVLEFRVFA KNEKGLSDPV Q10466 TIGPITVKEL IITPEVDLSD IPGAQVTVRI GHNVHLELPY KGKPKPSISW Q1046x TIGPITVKEL IITPEVDLSD IPGAQVTVRI GHNVHLELPY KGKPKPSISW Q10466 LKDGLPLKES EFVRFSKTEN KITLSIKNAK KEHGGKYTVI LDNAVCRIAV Q1046x LKDGLPLKES EFVRFSKTEN KITLSIKNAK KEHGGKYTVI LDNAVCRIAV Q10466 PITVITLGPP SKPKGPIRFD EIKADSVILS WDVPEDNGGG EITCYSIEKR Q1046x PITVITLGPP SKPKGPIRFD EIKADSVILS WDVPEDNGGG EITCYSIEKR Q10466 ETSQTNWKMV CSSVARTTFK VPNLVKDAEY QFRVRAENRY GVSQPLVSSI Q1046x ETSQTNWKMV CSSVARTTFK VPNLVKDAEY QFRVRAENRY GVSQPLVSSI Q10466 IVAKHQFRIP GPPGKPVIYN VTSDGMSLTW DAPVYDGGSE VTGFHVEKKE Q1046x IVAKHQFRIP GPPGKPVIYN VTSDGMSLTW DAPVYDGGSE VTGFHVEKKE Q10466 RNSILWQKVN TSPISGREYR ATGLVEGLDY QFRVYAENSA GLSSPSDPSK Q1046x RNSILWQKVN TSPISGREYR ATGLVEGLDY QFRVYAENSA GLSSPSDPSK Q10466 FTLAVSPVDP PGTPDYIDVT RETITLKWNP PLRDGGSKIV GYSIEKRQGN Q1046x FTLAVSPVDP PGTPDYIDVT RETITLKWNP PLRDGGSKIV GYSIEKRQGN Q10466 ERWVRCNFTD VSECQYTVTG LSPGDRYEFR IIARNAVGTI SPPSQSSGII Q1046x ERWVRCNFTD VSECQYTVTG LSPGDRYEFR IIARNAVGTI SPPSQSSGII Q10466 MTRDENVPPI VEFGPEYFDG LIIKSGESLR IKALVQGRPV PRVTWFKDGV Q1046x MTRDENVPPI VEFGPEYFDG LIIKSGESLR IKALVQGRPV PRVTWFKDGV Q10466 EIEKRMNMEI TNVLGSTSLF VRDATRDHRG VYTVEAKNAS GSAKAEIKVK Q1046x EIEKRMNMEI TNVLGSTSLF VRDATRDHRG VYTVEAKNAS GSAKAEIKVK Q10466 VQDTPGKVVG PIRFTNITGE KMTLWWDAPL NDGCAPITHY IIEKRETSRL Q1046x VQDTPGKVVG PIRFTNITGE KMTLWWDAPL NDGCAPITHY IIEKRETSRL Q10466 AWALIEDKCE AQSYTAIKLI NGNEYQFRVS AVNKFGVGRP LDSDPVVAQI Q1046x AWALIEDKCE AQSYTAIKLI NGNEYQFRVS AVNKFGVGRP LDSDPVVAQI Q10466 QYTVPDAPGI PEPSNITGNS ITLTWARPES DGGSEIQQYI LERREKKSTR Q1046x QYTVPDAPGI PEPSNITGNS ITLTWARPES DGGSEIQQYI LERREKKSTR Q10466 WVKVISKRPI SETRFKVTGL TEGNEYEFHV MAENAAGVGP ASGISRLIKC Q1046x WVKVISKRPI SETRFKVTGL TEGNEYEFHV MAENAAGVGP ASGISRLIKC Q10466 REPVNPPGPP TVVKVTDTSK TTVSLEWSKP VFDGGMEIIG YIIEMCKTDL Q1046x REPVNPPGPP TVVKVTDTSK TTVSLEWSKP VFDGGMEIIG YIIEMCKTDL Q10466 GDWHKVNAEA CVKTRYTVTD LQAGEEYKFR VSAINGAGKG DSCEVTGTIK Q1046x GDWHKVNAEA CVKTRYTVTD LQAGEEYKFR VSAINGAGKG DSCEVTGTIK Q10466 AVDRLTAPEL DIDANFKQTH VVRAGASIRL FIAYQGRPTP TAVWSKPDSN Q1046x AVDRLTAPEL DIDANFKQTH VVRAGASIRL FIAYQGRPTP TAVWSKPDSN Q10466 LSLRADIHTT DSFSTLTVEN CNRNDAGKYT LTVENNSGSK SITFTVKVLD Q1046x LSLRADIHTT DSFSTLTVEN CNRNDAGKYT LTVENNSGSK SITFTVKVLD Q10466 TPGPPGPITF KDVTRGSATL MWDAPLLDGG ARIHHYVVEK REASRRSWQV Q1046x TPGPPGPITF KDVTRGSATL MWDAPLLDGG ARIHHYVVEK REASRRSWQV Q10466 ISEKCTRQIF KVNDLAEGVP YYFRVSAVNE YGVGEPYEMP EPIVATEQPA Q1046x ISEKCTRQIF KVNDLAEGVP YYFRVSAVNE YGVGEPYEMP EPIVATEQPA Q10466 PPRRLDVVDT SKSSAVLAWL KPDHDGGSRI TGYLLEMRQK GSDLWVEAGH Q1046x PPRRLDVVDT SKSSAVLAWL KPDHDGGSRI TGYLLEMRQK GSDLWVEAGH Q10466 TKQLTFTVER LVEKTEYEFR VKAKNDAGYS EPREAFSSVI IKEPQIEPTA Q1046x TKQLTFTVER LVEKTEYEFR VKAKNDAGYS EPREAFSSVI IKEPQIEPTA Q10466 DLTGITNQLI TCKAGSPFTI DVPISGRPAP KVTWKLEEMR LKETDRVSIT Q1046x DLTGITNQLI TCKAGSPFTI DVPISGRPAP KVTWKLEEMR LKETDRVSIT Q10466 TTKDRTTLTV KDSMRGDSGR YFLTLENTAG VKTFSVTVVV IGRPGPVTGP Q1046x TTKDRTTLTV KDSMRGDSGR YFLTLENTAG VKTFSVTVVV IGRPGPVTGP Q10466 IEVSSVSAES CVLSWGEPKD GGGTEITNYI VEKRESGTTA WQLVNSSVKR Q1046x IEVSSVSAES CVLSWGEPKD GGGTEITNYI VEKRESGTTA WQLVNSSVKR Q10466 TQIKVTHLTK YMEYSFRVSS ENRFGVSKPL ESAPIIAEHP FVPPSAPTRP Q1046x TQIKVTHLTK YMEYSFRVSS ENRFGVSKPL ESAPIIAEHP FVPPSAPTRP Q10466 EVYHVSANAM SIRWEEPYHD GGSKIIGYWV EKKERNTILW VKENKVPCLE Q1046x EVYHVSANAM SIRWEEPYHD GGSKIIGYWV EKKERNTILW VKENKVPCLE Q10466 CNYKVTGLVE GLEYQFRTYA LNAAGVSKAS EASRPIMAQN PVDAPGRPEV Q1046x CNYKVTGLVE GLEYQFRTYA LNAAGVSKAS EASRPIMAQN PVDAPGRPEV Q10466 TDVTRSTVSL IWSAPAYDGG SKVVGYIIER KPVSEVGDGR WLKCNYTIVS Q1046x TDVTRSTVSL IWSAPAYDGG SKVVGYIIER KPVSEVGDGR WLKCNYTIVS Q10466 DNFFTVTALS EGDTYEFRVL AKNAAGVISK GSESTGPVTC RDEYAPPKAE Q1046x DNFFTVTALS EGDTYEFRVL AKNAAGVISK GSESTGPVTC RDEYAPPKAE Q10466 LDARLHGDLV TIRAGSDLVL DAAVGGKPEP KIIWTKGDKE LDLCEKVSLQ Q1046x LDARLHGDLV TIRAGSDLVL DAAVGGKPEP KIIWTKGDKE LDLCEKVSLQ Q10466 YTGKRATAVI KFCDRSDSGK YTLTVKNASG TKAVSVMVKV LDSPGPCGKL Q1046x YTGKRATAVI KFCDRSDSGK YTLTVKNASG TKAVSVMVKV LDSPGPCGKL Q10466 TVSRVTQEKC TLAWSLPQED GGAEITHYIV ERRETSRLNW VIVEGECPTL Q1046x TVSRVTQEKC TLAWSLPQED GGAEITHYIV ERRETSRLNW VIVEGECPTL Q10466 SYVVTRLIKN NEYIFRVRAV NKYGPGVPVE SEPIVARNSF TIPSPPGIPE Q1046x SYVVTRLIKN NEYIFRVRAV NKYGPGVPVE SEPIVARNSF TIPSPPGIPE Q10466 EVGTGKEHII IQWTKPESDG GNEISNYLVD KREKESLRWT RVNKDYVVYD Q1046x EVGTGKEHII IQWTKPESDG GNEISNYLVD KREKESLRWT RVNKDYVVYD Q10466 TRLKVTSLME GCDYQFRVTA VNAAGNSEPS ERSNFISCRE PSYTPGPPSA Q1046x TRLKVTSLME GCDYQFRVTA VNAAGNSEPS ERSNFISCRE PSYTPGPPSA Q10466 PRVVDTTKHS ISLAWTKPMY DGGTDIVGYV LEMQEKDTDQ WYRVHTNATI Q1046x PRVVDTTKHS ISLAWTKPMY DGGTDIVGYV LEMQEKDTDQ WYRVHTNATI Q10466 RNTEFTVPDL KMGQKYSFRV AAVNVKGMSE YSESIAEIEP VERIEIPDLE Q1046x RNTEFTVPDL KMGQKYSFRV AAVNVKGMSE YSESIAEIEP VERIEIPDLE Q10466 LADDLKKTVT IRAGASLRLM VSVSGRPPPV ITWSKQGIDL ASRAIIDTTE Q1046x LADDLKKTVT IRAGASLRLM VSVSGRPPPV ITWSKQGIDL ASRAIIDTTE Q10466 SYSLLIVDKV NRYDAGKYTI EAENQSGKKS ATVLVKVYDT PGPCPSVKVK Q1046x SYSLLIVDKV NRYDAGKYTI EAENQSGKKS ATVLVKVYDT PGPCPSVKVK Q10466 EVSRDSVTIT WEIPTIDGGA PINNYIVEKR EAAMRAFKTV TTKCSKTLYR Q1046x EVSRDSVTIT WEIPTIDGGA PINNYIVEKR EAAMRAFKTV TTKCSKTLYR Q10466 ISGLVEGTMH YFRVLPENIY GIGEPCETSD AVLVSEVPLV PAKLEVVDVT Q1046x ISGLVEGTMH YFRVLPENIY GIGEPCETSD AVLVSEVPLV PAKLEVVDVT Q10466 KSTVTLAWEK PLYDGGSRLT GYVLEACKAG TERWMKVVTL KPTVLEHTVT Q1046x KSTVTLAWEK PLYDGGSRLT GYVLEACKAG TERWMKVVTL KPTVLEHTVT Q10466 SLNEGEQYLF RIRAQNEKGV SEPRETVTAV TVQDLRVLPT IDLSTMPQKT Q1046x SLNEGEQYLF RIRAQNEKGV SEPRETVTAV TVQDLRVLPT IDLSTMPQKT Q10466 IHVPAGRPVE LVIPIAGRPP PAASWFFAGS KLRESERVTV ETHTKVAKLT Q1046x IHVPAGRPVE LVIPIAGRPP PAASWFFAGS KLRESERVTV ETHTKVAKLT Q10466 IRETTIRDTG EYTLELKNVT GTTSETIKVI ILDKPGPPTG PIKIDEIDAT Q1046x IRETTIRDTG EYTLELKNVT GTTSETIKVI ILDKPGPPTG PIKIDEIDAT Q10466 SITISWEPPE LDGGAPLSGY VVEQRDAHRP GWLPVSESVT RSTFKFTRLT Q1046x SITISWEPPE LDGGAPLSGY VVEQRDAHRP GWLPVSESVT RSTFKFTRLT Q10466 EGNEYVFRVA ATNRFGIGSY LQSEVIECRS SIRIPGPPET LQIFDVSRDG Q1046x EGNEYVFRVA ATNRFGIGSY LQSEVIECRS SIRIPGPPET LQIFDVSRDG Q10466 MTLTWYPPED DGGSQVTGYI VERKEVRADR WVRVNKVPVT MTRYRSTGLT Q1046x MTLTWYPPED DGGSQVTGYI VERKEVRADR WVRVNKVPVT MTRYRSTGLT Q10466 EGLEYEHRVT AINARGSGKP SRPSKPIVAM DPIAPPGKPQ NPRVTDTTRT Q1046x EGLEYEHRVT AINARGSGKP SRPSKPIVAM DPIAPPGKPQ NPRVTDTTRT Q10466 SVSLAWSVPE DEGGSKVTGY LIEMQKVDQH EWTKCNTTPT KIREYTLTHL Q1046x SVSLAWSVPE DEGGSKVTGY LIEMQKVDQH EWTKCNTTPT KIREYTLTHL Q10466 PQGAEYRFRV LACNAGGPGE PAEVPGTVKV TEMLEYPDYE LDERYQEGIF Q1046x PQGAEYRFRV LACNAGGPGE PAEVPGTVKV TEMLEYPDYE LDERYQEGIF Q10466 VRQGGVIRLT IPIKGKPFPI CKWTKEGQDI SKRAMIATSE THTELVIKEA Q1046x VRQGGVIRLT IPIKGKPFPI CKWTKEGQDI SKRAMIATSE THTELVIKEA Q10466 DRGDSGTYDL VLENKCGKKA VYIKVRVIGS PNSPEGPLEY DDIQVRSVRV Q1046x DRGDSGTYDL VLENKCGKKA VYIKVRVIGS PNSPEGPLEY DDIQVRSVRV Q10466 SWRPPADDGG ADILGYILER REVPKAAWYT IDSRVRGTSL VVKGLKENVE Q1046x SWRPPADDGG ADILGYILER REVPKAAWYT IDSRVRGTSL VVKGLKENVE Q10466 YHFRVSAENQ FGISKPLKSE EPVTPKTPLN PPEPPSNPPE VLDVTKSSVS Q1046x YHFRVSAENQ FGISKPLKSE EPVTPKTPLN PPEPPSNPPE VLDVTKSSVS Q10466 LSWSRPKDDG GSRVTGYYIE RKETSTDKVV RHNKTQITTT MYTVTGLVPD Q1046x LSWSRPKDDG GSRVTGYYIE RKETSTDKVV RHNKTQITTT MYTVTGLVPD Q10466 AEYQFRIIAQ NDVGLSETSP ASEPVVCKDP FDKPSQPGEL EILSISKDSV Q1046x AEYQFRIIAQ NDVGLSETSP ASEPVVCKDP FDKPSQPGEL EILSISKDSV Q10466 TLQWEKPECD GGKEILGYWV EYRQSGDSAW KKSNKERIKD KQFTIGGLLE Q1046x TLQWEKPECD GGKEILGYWV EYRQSGDSAW KKSNKERIKD KQFTIGGLLE Q10466 ATEYEFRVFA ENETGLSRPR RTAMSIKTKL TSGEAPGIRK EMKDVTTKLG Q1046x ATEYEFRVFA ENETGLSRPR RTAMSIKTKL TSGEAPGIRK EMKDVTTKLG Q10466 EAAQLSCQIV GRPLPDIKWY RFGKELIQSR KYKMSSDGRT HTLTVMTEEQ Q1046x EAAQLSCQIV GRPLPDIKWY RFGKELIQSR KYKMSSDGRT HTLTVMTEEQ Q10466 EDEGVYTCIA TNEVGEVETS SKLLLQATPQ FHPGYPLKEK YYGAVGSTLR Q1046x EDEGVYTCIA TNEVGEVETS SKLLLQATPQ FHPGYPLKEK YYGAVGSTLR Q10466 LHVMYIGRPV PAMTWFHGQK LLQNSENITI ENTEHYTHLV MKNVQRKTHA Q1046x LHVMYIGRPV PAMTWFHGQK LLQNSENITI ENTEHYTHLV MKNVQRKTHA Q10466 GKYKVQLSNV FGTVDAILDV EIQDKPDKPT GPIVIEALLK NSAVISWKPP Q1046x GKYKVQLSNV FGTVDAILDV EIQDKPDKPT GPIVIEALLK NSAVISWKPP Q10466 ADDGGSWITN YVVEKCEAKE GAEWQLVSSA ISVTTCRIVN LTENAGYYFR Q1046x ADDGGSWITN YVVEKCEAKE GAEWQLVSSA ISVTTCRIVN LTENAGYYFR Q10466 VSAQNTFGIS DPLEVSSVVI IKSPFEKPGA PGKPTITAVT KDSCVVAWKP Q1046x VSAQNTFGIS DPLEVSSVVI IKSPFEKPGA PGKPTITAVT KDSCVVAWKP Q10466 PASDGGAKIR NYYLEKREKK QNKWISVTTE EIRETVFSVK NLIEGLEYEF Q1046x PASDGGAKIR NYYLEKREKK QNKWISVTTE EIRETVFSVK NLIEGLEYEF Q10466 RVKCENLGGE SEWSEISEPI TPKSDVPIQA PHFKEELRNL NVRYQSNATL Q1046x RVKCENLGGE SEWSEISEPI TPKSDVPIQA PHFKEELRNL NVRYQSNATL Q10466 VCKVTGHPKP IVKWYRQGKE IIADGLKYRI QEFKGGYHQL IIASVTDDDA Q1046x VCKVTGHPKP IVKWYRQGKE IIADGLKYRI QEFKGGYHQL IIASVTDDDA Q10466 TVYQVRATNQ GGSVSGTASL EVEVPAKIHL PKTLEGMGAV HALRGEVVSI Q1046x TVYQVRATNQ GGSVSGTASL EVEVPAKIHL PKTLEGMGAV HALRGEVVSI Q10466 KIPFSGKPDP VITWQKGQDL IDNNGHYQVI VTRSFTSLVF PNGVERKDAG Q1046x KIPFSGKPDP VITWQKGQDL IDNNGHYQVI VTRSFTSLVF PNGVERKDAG Q10466 FYVVCAKNRF GIDQKTVELD VADVPDPPRG VKVSDASRDS VNLTWTEPAS Q1046x FYVVCAKNRF GIDQKTVELD VADVPDPPRG VKVSDASRDS VNLTWTEPAS Q10466 DGGSKITNYI VEKCATTAER WLRVGQARET RYTVINLFGK TSYQFRVIAE Q1046x DGGSKITNYI VEKCATTAER WLRVGQARET RYTVINLFGK TSYQFRVIAE Q10466 NKFGLSKPSE PSEPTITKED KTRAMNYDEE VDETREVSMT KASHSSTKEL Q1046x NKFGLSKPSE PSEPTITKED KTRAMNYDEE VDETREVSMT KASHSSTKEL Q10466 YEKYMIAEDL GRGEFGIVHR CVETSSKKTY MAKFVKVKGT DQVLVKKEIS Q1046x YEKYMIAEDL GRGEFGIVHR CVETSSKKTY MAKFVKVKGT DQVLVKKEIS Q10466 ILNIARHRNI LHLHESFESM EELVMIFEFI SGLDIFERIN TSAFELNERE Q1046x ILNIARHRNI LHLHESFESM EELVMIFEFI SGLDIFERIN TSAFELNERE Q10466 IVSYVHQVCE ALQFLHSHNI GHFDIRPENI IYQTRRSSTI KIIEFGQARQ Q1046x IVSYVHQVCE ALQFLHSHNI GHFDIRPENI IYQTRRSSTI KIIEFGQARQ Q10466 LKPGDNFRLL FTAPEYYAPE VHQHDVVSTA TDMWSLGTLV YVLLSGINPF Q1046x LKPGDNFRLL FTAPEYYAPE VHQHDVVSTA TDMWSLGTLV YVLLSGINPF Q10466 LAETNQQIIE NIMNAEYTFD EEAFKEISIE AMDFVDRLLV KERKSRMTAS Q1046x LAETNQQIIE NIMNAEYTFD EEAFKEISIE AMDFVDRLLV KERKSRMTAS Q10466 EALQHPWLKQ KIERVSTKVI RTLKHRRYYH TLIKKDLNMV VSAARISCGG Q1046x EALQHPWLKQ KIERVSTKVI RTLKHRRYYH TLIKKDLNMV VSAARISCGG Q10466 AIRSQKGVSV AKVKVASIEI GPVSGQIMHA VGEEGGHVKY VCKIENYDQS Q1046x AIRSQKGVSV AKVKVASIEI GPVSGQIMHA VGEEGGHVKY VCKIENYDQS Q10466 TQVTWYFGVR QLENSEKYEI TYEDGVAILY VKDITKLDDG TYRCKVVNDY Q1046x TQVTWYFGVR QLENSEKYEI TYEDGVAILY VKDITKLDDG TYRCKVVNDY Q10466 GEDSSYAELF VKGVREVYDY YCRRTMKKIK RRTDTMRLLE RPPEFTLPLY Q1046x GEDSSYAELF VKGVREVYDY YCRRTMKKIK RRTDTMRLLE RPPEFTLPLY Q10466 NKTAYVGENV RFGVTITVHP EPHVTWYKSG QKIKPGDNDK KYTFESDKGL Q1046x NKTAYVGENV RFGVTITVHP EPHVTWYKSG QKIKPGDNDK KYTFESDKGL Q10466 YQLTINSVTT DDDAEYTVVA RNKYGEDSCK AKLTVTLHPP PTDSTLRPMF Q1046x YQLTINSVTT DDDAEYTVVA RNKYGEDSCK AKLTVTLHPP PTDSTLRPMF Q10466 KRLLANAECQ EGQSVCFEIR VSGIPPPTLK WEKDGQPLSL GPNIEIIHEG Q1046x KRLLANAECQ EGQSVCFEIR VSGIPPPTLK WEKDGQPLSL GPNIEIIHEG Q10466 LDYYALHIRD TLPEDTGYYR VTATNTAGST SCQAHLQVER LRYKKQEFKS Q1046x LDYYALHIRD TLPEDTGYYR VTATNTAGST SCQAHLQVER LRYKKQEFKS Q10466 KEEHERHVQK QIDKTLRMAE ILSGTESVPL TQVAKEALRE AAVLYKPAVS Q1046x KEEHERHVQK QIDKTLRMAE ILSGTESVPL TQVAKEALRE AAVLYKPAVS Q10466 TKTVKGEFRL EIEEKKEERK LRMPYDVPEP RKYKQTTIEE DQRIKQFVPM Q1046x TKTVKGEFRL EIEEKKEERK LRMPYDVPEP RKYKQTTIEE DQRIKQFVPM Q10466 SDMKWYKKIR DQYEMPGKLD RVVQKRPKRI RLSRWEQFYV MPLPRITDQY Q1046x SDMKWYKKIR DQYEMPGKLD RVVQKRPKRI RLSRWEQFYV MPLPRITDQY Q10466 RPKWRIPKLS QDDLEIVRPA RRRTPSPDYD FYYRPRRRSL GDISDEELLL Q1046x RPKWRIPKLS QDDLEIVRPA RRRTPSPDYD FYYRPRRRSL GDISDEELLL Q10466 PIDDYLAMKR TEEERLRLEE ELELGFSASP PSRSPPHFEL SSLRYSSPQA Q1046x PIDDYLAMKR TEEERLRLEE ELELGFSASP PSRSPPHFEL SSLRYSSPQA Q10466 HVKVEETRKN FRYSTYHIPT KAEASTSYAE LRERHAQAAY RQPKQRQRIM Q1046x HVKVEETRKN FRYSTYHIPT KAEASTSYAE LRERHAQAAY RQPKQRQRIM Q10466 AEREDEELLR PVTTTQHLSE YKSELDFMSK EEKSRKKSRR QREVTEITEI Q1046x AEREDEELLR PVTTTQHLSE YKSELDFMSK EEKSRKKSRR QREVTEITEI Q10466 EEEYEISKHA QRESSSSASR LLRRRRSLSP TYIELMRPVS ELIRSRPQPA Q1046x EEEYEISKHA QRESSSSASR LLRRRRSLSP TYIELMRPVS ELIRSRPQPA Q10466 EEYEDDTERR SPTPERTRPR SPSPVSSERS LSRFERSARF DIFSRYESMK Q1046x EEYEDDTERR SPTPERTRPR SPSPVSSERS LSRFERSARF DIFSRYESMK Q10466 AALKTQKTSE RKYEVLSQQP FTLDHAPRIT LRMRSHRVPC GQNTRFILNV Q1046x AALKTQKTSE RKYEVLSQQP FTLDHAPRIT LRMRSHRVPC GQNTRFILNV Q10466 QSKPTAEVKW YHNGVELQES SKIHYTNTSG VLTLEILDCH TDDSGTYRAV Q1046x QSKPTAEVKW YHNGVELQES SKIHYTNTSG VLTLEILDCH TDDSGTYRAV Q10466 CTNYKGEASD YATLDVTGGD YTTYASQRRD EEVPRSVFPE LTRTEAYAVP Q1046x CTNYKGEASD YATLDVTGGD YTTYASQRRD EEVPRSVFPE LTRTEAYAVP Q10466 SFKKTSEMEA SSSVREVKSQ MTETRESLSS YEHSASAEMK SAALEEKSLE Q1046x SFKKTSEMEA SSSVREVKSQ MTETRESLSS YEHSASAEMK SAALEEKSLE Q10466 EKSTTRKIKT TLAARILTKP RSMTVYEGES ARFSCDTDGE PVPTVTWLRK Q1046x EKSTTRKIKT TLAARILTKP RSMTVYEGES ARFSCDTDGE PVPTVTWLRK Q10466 GQVLSTSARH QVTTTKYKST FEISSVQASD EGNYSVVVEN SEGKQEAEFT Q1046x GQVLSTSARH QVTTTKYKST FEISSVQASD EGNYSVVVEN SEGKQEAEFT Q10466 LTIQKARVTE KAVTSPPRVK SPEPRVKSPE AVKSPKRVKS PEPSHPKAVS Q1046x LTIQKARVTE KAVTSPPRVK SPEPRVKSPE AVKSPKRVKS PEPSHPKAVS Q10466 PTETKPTPRE KVQHLPVSAP PKITQFLKAE ASKEIAKLTC VVESSVLRAK Q1046x PTETKPTPRE KVQHLPVSAP PKITQFLKAE ASKEIAKLTC VVESSVLRAK Q10466 EVTWYKDGKK LKENGHFQFH YSADGTYELK INNLTESDQG EYVCEISGEG Q1046x EVTWYKDGKK LKENGHFQFH YSADGTYELK INNLTESDQG EYVCEISGEG Q10466 GTSKTNLQFM GQAFKSIHEK VSKISETKKS DQKTTESTVT RKTEPKAPEP Q1046x GTSKTNLQFM GQAFKSIHEK VSKISETKKS DQKTTESTVT RKTEPKAPEP Q10466 ISSKPVIVTG LQDTTVSSDS VAKFAVKATG EPRPTAIWTK DGKAITQGGK Q1046x ISSKPVIVTG LQDTTVSSDS VAKFAVKATG EPRPTAIWTK DGKAITQGGK Q10466 YKLSEDKGGF FLEIHKTDTS DSGLYTCTVK NSAGSVSSSC KLTIKAIKDT Q1046x YKLSEDKGGF FLEIHKTDTS DSGLYTCTVK NSAGSVSSSC KLTIKAIKDT Q10466 EAQKVSTQKT SEITPQKKAV VQEEISQKAL RSEEIKMSEA KSQEKLALKE Q1046x EAQKVSTQKT SEITPQKKAV VQEEISQKAL RSEEIKMSEA KSQEKLALKE Q10466 EASKVLISEE VKKSAATSLE KSIVHEEITK TSQASEEVRT HAEIKAFSTQ Q1046x EASKVLISEE VKKSAATSLE KSIVHEEITK TSQASEEVRT HAEIKAFSTQ Q10466 MSINEGQRLV LKANIAGATD VKWVLNGVEL TNSEEYRYGV SGSDQTLTIK Q1046x MSINEGQRLV LKANIAGATD VKWVLNGVEL TNSEEYRYGV SGSDQTLTIK Q10466 QASHRDEGIL TCISKTKEGI VKCQYDLTLS KELSDAPAFI SQPRSQNINE Q1046x QASHRDEGIL TCISKTKEGI VKCQYDLTLS KELSDAPAFI SQPRSQNINE Q10466 GQNVLFTCEI SGEPSPEIEW FKNNLPISIS SNVSISRSRN VYSLEIRNAS Q1046x GQNVLFTCEI SGEPSPEIEW FKNNLPISIS SNVSISRSRN VYSLEIRNAS Q10466 VSDSGKYTIK AKNFRGQCSA TASLMVLPLV EEPSREVVLR TSGDTSLQGS Q1046x VSDSGKYTIK AKNFRGQCSA TASLMVLPLV EEPSREVVLR TSGDTSLQGS Q10466 FSSQSVQMSA SKQEASFSSF SSSSASSMTE MKFASMSAQS MSSMQESFVE Q1046x FSSQSVQMSA SKQEASFSSF SSSSASSMTE MKFASMSAQS MSSMQESFVE Q10466 MSSSSFMGIS NMTQLESSTS KMLKAGIRGI PPKIEALPSD ISIDEGKVLT Q1046x MSSSSFMGIS NMTQLESSTS KMLKAGIRGI PPKIEALPSD ISIDEGKVLT Q10466 VACAFTGEPT PEVTWSCGGR KIHSQEQGRF HIENTDDLTT LIIMDVQKQD Q1046x VACAFTGEPT PEVTWSCGGR KIHSQEQGRF HIENTDDLTT LIIMDVQKQD Q10466 GGLYTLSLGN EFGSDSATVN IHIRSI Q1046x GGLYTLSLGN EFGSDSATVN IHIRSI profphd-utils-1.0.10/maxhom.common0000755015075101507510000004564612012371466016416 0ustar lkajanlkajanC parallel stuff INTEGER IDPROC,NPROC,NWORKER,NWORKSET,ID_HOST,LINK(0:MAXPROC) COMMON/CPARALLEL_1/IDPROC,NPROC,NWORKER,NWORKSET,ID_HOST,LINK INTEGER WORKSETEND(MAXPROC),IDTOP,MSGTYPE,NSEQ_WARM_START COMMON/CPARALLEL_2/WORKSETEND,IDTOP,MSGTYPE,NSEQ_WARM_START INTEGER LINK_HOST,LINK_NODE_SENDER,LINK_NODE_RECEIVER COMMON/CPARALLEL_3/LINK_HOST,LINK_NODE_SENDER,LINK_NODE_RECEIVER INTEGER SENDER_NODE(MAXPROC),RECEIVER_NODE(MAXPROC) COMMON/CPARALLEL_4/SENDER_NODE,RECEIVER_NODE INTEGER WORKSETSIZE(MAXPROC),WORKSETBEG(MAXPROC) COMMON/CPARALLEL_5/WORKSETSIZE,WORKSETBEG CHARACTER*200 HOST_FILE CHARACTER*30 NODE_NAME COMMON/CPARALLEL_6/HOST_FILE,NODE_NAME CHARACTER*20 MP_MODEL COMMON/CPARALLEL_7/MP_MODEL c logical lmixed_arch,lsmall_machine c common/cparallel_8/lmixed_arch,lsmall_machine LOGICAL LSMALL_MACHINE COMMON/CPARALLEL_8/LSMALL_MACHINE LOGICAL L3WAY,L3WAYDONE,LWARM_START,LFIRST_SCAN COMMON/C3WAY/L3WAY,L3WAYDONE,LWARM_START,LFIRST_SCAN INTEGER N_ONE COMMON/C_ONE/N_ONE CHARACTER CDATABASE_BUFFER(MAXDATABASE_BUFFER) CHARACTER CBUFFER_LINE*(MAXBUFFER_LINE) COMMON/CWARMSTART1/CDATABASE_BUFFER,CBUFFER_LINE INTEGER NBUFFER_LEN COMMON/CWARMSTART2/NBUFFER_LEN C TIMING INTEGER ITIME_OLD(3),ITIME_NEW(3) COMMON/CTIMING1/ITIME_OLD,ITIME_NEW REAL TOTAL_TIME COMMON/CTIMING2/TOTAL_TIME C NUMBER OF RESIDUES IN SEQ 1 INTEGER N1 COMMON/CNUMRES1/N1 INTEGER N2IN COMMON/CNUMRES2/N2IN C attributes of sequences CAUTION if you change the length of "compnd" and/or ACCESSION,pdbref C do it also in "get_swiss_entry" AND in the "split_db" program CHARACTER*200 NAME_1,NAME_2 CHARACTER*200 COMPND_1,COMPND_2,SOURCE_1,SOURCE_2, + AUTHOR_1,AUTHOR_2 CHARACTER*40 HEADER_1,HEADER_2 CHARACTER*12 ACCESSION_1,ACCESSION_2,PDBREF_1,PDBREF_2 COMMON/CSEQFILES1/NAME_1,NAME_2,COMPND_1,COMPND_2,SOURCE_1 COMMON/CSEQFILES2/SOURCE_2,AUTHOR_1,AUTHOR_2,HEADER_1,HEADER_2 COMMON/CSEQFILES3/ACCESSION_1,ACCESSION_2,PDBREF_1,PDBREF_2 CHARACTER CSQ_1*(MAXSQ),CSQ_2*(MAXSQ) c LENGTH*7 !! CHARACTER*7 COLS_1(MAXSQ),COLS_2(MAXSQ) CHARACTER*1 STRUC_1(MAXSQ),STRUC_2(MAXSQ),CHAINID_1(MAXSQ) CHARACTER STRCLASS_1*(MAXSQ),STRCLASS_2*(MAXSQ) CHARACTER*1 CHAINID_2(MAXSQ),SHEETLABEL_1(MAXSQ) CHARACTER*1 SHEETLABEL_2(MAXSQ) COMMON/CSEQUENCE1/CSQ_1,CSQ_2,COLS_1,COLS_2,STRUC_1,STRUC_2 COMMON/CSEQUENCE2/STRCLASS_1,STRCLASS_2,CHAINID_1,CHAINID_2 COMMON/CSEQUENCE3/SHEETLABEL_1,SHEETLABEL_2 C AUTION LSTRUC IS NOT JUST 3 STATES C NUSRF ACCESSIBLE SURFACE AREA INTEGER LSQ_1(MAXSQ),LSQ_2(MAXSQ) INTEGER LSTRUC_1(MAXSQ),LSTRUC_2(MAXSQ) INTEGER LSTRCLASS_1(MAXSQ),LSTRCLASS_2(MAXSQ) INTEGER NSURF_1(MAXSQ),NSURF_2(MAXSQ) INTEGER LACC_1(MAXSQ),LACC_2(MAXSQ) INTEGER PDBNO_1(MAXSQ),PDBNO_2(MAXSQ) INTEGER NOCC_1(MAXSQ),NOCC_2(MAXSQ) INTEGER BP1_1(MAXSQ),BP1_2(MAXSQ),BP2_1(MAXSQ),BP2_2(MAXSQ) INTEGER NCHAIN_1,NCHAINUSED COMMON/CSEQUENCE4/LSQ_1,LSQ_2,LSTRUC_1,LSTRUC_2,LSTRCLASS_1 COMMON/CSEQUENCE5/LSTRCLASS_2,NSURF_1,NSURF_2,LACC_1,LACC_2 COMMON/CSEQUENCE6/PDBNO_1,PDBNO_2,NOCC_1,NOCC_2,BP1_1,BP1_2 COMMON/CSEQUENCE7/BP2_1,BP2_2,NCHAIN_1,NCHAINUSED C ALIGNMENT ATTRIBUTES OF THE MAXHITS BEST ALIGNMENTS REAL AL_VAL(MAXHITS) REAL AL_STRHOM(MAXHITS) REAL AL_VPERRES(MAXHITS) REAL AL_RMS(MAXHITS) REAL AL_HOM(MAXHITS),AL_SIM(MAXHITS) REAL AL_ENTROPY(MAXSQ) REAL AL_SDEV(MAXHITS) INTEGER AL_LEN(MAXHITS),AL_HOMLEN(MAXHITS),AL_NGAP(MAXHITS) INTEGER AL_LGAP(MAXHITS),AL_LSEQ_2(MAXHITS),AL_IFIRST(MAXHITS) INTEGER AL_ILAST(MAXHITS),AL_JFIRST(MAXHITS),AL_JLAST(MAXHITS) INTEGER AL_NDELETION(MAXSQ),AL_NINS(MAXSQ) INTEGER AL_VARIABILITY(MAXSQ),AL_RELENT(MAXSQ) INTEGER AL_SEQPROF(MAXSQ,MAXPROFAA) CHARACTER AL_EXCLUDEFLAG(MAXHITS) CHARACTER AL_EMBLPID(MAXHITS)*40 CHARACTER AL_COMPOUND(MAXHITS)*200 CHARACTER AL_ACCESSION(MAXHITS)*12 CHARACTER AL_PDB_POINTER(MAXHITS)*12 c REAL COMMON/CALIGNMENT1/AL_VAL,AL_STRHOM,AL_VPERRES,AL_RMS,AL_HOM COMMON/CALIGNMENT2/AL_SIM,AL_ENTROPY,AL_SDEV c INT COMMON/CALIGNMENT3/AL_LEN,AL_HOMLEN,AL_NGAP,AL_LGAP,AL_LSEQ_2 COMMON/CALIGNMENT4/AL_IFIRST,AL_ILAST,AL_JFIRST,AL_JLAST COMMON/CALIGNMENT5/AL_NDELETION,AL_NINS,AL_VARIABILITY COMMON/CALIGNMENT6/AL_RELENT,AL_SEQPROF c CHAR COMMON/CALIGNMENT7/AL_EXCLUDEFLAG,AL_EMBLPID,AL_COMPOUND COMMON/CALIGNMENT8/AL_ACCESSION,AL_PDB_POINTER C BUFFER TO STORE ALIGNMENTS CHARACTER*1 SEQBUFFER(MAXSEQBUFFER) COMMON/CSEQBUFFER/SEQBUFFER C POINTS TO SELECTED ALIGNMENTS IN SEQBUFFER INTEGER ISEQPOINTER(MAXHITS),ISEQPOS COMMON/CSEQBUFFERPOINTER/ISEQPOINTER,ISEQPOS c BUFFER TO STORE SECONDARY STRUCTURE OF ALIGNMENTS CHARACTER*1 STRBUFFER(MAXSTRBUFFER) COMMON/CSTRBUFFER/STRBUFFER C POINTS TO SELECTED ALIGNMENTS IN STRBUFFER INTEGER ISTRPOINTER(MAXHITS) COMMON/CSTRBUFFERPOINTER/ISTRPOINTER c BUFFER TO STORE INSIDE/OUTSIDE STATE OF ALIGNMENTS CHARACTER*1 CIOBUFFER(MAXIOBUFFER) COMMON/CIOBUFFER/CIOBUFFER C POINTS TO SELECTED ALIGNMENTS IN IOBUFFER INTEGER IOPOINTER(MAXHITS) COMMON/CIOBUFFERPOINTER/IOPOINTER c BUFFER TO STORE INSERTIONS OF ALIGNMENTS c CHARACTER INSBUFFER*(MAXINSBUFFER) CHARACTER*1 INSBUFFER(MAXINSBUFFER) COMMON/CINSBUFFER/INSBUFFER C POINTS TO SELECTED ALIGNMENTS IN INSBUFFER INTEGER INSNUMBER,INSALI(MAXINS),INSPOINTER(MAXINS) INTEGER INSLEN(MAXINS),INSBEG_1(MAXINS),INSBEG_2(MAXINS) COMMON/CINSBUFFERPOINTER1/INSNUMBER,INSALI,INSPOINTER,INSLEN COMMON/CINSBUFFERPOINTER2/INSBEG_1,INSBEG_2 C GLOBAL FOR HOST REAL ALISORTKEY(MAXALIGNS) COMMON/CALIGN0/ALISORTKEY INTEGER IALIGN_GOOD COMMON/CALIGN1/IALIGN_GOOD c INTEGER*2 LEN2_ORIG(MAXALIGNS) c COMMON/CALIGN1/LEN2_ORIG c INTEGER*2 IRECPOI(MAXALIGNS),IFILEPOI(MAXALIGNS) INTEGER IRECPOI(MAXALIGNS),IFILEPOI(MAXALIGNS) COMMON/CALIGN2/IRECPOI,IFILEPOI INTEGER IALIGNOLD,NGLOBALHITS,NBEST COMMON/CALIGN5/IALIGNOLD,NGLOBALHITS,NBEST c SELECTED ALIGNMENTS ARE SELECTED ACCORDING TO CHOSEN SORT-MODE CHARACTER*20 CSORTMODE COMMON/CALISORT/CSORTMODE c ORDER OF AMINO ACIDS AND ORDER OF SECONDARY STRUCTURE SYMBOLS CHARACTER TRANS*(NTRANS),STRTRANS*(NTRANS) COMMON/CT1/TRANS,STRTRANS C SIMILARITY PARAMETERS AND STRUCTURE DEPENDENT MATRICES REAL SIMMETRIC_1(MAXSQ,NTRANS),SIMMETRIC_2(MAXSQ,NTRANS) REAL METRIC_1(MAXSQ,NTRANS),METRIC_2(MAXSQ,NTRANS) REAL SIMCONSERV(NTRANS,NTRANS) COMMON/CSIM1/SIMMETRIC_1,SIMMETRIC_2,METRIC_1,METRIC_2, + SIMCONSERV C ORIGINAL METRIC AS READ IN, AND I/O STATES REAL SIMORG(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) REAL IORANGE(MAXSTRSTATES,MAXIOSTATES) COMMON/CSIMORG1/SIMORG,IORANGE C NUMBER AND STRINGS OF STRUCTURE STATES AND I/O STATES OF METRIC INTEGER NSTRSTATES_1,NSTRSTATES_2,NIOSTATES_1,NIOSTATES_2 COMMON/CSIMORG2/NSTRSTATES_1,NSTRSTATES_2,NIOSTATES_1, + NIOSTATES_2 CHARACTER CSTRSTATES*(MAXSTRSTATES),CIOSTATES*(MAXIOSTATES) C---- br 99-03: note when changing STR_CLASS, update SBRs calling C---- it in lib-maxhom !!! CHARACTER*10 STR_CLASSES(MAXSTRSTATES) COMMON/CSIMORG3/CSTRSTATES,CIOSTATES,STR_CLASSES C GAP-OPEN, GAP-ELONGATION REAL GAPOPEN_1(MAXSQ),GAPOPEN_2(MAXSQ) REAL GAPELONG_1(MAXSQ),GAPELONG_2(MAXSQ) REAL OPEN_GAP_1(MAXSQ),OPEN_GAP_2(MAXSQ) REAL ELONG_GAP_1(MAXSQ),ELONG_GAP_2(MAXSQ) REAL OPEN_1,ELONG_1 COMMON/CGAP2/GAPOPEN_1,GAPOPEN_2,GAPELONG_1,GAPELONG_2, + OPEN_GAP_1 COMMON/CGAP3/OPEN_GAP_2,ELONG_GAP_1,ELONG_GAP_2,OPEN_1,ELONG_1 C SMIN,SMAX REAL SMIN,SMAX,MAPLOW,MAPHIGH,EPSILON COMMON/CGAP4/SMIN,SMAX,MAPLOW,MAPHIGH,EPSILON c CONSERVATION WEIGHTS REAL CONSWEIGHT_1(MAXSQ),CONSWEIGHT_2(MAXSQ),CONSMIN COMMON/CWEIGHT1/CONSWEIGHT_1,CONSWEIGHT_2,CONSMIN c LOGICAL FOR USE OF CONSERVATION WEIGHTS C LPASS2: SECOND PASS AFTER ALIGING ALL SECOND SEQUENCES AND CALCULATING THE C CONSERVATION WEIGHTS (FIX WEIGHTS AND ALIGN ALL SEQUENCES IN LIST C ONCE MORE) LOGICAL LCONSERV_1,LCONSERV_2,LINSERT_1,LINSERT_2 LOGICAL LCONSIMPORT,LFIRSTWEIGHT,LPASS2 COMMON/CALIOPT1/LCONSERV_1,LCONSERV_2,LINSERT_1,LINSERT_2 COMMON/CALIOPT2/LCONSIMPORT,LFIRSTWEIGHT,LPASS2 c NEED FOR CALCULATING THE VARIABILITY REAL SUMVARIABILITY(MAXSQ),SUMDISTANCE(MAXSQ) COMMON/CWEIGHT2/SUMVARIABILITY,SUMDISTANCE C BACKWARD SETMATRIX TO IDENTIFY SUBOPTIMALS AND RELIABILITY-SCORE LOGICAL LBACKWARD COMMON/CBACKWARD1/LBACKWARD REAL FILTER_VAL,SUBOPT_VAL COMMON/CBACKWARD2/FILTER_VAL,SUBOPT_VAL CHARACTER*200 METRICFILE,METRIC_HSSP_VAR COMMON/CMETRIC/METRICFILE,METRIC_HSSP_VAR C BEGIN AND END OF A PROFILE INTEGER NBOX_1,NBOX_2 INTEGER PROFILEBOX_1(MAXBOX,2),PROFILEBOX_2(MAXBOX,2) COMMON/CPROFRANGE1/NBOX_1,NBOX_2,PROFILEBOX_1,PROFILEBOX_2 INTEGER IPROFBEG,IPROFEND COMMON/CPROFRANGE2/IPROFBEG,IPROFEND C ALIGN SECONDARY STRUCTURE SYMBOLS LOGICAL LSTRUC_ALIGN COMMON/CSTR_ALIGN/LSTRUC_ALIGN C PROFILE CONTROL LOGICAL LWRITEPROFILE,LREADPROFILE,LPROFILE_1 COMMON/CPROFILE1/LWRITEPROFILE,LREADPROFILE,LPROFILE_1 LOGICAL LPROFILE_2,LNORM_PROFILE COMMON/CPROFILE2/LPROFILE_2,LNORM_PROFILE REAL CUTVALUE1,CUTVALUE2 COMMON/CVALUE/CUTVALUE1,CUTVALUE2 INTEGER PROFILEMODE COMMON/CPROFILE3/PROFILEMODE REAL PROFILE_EPSILON,PROFILE_GAMMA COMMON/CPROFILE4/PROFILE_EPSILON,PROFILE_GAMMA INTEGER ISOLEN(MAXCUTOFFSTEPS),NSTEP,ISAFE COMMON/CTHRESHOLD2/ISOLEN,NSTEP,ISAFE REAL ISOIDE(MAXCUTOFFSTEPS) COMMON/CTHRESHOLD3/ISOIDE C LOGICAL IF SEQUENCE1 AND SEQUENCE2 ARE IDENTICAL ==> SKIP ALIGNMENT, C BUT NOT IF HSSP-WANTED OR PROFILES.. LOGICAL LSAMESEQ,LSHOW_SAMESEQ COMMON/CSAMESEQ/LSAMESEQ,LSHOW_SAMESEQ C LOGICAL IF OUTPUT WANTED FOR SPECIFIED FILES LOGICAL LTRACE,LTRACEOUT,LSTRIP,LPROF,LONG_OUT COMMON/COUTPUT/LTRACE,LTRACEOUT,LSTRIP,LPROF,LONG_OUT C INPUT/OUTPUT FILES CHARACTER*200 LISTFILE_1,LISTFILE_2,PLOTFILE,STRIPFILE CHARACTER*200 FILESEQ,STATFILE,ISOSIGFILE,LOGFILE,WARNFILE CHARACTER*200 HISTOFILE,LONGFILE COMMON/CFILES1/LISTFILE_1,LISTFILE_2,PLOTFILE,STRIPFILE COMMON/CFILES2/FILESEQ,STATFILE,ISOSIGFILE,LOGFILE,WARNFILE COMMON/CFILES3/HISTOFILE,LONGFILE CHARACTER*500 LOGSTRING COMMON/CLOGSTRING/LOGSTRING C ENVIROMENT C WHICH ARCHITECTURE (UNIX/VMS, HOSTNAME, SUN4/DEC/SGI/PARIX....) CHARACTER*200 CMACHINE CHARACTER*60 MACHINE_NAME CHARACTER*20 ARCHITECTURE,HOSTNAME COMMON/CENVIROMENT/CMACHINE,MACHINE_NAME,ARCHITECTURE,HOSTNAME C LOGICAL FOR APPLIED THRESHOLD LOGICAL LCONSIDER,LFORMULA,LTHRESHOLD,LALL COMMON/CTHRESHOLD1/LCONSIDER,LFORMULA,LTHRESHOLD,LALL C COMPARE 3-D STRUCTURES OF ALIGNMENTS LOGICAL LCOMPSTR COMMON/C3D1/LCOMPSTR C CONTROL FLOW LOGICAL LBATCH COMMON/CONTROL1/LBATCH C INPUT FOR SEQUENCE FILES: LIST OF FILENAMES OR PROFILES OR DATABASES LOGICAL LIST,LPROF_1,LPROF_2,LISTOFSEQ_1,LISTOFSEQ_2 LOGICAL LSWISSBASE,LNRDBBASE,LFASTA_DB COMMON/CLIST1/LIST,LPROF_1,LPROF_2,LISTOFSEQ_1,LISTOFSEQ_2 COMMON/CLIST2/LSWISSBASE,LNRDBBASE,LFASTA_DB C PUNISH GAPS IN SECONDARY STRUCTURES IF INDELS NOT ALLOWED REAL PUNISH COMMON/GAPPUNISH/PUNISH C CHAIN-BREAK POSITIONS TO CHECK PIECES FROM DSSP AND BRK IF SUPERPOSITON IN 3-D INTEGER IBREAKPOS_1(MAXBREAK),IBREAKPOS_2(MAXBREAK) INTEGER NBREAK_1,NBREAK_2 COMMON/CHAINBREAK/IBREAKPOS_1,IBREAKPOS_2,NBREAK_1,NBREAK_2 C IF SEQUENCE 2 IS A DATABASE CHARACTER*200 SPLIT_DB_NAMES,SPLIT_DB_PATH,SPLIT_DB_DATA, + SPLIT_DB_INDEX COMMON/CDATABASE1/SPLIT_DB_NAMES,SPLIT_DB_DATA,SPLIT_DB_INDEX, + SPLIT_DB_PATH CHARACTER*200 SWISSPROT_SEQ,SW_CURRENT,CURRENT_DIR COMMON/CDATABASE2/SWISSPROT_SEQ,SW_CURRENT,CURRENT_DIR LOGICAL LBINARY COMMON/CDATABASE3/LBINARY CHARACTER HSSPID_1*40,BRKID_1*4,BRKID_2*4,HSSPID_2*40 COMMON/CHSSP2/HSSPID_1,BRKID_1,BRKID_2,HSSPID_2 C SEQUENCES ARE DSSP-FILES LOGICAL LDSSP_1,LDSSP_2 COMMON/CDSSP/LDSSP_1,LDSSP_2 C WRITE SELECTED ALIGNMENTS IN A TEMPORARY BINARY FILE WITH DIRECT ACCESS CHARACTER*200 COREPATH,COREFILE,JOB_ID COMMON/CORE3/COREPATH,COREFILE,JOB_ID C USED FOR 3-D-STRUCTURE COMPARISON CHARACTER*6 CRESID(MAXSQ) CHARACTER*200 BRKFILE_1,BRKFILE_2,BRKBEFORE1,BRKBEFORE2,PDBPATH COMMON/C3D2_1/BRKFILE_1,BRKFILE_2,BRKBEFORE1,BRKBEFORE2,CRESID COMMON/C3D2_2/PDBPATH C STORE ALL ANSWERS FOR QUESTIONS TO GENERATE COMMAND FILES CHARACTER*200 MAXHOM_DEFAULT,METRICPATH,FILTER_FASTA_EXE COMMON/CANSWER1/MAXHOM_DEFAULT,METRICPATH,FILTER_FASTA_EXE CHARACTER*200 ELONGWEIGHT_ANSWER,WEIGHT1_ANSWER,WEIGHT2_ANSWER COMMON/CANSWER2/ELONGWEIGHT_ANSWER,WEIGHT1_ANSWER,WEIGHT2_ANSWER CHARACTER*200 HSSP_FORMAT_ANSWER,COMPARE_ANSWER CHARACTER*200 PDBPATH_ANSWER COMMON/CANSWER3/HSSP_FORMAT_ANSWER,COMPARE_ANSWER,PDBPATH_ANSWER CHARACTER*200 NORM_PROFILE_ANSWER,PROFILE_EPSILON_ANSWER COMMON/CANSWER4/NORM_PROFILE_ANSWER,PROFILE_EPSILON_ANSWER CHARACTER*200 SMIN_ANSWER,SMAX_ANSWER,OPENWEIGHT_ANSWER COMMON/CANSWER5/SMIN_ANSWER,SMAX_ANSWER,OPENWEIGHT_ANSWER CHARACTER*200 NAME2_ANSWER,PROFILE_ANSWER,METRIC_ANSWER COMMON/CANSWER6/NAME2_ANSWER,PROFILE_ANSWER,METRIC_ANSWER CHARACTER*200 FASTA_EXE,FILTER_BLASTP_EXE,BLASTP_EXE COMMON/CANSWER7/FASTA_EXE,FILTER_BLASTP_EXE,BLASTP_EXE CHARACTER*200 CONVERTSEQ_EXE,NAME1_ANSWER COMMON/CANSWER8/CONVERTSEQ_EXE,NAME1_ANSWER CHARACTER*200 INDEL_ANSWER_1,INDEL_ANSWER_2,BACKWARD_ANSWER COMMON/CANSWER9/INDEL_ANSWER_1,INDEL_ANSWER_2,BACKWARD_ANSWER CHARACTER*200 PROFILEOUT_ANSWER,STRIPFILE_ANSWER COMMON/CANSWER10/PROFILEOUT_ANSWER,STRIPFILE_ANSWER CHARACTER*200 LONG_OUTPUT_ANSWER,PLOTFILE_ANSWER COMMON/CANSWER11/LONG_OUTPUT_ANSWER,PLOTFILE_ANSWER CHARACTER*200 PROFILE_GAMMA_ANSWER,STRUC_ALIGN_ANSWER COMMON/CANSWER12/PROFILE_GAMMA_ANSWER,STRUC_ALIGN_ANSWER CHARACTER*200 WAY3_ANSWER,THRESHOLD_ANSWER,SORTMODE_ANSWER COMMON/CANSWER13/WAY3_ANSWER,THRESHOLD_ANSWER,SORTMODE_ANSWER CHARACTER*200 HSSP_ANSWER,COMMANDFILE_ANSWER,DSSP_PATH COMMON/CANSWER14/HSSP_ANSWER,COMMANDFILE_ANSWER,DSSP_PATH CHARACTER*200 SAMESEQ_ANSWER,FILTER_ANSWER,NBEST_ANSWER COMMON/CANSWER15/SAMESEQ_ANSWER,FILTER_ANSWER,NBEST_ANSWER CHARACTER*200 NGLOBALHITS_ANSWER COMMON/CANSWER16/NGLOBALHITS_ANSWER C HSSP-OUTPUT LOGICAL LHSSP,LHSSP_LONG_ID COMMON/CHSSP1/LHSSP,LHSSP_LONG_ID C RELEASE NOTES OF SWISSPROT FOR HSSP-HEADER OUTPUT CHARACTER*200 RELNOTES COMMON/CSWISSREL/RELNOTES C VERY LONG SEQUENCES ARE CUT IN PIECES INTEGER NSHIFTED COMMON/CSHIFT1/NSHIFTED LOGICAL LSHIFTED COMMON/CSHIFT2/LSHIFTED C SPACE OVERFLOW LOGICAL LALIOVERFLOW,LBUFFEROVERFLOW COMMON/OVERFLOW/LALIOVERFLOW,LBUFFEROVERFLOW LOGICAL LASK,LDIALOG,LRUN COMMON/CONTROL2/LASK,LDIALOG,LRUN C ONLY USED TO GET RID OF INDEX COMMAND (CPU TIME) INTEGER TRANSPOS(NASCII) COMMON/CINDEX/TRANSPOS C Z-SCORES REAL ZSCORE(MAXALIGNS),ZSCORE_TEMP(MAXALIGNS) COMMON/CZSCORE/ZSCORE,ZSCORE_TEMP C LOCAL SEQUENCE ATTRIBUTES C AUTION: SET DIMENSION OF ALIGNED SEQUENCE ARRAY AL_2 HIGHER (SAY 10 %) THAN C THAT OF SEQUENCE ARRAY CSQ_1/2, AS ALIGNMENT MAY CONTAIN INSERTIONS IN C BOTH SEQUENCES, LEADING TO ALIGNED LENGTH HIGHER THAN SEQUENCE LENGTH. CHARACTER*1 CSQ_1_ARRAY(MAXSQ),CSQ_2_ARRAY(MAXSQ) COMMON/CLOCALSEQ/CSQ_1_ARRAY,CSQ_2_ARRAY CHARACTER AL_2*(MAXALSQ),SAL_2*(MAXALSQ) COMMON/CLOCALAL/AL_2,SAL_2 CHARACTER*(MAXSQ) STRUC_1_STRING,STRUC_2_STRING COMMON/CLOCALSTRUC/STRUC_1_STRING,STRUC_2_STRING C INSERTIONS IN SEQ 2 INTEGER IINS,INSLEN_LOCAL(MAXINS),INSBEG_1_LOCAL(MAXINS) INTEGER INSBEG_2_LOCAL(MAXINS) COMMON/CLOCALINS/IINS,INSLEN_LOCAL,INSBEG_1_LOCAL,INSBEG_2_LOCAL CHARACTER INSSEQ*(MAXINSBUFFER_LOCAL) COMMON/CLOCALINSSEQ/INSSEQ C PREDICTED STRUCTURE AND CORRECT PREDICTED STRUCTURE OF SEQUENCE 1 CHARACTER PREDSTR(MAXSQ),PREDSTRCORR(MAXSQ) COMMON/CPRED/PREDSTR,PREDSTRCORR REAL STRSUM(MAXSTRSTATES,MAXSQ),STRMAX COMMON/CSTRPRED/STRSUM,STRMAX C CHARACTER CTMPCHAR_1*(MAXALSQ) COMMON/CDUMMY_CHAR/CTMPCHAR_1 CHARACTER CTMPCHAR_ARRAY_1(MAXALSQ) COMMON/CTMPCHAR_ARRAY/CTMPCHAR_ARRAY_1 c TRACE-ROUTINE C AL_1_ARRAY/2 (I) = ALIGNMENT OF SEQ 1/2 C LAL_1/2 (I) = INTEGER VERSION OF AL C SAL_1_ARRAY/2 (I) = SECONDARY STRUCTURE OF ALIGNMENT OF SEQ 1/2 C LSAL_1/2 (I) = INTEGER VERSION OF SAL C ALI (I,1/2) = ALIGNMNETS PASSED TO 3-D SUPERPOSITION CHARACTER*1 AL_1_ARRAY(MAXALSQ),AL_2_ARRAY(MAXALSQ) CHARACTER*1 ALI_1(MAXALSQ),ALI_2(MAXALSQ) CHARACTER SAL_1_ARRAY(MAXALSQ),SAL_2_ARRAY(MAXALSQ) CHARACTER AL_AGREE(MAXALSQ),SAL_AGREE(MAXALSQ) COMMON/CTRACE_1/AL_1_ARRAY,AL_2_ARRAY,SAL_1_ARRAY,SAL_2_ARRAY COMMON/CTRACE_2/AL_AGREE,SAL_AGREE COMMON/CTRACE_3/ALI_1,ALI_2 INTEGER LAL_1(MAXALSQ),LAL_2(MAXALSQ) INTEGER LSAL_1(MAXALSQ),LSAL_2(MAXALSQ) INTEGER ITRACE(MAXALSQ),JTRACE(MAXALSQ) COMMON/CTRACE_4/LAL_1,LAL_2,LSAL_1,LSAL_2,ITRACE,JTRACE c SETMATRIX REAL MAX_D(0:MAXSQ+1),MAX_H(0:MAXSQ+1),MAX_V(0:MAXSQ+1) REAL LEFT_LH(0:MAXSQ+1),UP_LH(0:MAXSQ+1),DIAG_LH(0:MAXSQ+1) COMMON/CSETMATRIX1/MAX_D,MAX_H,MAX_V,LEFT_LH COMMON/CSETMATRIX2/UP_LH,DIAG_LH INTEGER*2 LDEL_H(0:MAXSQ+1),LDEL_V(0:MAXSQ+1) COMMON/CSETMATRIX3/LDEL_H,LDEL_V C SETBACK REAL RIGHT_LH(0:MAXSQ+1),DOWN_LH(0:MAXSQ+1) REAL MAX_ALL(0:MAXSQ+1) REAL MAX_METRIC_1_VAL(MAXSQ),MAX_METRIC_2_VAL(MAXSQ) COMMON/CSETBACK1/RIGHT_LH,DOWN_LH,MAX_ALL,MAX_METRIC_1_VAL COMMON/CSETBACK2/MAX_METRIC_2_VAL C GETCONSWEIGHT INTEGER LSEQ_2(MAXSQ),LSEQTEMP(MAXSQ) INTEGER*2 IFIRST(MAXALIGNS),ILAST(MAXALIGNS) INTEGER NOCC(MAXSQ) COMMON/CGETCONS_1/LSEQ_2,LSEQTEMP,IFIRST,ILAST,NOCC REAL SIMVAL(MAXALSQ) COMMON/CGETCONS_2/SIMVAL INTEGER*2 IABOVE(MAXALIGNS),IBOTH_LEGAL(MAXALSQ) COMMON/CGETCONS_3/IABOVE,IBOTH_LEGAL profphd-utils-1.0.10/maxhom.default0000755015075101507510000000143512012371466016536 0ustar lkajanlkajan# MAXHOM DEFAULTS # # MACHINE : UNIX METRIC_PATH : /home/rost/pub/max/mat/ SWISSPROT_SEQ : /data/swissprot/seq.dat RELEASE_NOTES : /home/rost/pub/max/mat/relnotes.txt SWISSPROT_CURRENT : /data/swissprot/current SW_PATH : /data/maxhom/ SW_INDEX : swissprot_maxhom.index SW_DATA : _swissprot_maxhom.dat PDB_PATH : /data/pdb/ DSSP_PATH : /data/dssp/ COREPATH : ./ COREFILE : MAXHOM_ALI. CONVERTSEQ_EXE : convert_seq FASTA_EXE : fasta FILTER_FASTA_EXE : filter_fasta BLASTP_EXE : blastp FILTER_BLASTP_EXE : filter_blastp ## profphd-utils-1.0.10/maxhom.f0000644015075101507510000074001612012371466015341 0ustar lkajanlkajanC======================================================================= C===================================================================== C MAXHOM at EMBL 1994 C C. Sander, MPIMF, Heidelberg, 1982/1984 C R. Schneider, EMBL, Heidelberg, 1988- C====================================================================== C best alignment(s) between sequences, and a little bit more...... C====================================================================== C Algorithm(s): C ============= C Smith and Waterman: C Identification of common molecular subsequences, C JMB 147 (1981) 195-197 C Gotoh: C An Improved Algorithm for Matching Biological Sequences, C JMB (1982) 162, 705-708 C Jones R. et.al.: C Protein Sequence Comparison on the Connection Machine CM-2, C in: Computers and DNA, SFI Studies in the Sciences of Complexity, C Vol VII, Addison-Wesley, 1990 C Gribskov et.al.: C Profile analysis: Detection of distantly related proteins, C PNAS 84, (1987) 4355-4358 C Sander C., Schneider R.: C Database of homology derived protein structures and the structural C meaning of sequence alignment C Proteins, 9:56-58 (1991) C + other stuff to be published C C====================================================================== C HISTORY C====================================================================== C plan: calculate reliability score from backward/forward C C July 1995: optimized inter-processors communication C July 1995: optimized I/O during database scan C June 1995: read Fasta-database format C June 1995: NRDB in parallel C 1994-Jun.95 : parallel version for single sequence, profile and C profile-profile (list of filenames) scan C ports to Parsytec PowerGC, Meiko CS2, IBM-SP2 C 1992-1994: parallel versions for PVM, P4, EXPRESS, PARIX, INTEL C JAN 92: matrix setting in forward and backward way C MAXDEL option removed; causes problem C DEZ 91: DO-loops in SETMATRIX vectorized C AUG 91: major rewritting and "clean up" (RS) C ==================================== C the main changes are: C 1. the metric is now allways handled as a profile. C The following values are now position dependent: C metric, gap-open, gap-elongation and conservation weights C this allows to align: C a) sequence(s) <===> sequence(s) C b) profile(s) <===> sequence(s) C c) sequence(s) <===> profile(s) C d) profile(s) <===> profile(s) C 1. full profile (inner product of position dependent C profiles, without taking into account the actual C sequence information C 2. sequence as a representative of the family C 3. maximum of profile position as the "consensus" sequence C PROFILE and METRIC files have now a standart format C 2. the matrix is not longer sorted. For each NBEST alignment C the matrix is searched for the best value which was not a C member of a previous best path. So the matrix dimension is C now "only" LH(x,y,2) and not LH(x,y,4) C ( like a fast TRACE back ) C 3. selected alignments are written now to a temporay binary C file with direct access (necessary because of point 5). C Only the ALISORTKEY (pointing to the record number) C is sorted via QSORT C 4. list of filenames for sequence 1 C 5. sequence 2 can be a database like SwissProt C 6. command file generator on run-time and .log file feature C======================================================================= C JUL 91: use also cons-weights for the second sequence (only read in) C JUL 91: MAXDEL restriction removed according to Gotoh (1982) (RS) C NOTE: its still possible to set a MAXDEL restriction C usefull if one looks for repeats..... C JUN 91: SETMATRIX in a anitidiagonal way (run in parallel) (RS) C implementation on an Alliant (GMD Bonn) C May 91: speed up by recoding SETMATRIX (RS) C APR 91: change HSSP-routines to HSSP-LIB routines (RS) C APR 91: sort hits according to user specification (RS) C APR 91: STRIP output procedure changed (no MAXLINES etc.) (RS) C MAR 91: cons-weight calculation debugged (RS) C DEZ 90: change GETSEQ, to handle chains by name (fx. x.dssp_!_A,B) C OKT 90: FOSFOS: Fitness Of Sequence For Structure (contact prefs for C second sequence (RS) C OKT 90: WRITE alis only in core if specified threshold is fulfilled C (RS) C SEP 90: linethickness for TRACE (RS) C AUG 90: two pass alignment (derive cons-weights ===> C align once more) (RS) C AUG 90: use contact prefs from CONAN for alignment (RS) C JUL 90: use conservation weights (RS) C JAN 90: use structure dependent matrices (RS) C JUN 89: extract and modify GETAASEQ to GETSEQ (RS). This version C reads the BRK-number to check if C pieces for RMS-evaluation from DSSP and BRK C are the same (CHECKPOSITION) C MAY 89: try to replace LHQSORT by searching every time the best C NO ADVANTAGE LHQSORT IS STILL ACTIVE (RS) C the problem: most of the 'best' values are reject in TRACE C because the trace jumps in a previous used trace, C so one has to search many time the best value C MAY 89: speed up (RS) (SETMATRIX,LHQSORT,GETCOOR,HSSP) C APR 89: ALITOSTRUC (RS) (superpose alignments using U3B of W Kabsch) C JAN 89: and later HSSP-routine(s) (RS) C FEB 88: TRACE to POSTSCRIPT via TREACE2PS (Brigitte Altenberg) C JAN 88: seq/str homology (MODESIM = 206) debugged C OCT 84: NEWMETRIC option, a learning process C AUG 84: LSIM generalized to different types of metric switched C by MODSIM C JUL? 84: all H etc. values converted to integer arithmetic to C save storage C MAY 84: jobinterrupt feature for long jobs C DEC? 83: secondary structure prediction from DSSP homologies C DEC? 83: global output to strip C DEC? 83: storage of alignments in core for global QSORT C SEP 83: converted to DNA, corrected deletion weight D2*(L-1) C SEP 83: corrected termination test in TRACE, corrected 'a'='C' C in GETAASEQ C SEP 83: adapted to 16-bit for testing, avoided BACKSPACE in GETSIM C SEP 83: repeat pass for MAXMAT overflow C 1982: first implementation for DNA and plot interface by C Oefner. C 1980: original algorithm by Temple Smith C======================================================================= C input files C======================================================================= C KGETSEQ sequence file called by GETSEQ C KLIS1 list of sequences for first sequences C KLIS2 list of sequence for second sequences C KSIM metric-data C KISO isosignificance data (HSSP) C KREL release notes swissprot C KBRK according brookhaven file (structure compare) C KBASE SWISSPROT AND PIRONLY seq.dat files C KINDEX SWISSPROT index file C KDEF MAXHOM defaults file C KREF SwissProt pdb-pointer selection C======================================================================= C output files C======================================================================= C KSTP STRIPFILE, STRIP.X, summary output file for globally C best sequences C KLONG long output file C KPLOT PLOTFILE, TRACE.X, intermediate plot file for use by TRACE C KSTAT STATFILE, COLLAGE-STATIS.DATA, prediction statistics file C KWARN WARNFILE contains different warnings C KSTRUC datafile for HSSP-PLOT (sec-struc and rms-d) C KTAB table file C KCONS conservation-weights (and evolution of cons-weights file) C KPROF MAXHOM-profile C KLOG MAXHOM-LOG file C KHSSP HSSP-output file C KCOM command file (*.com (VMS) .csh (UNIX)) C KHISTO C KDEB debug file at various places C======================================================================= C input and output C======================================================================= C KCORE C======================================================================= C C C*********************************************************************** C***** CAUTION: INSTRUCTIONS FOR DIMENSIONING ******* C*********************************************************************** C memory requirement around 32 MegaBytes C MAXMAT*2*4 = 2000000*2*4= 16.0 MegaBytes = REAL*4 LH(len1,len2,2) C MAXSQ=longest test C MAXALSQ is max length of aligned sequence C N1,N2IN are actual sequence lengths. C ND1,ND2 are used matrix dimensions, ND1=N1+1, ND2=N2+1 for margins. C PROGRAM MAXHOM IMPLICIT NONE C---- include parameter files INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C---- local variables REAL LH1(0:MAXMAT) INTEGER*2 LH2(0:MAXTRACE) COMMON/CBIGMATRIX/LH1 COMMON/CBIGTRACEBACK/LH2 C parallel stuff and control flow LOGICAL LPARALLEL,LAGAIN C WRITE selected alignments into temp binary file with direct access CHARACTER*200 CORETEMP C profile output file CHARACTER*200 PROFILEOUT C amino acid exchange metrices CHARACTER*200 METRIC_ADM,METRIC_GCG,METRIC_STRUC, + METRIC_IO,METRIC_STRIO,PROFILEMETRIC c character*80 metricpath,metric_adm,metric_gcg,metric_struc, c + metric_io,metric_strio,profilemetric c checkformat CHARACTER*20 SEQFORMAT C temporary string for strip-output CHARACTER STRIPLINE(4)*(MAXSQ) c chain remark string for hssp-header CHARACTER*200 CHAINREMARK CHARACTER CTEMP*1,CTEMP2*1,TEMPNAME*200,TEMPSTRING*200 C CHARACTER FILE_OPTION*80 CHARACTER FILE_OPTION*200 CHARACTER QUESTION*1000 CHARACTER PDBREFLINE*3000 C CHARACTER CFILTER*80,CSYMBOL CHARACTER CFILTER*200,CSYMBOL CHARACTER LINE*100 C histogram c integer lhist(maxhist,maxhist) INTEGER NFILE,NENTRIES,NAMINO_ACIDS,IFILE,NRECORD,IRECORD, + LRES,NCHAIN,KCHAIN,IDSSP,NALIGN,NSELECT,KSELECT, + IAL, + ISTRPOS,IOPOS,INSPOS,IAGR,ICLASS,JCLASS,LINELEN, + IFIR,ILAS,JFIR,JLAS,IDEL,NDEL,LEN1,LENOCC, + IPOS,JPOS, + I,J,K,IBEG,IEND c integer jbeg,jend INTEGER IARG,ISET,LALI,IWORKER c integer istep C local smin,smax... REAL XSMIN,XSMAX,XMAPLOW,XMAPHIGH C Z-scores c real zscore(maxaligns),zscore_temp(maxaligns) REAL VALUE,CVALSTR,CPERRES,CHECKVAL, + RMS,HOM,SIM,DISTANCE C if second sequence(s) are DSSP-files predict secondary structure LOGICAL LPREDICTION,LSTRIP_LONG,LENDFILE C error flag LOGICAL LERROR,LTRUNCATED C zscore REAL SDEV,CURT,SKEW,VAR,ADEV,AVE c profile-calculation CHARACTER*40 WEIGHT_MODE REAL SIGMA,BETA + C======================================================================= C init C======================================================================= C strings TOTAL_TIME=0.0 CALL INIT_CPU_TIME(ITIME_OLD) HOST_FILE= 'hosts.pvm' NODE_NAME= 'maxhom' CFILTER= ' ' JOB_ID= ' ' IFILE= 0 QUESTION= ' ' C PDBREFLINE=' ' c some warnings WARNFILE= 'MAXHOM.WARNING' LISTFILE_1= ' ' LISTFILE_2= ' ' COREPATH= ' ' COREFILE= ' ' BRKFILE_1= ' ' BRKFILE_2= ' ' BRKBEFORE1= ' ' BRKBEFORE2= ' ' TEMPNAME= ' ' TEMPSTRING= ' ' NAME_2= ' ' C logicals LISTOFSEQ_1=.FALSE. LISTOFSEQ_2=.FALSE. LSWISSBASE= .FALSE. LFASTA_DB= .FALSE. LNRDBBASE= .FALSE. LENDFILE= .FALSE. LAGAIN= .FALSE. LSTRIP_LONG=.FALSE. C run in parallel LPARALLEL= .FALSE. MP_MODEL= 'NIX' IDTOP= 0 IDPROC= 0 ID_HOST= 0 LINK_HOST= 0 NWORKER= 1 NWORKSET= 0 C LMIXED_ARCH=.FALSE. DO I = 1,NWORKER LINK(I) = 0 ENDDO TOTAL_TIME= 0.0 NFILE= 0 N_ONE= 1 C Way3 alignment (sequence ---> database -----> profile ----> database) L3WAY= .FALSE. L3WAYDONE= .FALSE. C warm start LFIRST_SCAN=.TRUE. LWARM_START=.FALSE. NSEQ_WARM_START=0 c for structure dependent metrices CSTRSTATES= 'ELH' NSTRSTATES_1=1 NIOSTATES_1= 1 NSTRSTATES_2=1 NIOSTATES_2= 1 I= MAXIOSTATES * MAXSTRSTATES CALL INIT_REAL_ARRAY(1,I,IORANGE,200.0) C matrix scaling XSMIN= 0.0 XSMAX= 0.0 XMAPLOW= 0.0 XMAPHIGH= 0.0 C weights ISEQPOS= 1 C---- C---- BR 98.10 do NOT see the reason why this should NOT be 0! C---- is minimal value for conservation weight CONSMIN= 0.01 CUTVALUE1= 0.0 CUTVALUE2= 0.0 VALUE= 1.0 CALL INIT_REAL_ARRAY(1,MAXSQ,CONSWEIGHT_1,VALUE) VALUE= 0.0 CALL INIT_REAL_ARRAY(1,MAXSQ,GAPOPEN_1,VALUE) CALL INIT_REAL_ARRAY(1,MAXSQ,GAPOPEN_2,VALUE) C sort scores and zscores CALL INIT_REAL_ARRAY(1,MAXHITS,AL_VAL,VALUE) CALL INIT_REAL_ARRAY(1,MAXHITS,AL_HOM,VALUE) CALL INIT_REAL_ARRAY(1,MAXHITS,AL_SDEV,VALUE) CALL INIT_REAL_ARRAY(1,MAXALIGNS,ZSCORE_TEMP,VALUE) CALL INIT_REAL_ARRAY(1,MAXALIGNS,ALISORTKEY,VALUE) I=0 CALL INIT_INT_ARRAY(1,MAXALIGNS,IFILEPOI,I) CALL INIT_INT_ARRAY(1,MAXALIGNS,IRECPOI,I) IALIGN_GOOD= 0 c call init_int2_array(1,maxaligns,len2_orig,i) cx call init_real_array(1,maxaligns,len2_orig,value) SDEV= 0.0 DO I=1,MAXHITS AL_PDB_POINTER(I)=' ' AL_ACCESSION(I)= ' ' ENDDO INSNUMBER= 0 DO I=1,MAXINSBUFFER INSBUFFER(I)= ' ' ENDDO C secondary structure prediction I=MAXSQ * MAXSTRSTATES CALL INIT_REAL_ARRAY(1,I,STRSUM,VALUE) C TRANS is set here TRANS='VLIMFWYGAPSTCHRKQENDBZX!-.' CALL GETPOS(TRANS,TRANSPOS,NASCII) C secondary structure symbols and translations STRTRANS= 'ELHT CSBAPMGIU!' STR_CLASSES(1)='EBAPMebapm' STR_CLASSES(2)='L TCStclss' STR_CLASSES(3)='HGIhgiiiii' STR_CLASSES(4)='U!!!!!!!!!' c other IEND=0 C======================================================================= C get enviroment C======================================================================= MACHINE_NAME= 'PARSYTEC' ARCHITECTURE= 'PX' MAXHOM_DEFAULT='maxhom.default' c call get_machine_name(machine_name) c tempname='ARCH' c call get_enviroment_variable(tempname,architecture) TEMPNAME= 'MAXHOM_DEFAULT' CALL GET_ENVIROMENT_VARIABLE(TEMPNAME,MAXHOM_DEFAULT) LSMALL_MACHINE=.FALSE. LPARALLEL= .TRUE. c lparallel=.false. CALL GET_ARG_NUMBER(IARG) IF (IARG .GT. 0) THEN DO I=1,IARG CALL GET_ARGUMENT(I,TEMPNAME) TEMPSTRING=TEMPNAME CALL LOWTOUP(TEMPNAME,LEN(TEMPNAME)) CALL STRPOS(TEMPNAME,IBEG,IEND) IF (INDEX(TEMPNAME,'-PAR') .NE. 0) THEN LPARALLEL=.TRUE. ELSE IF (INDEX(TEMPNAME,'-NOPAR') .NE. 0) THEN LPARALLEL=.FALSE. ELSE IF (INDEX(TEMPNAME,'-H=') .NE. 0) THEN HOST_FILE(1:)=TEMPSTRING(IBEG+3:IEND) ELSE IF (INDEX(TEMPNAME,'-D=') .NE. 0) THEN MAXHOM_DEFAULT(1:)=TEMPSTRING(IBEG+3:IEND) ENDIF ENDDO ENDIF C======================================================================= CPARALLEL C init the parallel enviroment: farmer-worker topology..... C======================================================================= IF (LPARALLEL .EQV. .TRUE.) THEN CALL MP_INIT_FARM() ENDIF IF (NWORKER .LT. 1) THEN LPARALLEL=.FALSE. ELSE IF (NWORKER .LE. MINPROC) THEN LSMALL_MACHINE=.TRUE. ENDIF IF (MP_MODEL .EQ. 'PARIX') THEN ARCHITECTURE='PX' ENDIF C======================================================================= CPARALLEL C ONLY HOST IS ASKING THE STUFF C======================================================================= IF (IDPROC .EQ. ID_HOST) THEN C======================================================================= WRITE(6,*)'************************************ MAXHOM ****'// + '****************************' WRITE(6,*)'* '// + ' *' WRITE(6,*)'* Chris Sander, MPIMF, 1'// + '982/1985 *' WRITE(6,*)'* Reinhard Schneider, EMBL , 1'// + '988-(?) *' WRITE(6,*)'* '// + ' *' WRITE(6,*)'************************************************'// + '****************************' WRITE(6,*)'* '// + ' *' WRITE(6,*)'* in case of strange behaviour of the program, '// + ' *' WRITE(6,*)'* please check the file: MAXHOM.LOG_"PID" '// + ' *' WRITE(6,*)'************************************************'// + '****************************' C======================================================================= c WRITE(6,*)'mp_model / ARCH / lsmall: ',mp_model,architecture, c + lsmall_machine c IF ( lparallel .eqv. .true.) THEN c WRITE(6,*)'INFO: parallel version switched ON ' c WRITE(6,*)'INFO: mixed architecture is: ',lmixed_arch c WRITE(6,*)'INFO: host_file=',host_file(1:40) c endif WRITE(6,*)'INFO: default=',maxhom_default(1:40) C======================================================================= C DEFAULTS C======================================================================= CALL GET_DEFAULT() COMMANDFILE_ANSWER= 'YES' C test sequence NAME1_ANSWER= ' ' NAME1_ANSWER= '/data/dssp/5p21.dssp' c comparison sequence(s) NAME2_ANSWER= 'swissprot' PROFILE_ANSWER= 'NO' METRIC_ANSWER= 'LACHLAN' C worst and best match SMIN_ANSWER= '-0.5' SMIN= -0.5 SMAX_ANSWER= '1.0' SMAX= 1.0 MAPLOW= 0.0 MAPHIGH= 0.0 c gap open ; gap elongation OPENWEIGHT_ANSWER= '3.0' OPEN_1= 3.0 ELONGWEIGHT_ANSWER= '0.1' ELONG_1= 0.1 c conservation weights WEIGHT1_ANSWER= 'NO' WEIGHT2_ANSWER= 'NO' WAY3_ANSWER= 'NO' C INDELs in secondary structure INDEL_ANSWER_1= 'NO' INDEL_ANSWER_2= 'NO' C profile normalization NORM_PROFILE_ANSWER= 'NO' PROFILE_EPSILON_ANSWER='0.1' PROFILE_GAMMA_ANSWER= '10.0' C suboptimal traces BACKWARD_ANSWER= 'NO' FILTER_ANSWER= '10.0' LBACKWARD= .FALSE. c number of alignments per pair NBEST_ANSWER= '1' c punish gap in secondary structure PUNISH= 9000.0 c maximum number of reported alignments WRITE(NGLOBALHITS_ANSWER,'(I8)')MAXHITS THRESHOLD_ANSWER= 'FORMULA' C sort hits SORTMODE_ANSWER= 'DISTANCE' C HSSP output and threshold HSSP_ANSWER= 'YES' LHSSP= .TRUE. LFORMULA= .FALSE. LTHRESHOLD= .TRUE. HSSP_FORMAT_ANSWER= 'NO' c show 100% identical hits SAMESEQ_ANSWER= 'YES' C compare 3D-structure if known COMPARE_ANSWER= 'NO' LCOMPSTR= .FALSE. C align secondary structure symbols STRUC_ALIGN_ANSWER= 'NO' C profile output PROFILEOUT_ANSWER= 'NO' C strip output STRIPFILE_ANSWER= 'NO' C long output file LONG_OUTPUT_ANSWER= 'NO' C dot-plot output PLOTFILE_ANSWER= 'NO' C pdb_path PDBPATH=PDBPATH_ANSWER c similarity matrix CALL STRPOS(METRICPATH,IBEG,IEND) METRIC_ADM=METRICPATH(:IEND)//'Maxhom_McLachlan.metric' METRIC_GCG=METRICPATH(:IEND)//'Maxhom_GCG.metric' METRIC_STRUC=METRICPATH(:IEND)//'????' METRIC_IO=METRICPATH(:IEND)//'?????' METRIC_STRIO=METRICPATH(:IEND)//'Maxhom_Struc_IO.metric' METRIC_HSSP_VAR=METRICPATH(:IEND)//'Maxhom_GCG.metric' C get input c WRITE(6,*)' call interface' CALL INTERFACE c log-file IF (JOB_ID .EQ. ' ') THEN JOB_ID='0' ENDIF CALL CONCAT_STRINGS('MAXHOM.LOG_',JOB_ID,LOGFILE) TEMPNAME=LOGFILE IF (COREPATH .NE. ' ') THEN CALL CONCAT_STRINGS(COREPATH,TEMPNAME,LOGFILE) ENDIF c open log file for parameter,warnings, error..... TEMPNAME= 'NEW,RECL=200' CALL OPEN_FILE(KLOG,LOGFILE,TEMPNAME,LERROR) CALL LOG_FILE(KLOG,'**************************** MAXHOM-'// + 'LOGFILE ***************************',0) c-------------------------------------------------------------------- IF (LDIALOG .EQV. .TRUE.) THEN QUESTION=' WRITE a command file for background job and '// + 'stop MAXHOM afterwards ? /n'// + 'NO : start the program interactive /n'// + 'YES : generates a command file to run MAXHOM with '// + 'the specified parameters ' CALL GETCHAR(LEN(COMMANDFILE_ANSWER),COMMANDFILE_ANSWER, + QUESTION) CALL LOWTOUP(COMMANDFILE_ANSWER,LEN(COMMANDFILE_ANSWER)) IF (INDEX(COMMANDFILE_ANSWER,'Y').NE.0) THEN COMMANDFILE_ANSWER='YES' ELSE COMMANDFILE_ANSWER='NO' ENDIF ENDIF C====================================================================== C now ask for all the stuff C====================================================================== C get sequence(s) one C====================================================================== IF (LDIALOG .EQV. .TRUE.) THEN QUESTION=' name of FIRST sequence(s) /n'// + '"filename" : one sequence file (Format:GCG/PIR/'// + 'SWISSPROT/BRK/DSSP/HSSP/PROFILE) /n'// + '"file list" : file with one sequence filename per'// + ' line /n NOTE: the filename of a "file list"'// + ' must contain one of /n'// + ' the following strings: "lifi" '// + '"list" "profile"' CALL GETCHAR(MAXLENSTRING,NAME1_ANSWER,QUESTION) ENDIF NAME_1=NAME1_ANSWER LISTFILE_1=NAME1_ANSWER WRITE(LOGSTRING,'(A,A)')'sequence file 1: ',name_1 CALL LOG_FILE(KLOG,LOGSTRING,0) CALL CHECKFORMAT(KGETSEQ,NAME_1,SEQFORMAT,LERROR) WRITE(6,*)' SEQFORMAT: ',seqformat C if format is 'UNK' assume its a list of filenames IF (SEQFORMAT .EQ. 'UNK') THEN TEMPNAME=NAME_1 CALL LOWTOUP(TEMPNAME,MAXLENSTRING) IF ( (INDEX(TEMPNAME,'LIFI') .NE. 0) .OR. + (INDEX(TEMPNAME,'LIST') .NE. 0) ) THEN LIST=.TRUE. ENDIF IF ( INDEX(TEMPNAME,'PROFILE') .NE. 0 ) THEN LPROF_1=.TRUE. ENDIF IF ( (LIST .EQV. .TRUE.) .OR. (LPROF_1 .EQV. .TRUE.) ) THEN LISTOFSEQ_1=.TRUE. ENDIF ELSE LPROFILE_1=INDEX(SEQFORMAT,'PROFILE').NE.0 IF (INDEX(SEQFORMAT,'DSSP') .NE.0 .OR. + INDEX(SEQFORMAT,'PROFILE-DSSP') .NE.0) THEN LDSSP_1=.TRUE. ENDIF ENDIF IF ( (LISTOFSEQ_1 .EQV. .TRUE. ) .AND. + (LPARALLEL .EQV. .TRUE. ) ) THEN LWARM_START= .TRUE. ENDIF C====================================================================== C GET COMPARISON SEQUENCE(S) IF (LDIALOG .EQV. .TRUE.) THEN IF (COMMANDFILE_ANSWER .EQ. 'YES') THEN QUESTION=' name of SECOND sequence(s) /n'// + '"filename" : one sequence file (Format:GCG/PIR/'// + 'SWISSPROT/BRK/DSSP/HSSP/PROFILE) OR/n'// + ' a database in FASTA-format/n'// + '"file list" : file with one sequence filename per line'// + '/n NOTE: the filename of a "file list" '// + 'must contain one of /n'// + ' the following strings: "lifi" '// + '"list" "profile" /n'// + 'SWISSPROT : search against SWISSPROT (Maxhom format)/n'// + 'NRDB : search against NRDB (Maxhom format)/n'// + ' ======================================== /n'// + ' PRE-FILTERS for database searches: /n'// + ' (necessary for deriving the conservation weights)/n'// + 'BLASTP : prefilter database with BLASTP/n'// + 'FASTA : prefilter database with FASTA' ELSE QUESTION=' name of second sequence /n'// + '"filename" : one sequence file (Format: GCG/PIR/'// + 'SWISSPROT/BRK/DSSP/HSSP) /n'// + ' a database in FASTA-format/n'// + '"file list" : file with one filename per line /n'// + ' NOTE: the filename of a "file list" '// + 'must contain one of /n'// + ' the following strings: "lifi" '// + '"list" "profile" /n'// + 'SWISSPROT : search against SWISSPROT (Maxhom) /n'// + 'NRDB : search against NRDB (Maxhom format)' ENDIF CALL GETCHAR(MAXLENSTRING,NAME2_ANSWER,QUESTION) ENDIF NAME_2=NAME2_ANSWER LISTFILE_2=NAME2_ANSWER WRITE(LOGSTRING,'(A,A)')'sequence file 2: ',name_2 CALL LOG_FILE(KLOG,LOGSTRING,0) TEMPNAME=NAME_2 CALL LOWTOUP(TEMPNAME,MAXLENSTRING) CALL STRPOS(TEMPNAME,IBEG,IEND) IF (INDEX(SPLIT_DB_NAMES,TEMPNAME(IBEG:IEND)) .GT. 0) THEN IF ( LPARALLEL .EQV. .TRUE. ) THEN LWARM_START= .TRUE. ENDIF CALL OPEN_FILE(KDEF,MAXHOM_DEFAULT,'OLD,READONLY',LERROR) LINE=' ' DO WHILE( LINE .NE. '##') READ(KDEF,'(A)')LINE IF (INDEX(TEMPNAME(IBEG:),'SWISSPROT') .NE. 0 ) THEN LSWISSBASE=.TRUE. CALL EXTRACT_STRING(LINE,':','SWISSPROT_INDEX', + SPLIT_DB_INDEX) CALL EXTRACT_STRING(LINE,':','SWISSPROT_PATH', + SPLIT_DB_PATH) CALL EXTRACT_STRING(LINE,':','SWISSPROT_DATA', + SPLIT_DB_DATA) ELSE IF ( INDEX(TEMPNAME(IBEG:),'NRDB' ) .NE.0 ) THEN LNRDBBASE=.TRUE. CALL EXTRACT_STRING(LINE,':','NRDB_INDEX', + SPLIT_DB_INDEX) CALL EXTRACT_STRING(LINE,':','NRDB_PATH', + SPLIT_DB_PATH) CALL EXTRACT_STRING(LINE,':','NRDB_DATA', + SPLIT_DB_DATA) ENDIF ENDDO ELSE IF (TEMPNAME(IBEG:) .EQ. 'FASTA') THEN CFILTER='FASTA' ELSE IF (TEMPNAME(IBEG:) .EQ. 'BLASTP') THEN CFILTER='BLASTP' ELSE CALL CHECKFORMAT(KGETSEQ,NAME_2,SEQFORMAT,LERROR) WRITE(6,*)' SEQFORMAT: ',SEQFORMAT C if format is 'UNK' assume its a list of filenames IF (SEQFORMAT .EQ. 'UNK') THEN IF ( (INDEX(TEMPNAME,'LIFI') .NE. 0) .OR. + (INDEX(TEMPNAME,'LIST') .NE. 0) ) THEN LIST=.TRUE. ENDIF IF ( INDEX(TEMPNAME,'PROFILE') .NE. 0) THEN LPROF_2=.TRUE. ENDIF IF ( (LIST .EQV. .TRUE.) .OR. + (LPROF_2 .EQV. .TRUE.)) THEN LISTOFSEQ_2=.TRUE. ENDIF ELSE IF (SEQFORMAT .EQ. 'FASTA-DB') THEN LFASTA_DB=.TRUE. LHSSP_LONG_ID=.TRUE. HSSP_FORMAT_ANSWER='YES' ENDIF LPROFILE_2=INDEX(SEQFORMAT,'PROFILE') .NE. 0 IF (INDEX(SEQFORMAT,'DSSP') .NE. 0 .OR. + INDEX(SEQFORMAT,'PROFILE-DSSP') .NE.0) THEN LDSSP_2=.TRUE. ENDIF ENDIF C-------------------------------------------------------------------- C PROFILEMODE C 0: no profiles, just a simple sequence alignment C 1: profile for sequence 1 (and not for sequence 2) C 2: profile for sequence 2 (and not for sequence 1) C 3: full alignment of two profiles (inner product of position C dependent profiles), without taking into account the sequence C (structure,I/O...) information C 4: take the sequences as a representative of the family C 5: take the maximal value at each position as a "consensus" sequence C 6: metric which depends on both sequence/structure/io-states C like: seq,seq2,str1,str2,io1,io2 C-------------------------------------------------------------------- LASK=.TRUE. IF (LDIALOG .EQV. .TRUE.) THEN IF ( (LPROFILE_1 .AND. LPROFILE_2 ) .OR. + (LPROF_1 .AND. LPROF_2) .OR. + (LPROFILE_1 .AND. LPROF_2) .OR. + (LPROF_1 .AND. LPROFILE_2) ) THEN PROFILE_ANSWER='FULL' QUESTION=' profile against profile mode ? (found two '// + 'profiles ) /n'// + 'FULL : profile alignment without sequence '// + 'information /n'// + 'MEMBER: sequences are the representative of '// + 'the family /n'// + 'MAX : the maximal value at each position as '// + 'consensus' c + 'IGNORE: ignore the 2. profile (for example if '// c + 'one wants to /n'// c + ' use only the weights from the 2. profile' ELSE LASK=.FALSE. WRITE(LOGSTRING,'(A)')'2 PROFILES MODE: NO PROFILES' QUESTION= + ' 2 profiles mode: * disabled (NEED 2 PROFILES) *' CALL STRPOS(QUESTION,IBEG,IEND) WRITE(6,*)QUESTION(IBEG:IEND) ENDIF IF (LASK .EQV. .TRUE.) THEN CALL GETCHAR(MAXLENSTRING,PROFILE_ANSWER,QUESTION) ENDIF ENDIF CALL LOWTOUP(PROFILE_ANSWER,MAXLENSTRING) IF (INDEX(PROFILE_ANSWER,'FULL') .NE. 0) THEN PROFILEMODE=3 WRITE(LOGSTRING,'(A)')'full profile alignment without '// + 'sequence information' ELSE IF (INDEX(PROFILE_ANSWER,'MEMBER') .NE. 0) THEN PROFILEMODE=4 WRITE(LOGSTRING,'(A)')'sequences is the representative '// + 'of the family' ELSE IF (INDEX(PROFILE_ANSWER,'MAX') .NE. 0) THEN PROFILEMODE=5 WRITE(LOGSTRING,'(A)')'the maximal value at each '// + 'position serves as a consensus sequence' c else IF (index(profile_answer,'IGNORE') .ne. 0) THEN c profilemode=1 c WRITE(logstring,'(a)')'second profile will be ignored ' ELSE IF (.NOT.LPROFILE_1 .AND. (LPROFILE_2.OR.LPROF_2)) THEN PROFILEMODE=2 ELSE IF ( LPROFILE_1 .AND. .NOT. LPROFILE_2) THEN PROFILEMODE=1 ELSE PROFILEMODE=0 ENDIF CALL LOG_FILE(KLOG,LOGSTRING,0) C-------------------------------------------------------------------- C get metric C-------------------------------------------------------------------- METRICFILE=' ' IF (LDIALOG .EQV. .TRUE.) THEN IF ( (LPROF_1 .EQV. .TRUE. ) .OR. (LPROFILE_1 .EQV. .TRUE.) + .OR. (LPROFILE_2 .EQV. .TRUE.) .OR. + (LPROF_2 .EQV. .TRUE.)) THEN METRIC_ANSWER='PROFILE' QUESTION=' exchange metric ? /n'// + 'LACHLAN : Andrew McLachlan /n'// + 'GCG : Dayhoff used by GCG /n'// + 'STRUC : secondary structure dependent /n'// + 'IO : inside/outside dependent /n'// + 'STRUC_IO : secondary structure and I/O '// + 'dependent /n'// + '"filename" : import any metric in Maxhom '// + 'format (full pathname required ) /n'// + 'PROFILE : use metric from profile' ELSE IF (LDSSP_1 .EQV. .TRUE.) THEN QUESTION=' exchange metric ? /n'// + 'LACHLAN : Andrew McLachlan /n'// + 'GCG : Dayhoff used by GCG /n'// + 'STRUC : secondary structure dependent /n'// + 'IO : inside/outside dependent /n'// + 'STRUC_IO : secondary structure and I/O '// + 'dependent /n'// + '"filename" : import any metric in Maxhom '// + 'format (full pathname required ) /n'// + '** OTHER OPTIONS DISABLED (NEED PROFILE OR '// + 'DSSP FILE) **' ELSE QUESTION=' exchange metric ? /n'// + 'LACHLAN : Andrew McLachlan /n'// + 'GCG : Dayhoff used by GCG /n'// + '"filename" : import any metric in Maxhom '// + 'format (full pathname required ) /n'// + '** OTHER OPTIONS DISABLED (NEED PROFILE OR '// + 'DSSP FILE) **' ENDIF CALL GETCHAR(MAXLENSTRING,METRIC_ANSWER,QUESTION) ENDIF IF ( (INDEX(METRIC_ANSWER,'/').NE.0) .AND. + (CMACHINE.EQ.'UNIX') ) THEN METRICFILE=METRIC_ANSWER ELSE IF (INDEX(METRIC_ANSWER,'$') .NE.0 .OR. + INDEX(METRIC_ANSWER,']') .NE.0 .AND. + CMACHINE.EQ.'VMS' ) THEN METRICFILE=METRIC_ANSWER ELSE CALL LOWTOUP(METRIC_ANSWER,MAXLENSTRING) IF (INDEX(METRIC_ANSWER,'PROFILE') .NE. 0) THEN IF ((LPROF_1 .EQV. .TRUE.).OR.(LPROFILE_1 .EQV. .TRUE.) + .OR. (LPROFILE_2 .EQV. .TRUE.) .OR. + (LPROF_2.EQV. .TRUE.)) THEN METRICFILE='PROFILE' ELSE WRITE(LOGSTRING,'(A)') + '*** ERROR: no PROFILE to get metric' CALL LOG_FILE(KLOG,LOGSTRING,1) STOP ENDIF ELSE IF (INDEX(METRIC_ANSWER,'LACHLAN') .NE.0 ) THEN METRICFILE=METRIC_ADM ELSE IF (INDEX(METRIC_ANSWER,'GCG') .NE. 0) THEN METRICFILE=METRIC_GCG ELSE IF (INDEX(METRIC_ANSWER,'STRUC') .NE. 0) THEN METRICFILE=METRIC_STRUC ELSE IF (INDEX(METRIC_ANSWER,'IO') .NE. 0) THEN METRICFILE=METRIC_IO ELSE IF (INDEX(METRIC_ANSWER,'STRUC_IO') .NE. 0) THEN METRICFILE=METRIC_STRIO ENDIF IF ( (LPROF_1 .EQV. .TRUE.).OR.(LPROFILE_1 .EQV. .TRUE.) + .OR. (LPROFILE_2 .EQV. .TRUE.) .OR. + (LPROF_2.EQV. .TRUE.)) THEN WRITE(LOGSTRING,'(A)') + '**** WARNING: detect profile for first sequence '// + 'and/or profile(s) for /n'// + ' comparison sequence, but '// + 'metricfile is not set to /n'// + ' PROFILE, will overWRITE '// + 'PROFILE-metric ' CALL LOG_FILE(KLOG,LOGSTRING,1) ENDIF ENDIF ENDIF WRITE(LOGSTRING,'(A,A)')'metric from: ',metricfile CALL LOG_FILE(KLOG,LOGSTRING,0) C-------------------------------------------------------------------- C profile normalization C-------------------------------------------------------------------- LNORM_PROFILE=.FALSE. IF (LDIALOG .EQV. .TRUE.) THEN IF ( (LPROF_1 .EQV. .TRUE.) .OR. (LPROFILE_1 .EQV. .TRUE.) + .OR. (LPROFILE_2 .EQV. .TRUE.) .OR. + (LPROF_2 .EQV. .TRUE.)) THEN LASK=.TRUE. NORM_PROFILE_ANSWER='NO' QUESTION=' internal normalization of profile ? /n'// + 'YES : calculate open and elongation.. /n'// + 'NO : use values given by user' ELSE LASK=.FALSE. QUESTION=' normalization of profile: disabled ' NORM_PROFILE_ANSWER='disabled' ENDIF IF (LASK .EQV. .TRUE.) THEN CALL GETCHAR(MAXLENSTRING,NORM_PROFILE_ANSWER,QUESTION) ENDIF ENDIF WRITE(LOGSTRING,'(A,A)')'normalize profile: ', + NORM_PROFILE_ANSWER CALL LOG_FILE(KLOG,LOGSTRING,0) CALL LOWTOUP(NORM_PROFILE_ANSWER,MAXLENSTRING) IF (INDEX(NORM_PROFILE_ANSWER,'Y') .NE. 0) THEN LNORM_PROFILE=.TRUE. ENDIF C-------------------------------------------------------------------- C set mean and gamma of profile C-------------------------------------------------------------------- IF (LNORM_PROFILE .EQV. .TRUE.) THEN IF (LDIALOG .EQV. .TRUE.) THEN QUESTION=' mean value for profile ? /n'// + ' positve real number (like: 0.05) ' CALL GETCHAR(MAXLENSTRING,PROFILE_EPSILON_ANSWER, + QUESTION) ENDIF CALL LOWTOUP(PROFILE_EPSILON_ANSWER,MAXLENSTRING) CALL STRPOS(PROFILE_EPSILON_ANSWER,IBEG,IEND) CALL READ_REAL_FROM_STRING( + PROFILE_EPSILON_ANSWER(IBEG:IEND),profile_epsilon) IF (LDIALOG .EQV. .TRUE.) THEN QUESTION=' give factor for gap-open/gap-elong? /n'// + ' positve real number (like: 10.0) ' CALL GETCHAR(MAXLENSTRING,PROFILE_GAMMA_ANSWER,QUESTION) ENDIF CALL LOWTOUP(PROFILE_GAMMA_ANSWER,MAXLENSTRING) CALL STRPOS(PROFILE_GAMMA_ANSWER,IBEG,IEND) CALL READ_REAL_FROM_STRING(PROFILE_GAMMA_ANSWER(IBEG:IEND), + PROFILE_GAMMA) WRITE(LOGSTRING,'(A,F6.2)')'mean value for profile: ', + PROFILE_EPSILON CALL LOG_FILE(KLOG,LOGSTRING,0) WRITE(LOGSTRING,'(A,F6.2)')'factor for gap weights: ', + PROFILE_GAMMA CALL LOG_FILE(KLOG,LOGSTRING,0) ELSE PROFILE_EPSILON_ANSWER='ignored' PROFILE_GAMMA_ANSWER='ignored' PROFILE_EPSILON=0.0 PROFILE_GAMMA=0.0 ENDIF C-------------------------------------------------------------------- C SMIN: worst amino acid mismatch C METRIC GOES FROM SMIN TO +1.0 C-------------------------------------------------------------------- IF (LNORM_PROFILE .EQV. .TRUE.) THEN SMIN_ANSWER='IGNORE' ELSE IF (LDIALOG .EQV. .TRUE.) THEN IF ((LPROF_1 .EQV. .TRUE.) .OR. + (LPROFILE_1 .EQV. .TRUE.) + .OR. (LPROFILE_2 .EQV. (.TRUE.)) .OR. + (LPROF_2.EQV. .TRUE.)) THEN SMIN_ANSWER='PROFILE' QUESTION=' SMIN: worst amino acid mismatch ? /n'// + ' real number (like: -0.5) or /n'// + ' 0.0 : NO scaling /n'// + ' PROFILE : take values from profile (NO scaling)' ELSE QUESTION=' SMIN: worst amino acid mismatch ? /n'// + ' real number (like: -0.5) or /n'// + ' 0.0 : NO scaling' WRITE(SMIN_ANSWER,'(F7.2)')SMIN ENDIF CALL GETCHAR(MAXLENSTRING,SMIN_ANSWER,QUESTION) ENDIF ENDIF CALL LOWTOUP(SMIN_ANSWER,MAXLENSTRING) IF (INDEX(SMIN_ANSWER,'PROFILE') .EQ. 0 .AND. + INDEX(SMIN_ANSWER,'IGNORE') .EQ. 0) THEN CALL STRPOS(SMIN_ANSWER,IBEG,IEND) CALL READ_REAL_FROM_STRING(SMIN_ANSWER(IBEG:IEND),SMIN) IF ((LPROF_1 .EQV. .TRUE.) .OR. (LPROFILE_1 .EQV. .TRUE.) + .OR. (LPROFILE_2 .EQV. .TRUE.) .OR. + (LPROF_2 .EQV. .TRUE.)) THEN WRITE(LOGSTRING,'(A)') + '**** WARNING: detect profile for first sequence and/or '// + 'profile(s) for /n'// + ' comparison sequence, but SMIN is '// + 'not set to PROFILE, /n'// + ' rescale Profile-metric ' CALL LOG_FILE(KLOG,LOGSTRING,1) ENDIF ENDIF WRITE(LOGSTRING,'(A,A)')'smin: ',smin_answer CALL LOG_FILE(KLOG,LOGSTRING,0) C-------------------------------------------------------------------- C SMAX: best amino acid match C-------------------------------------------------------------------- IF (LNORM_PROFILE .EQV. .TRUE.) THEN SMAX_ANSWER='IGNORE' ELSE IF (LDIALOG .EQV. .TRUE.) THEN IF ((LPROF_1 .EQV. .TRUE.).OR.(LPROFILE_1 .EQV. .TRUE.) + .OR. (LPROFILE_2 .EQV. .TRUE.) .OR. + (LPROF_2 .EQV. .TRUE.) ) THEN SMAX_ANSWER='PROFILE' QUESTION=' SMAX: best amino acid match ? /n'// + ' real number (like: 1.0) or /n'// + ' 0.0 : NO scaling /n'// + ' PROFILE : take values from profile (NO scaling)' ELSE QUESTION=' SMAX: best amino acid match ? /n'// + ' real number (like: 1.0) or /n'// + ' 0.0 : NO scaling' WRITE(SMAX_ANSWER,'(F7.2)')SMAX ENDIF CALL GETCHAR(MAXLENSTRING,SMAX_ANSWER,QUESTION) ENDIF ENDIF CALL LOWTOUP(SMAX_ANSWER,MAXLENSTRING) IF (INDEX(SMAX_ANSWER,'PROFILE') .EQ. 0 .AND. + INDEX(SMAX_ANSWER,'IGNORE') .EQ. 0 ) THEN CALL STRPOS(SMAX_ANSWER,IBEG,IEND) CALL READ_REAL_FROM_STRING(SMAX_ANSWER(IBEG:IEND),SMAX) IF ((LPROF_1 .EQV. .TRUE.) .OR. (LPROFILE_1 .EQV. .TRUE.) + .OR. (LPROFILE_2 .EQV. .TRUE.) .OR. + (LPROF_2 .EQV. .TRUE.)) THEN WRITE(LOGSTRING,'(A)') + '**** WARNING: detect profile for first sequence and/or '// + 'profile(s) for /n'// + ' comparison sequence, but SMAX is '// + 'not set to PROFILE, /n'// + ' rescale Profile-metric ' CALL LOG_FILE(KLOG,LOGSTRING,1) ENDIF ENDIF WRITE(LOGSTRING,'(A,A)')'smax: ',smax_answer CALL LOG_FILE(KLOG,LOGSTRING,0) C-------------------------------------------------------------------- C GAP-OPENING DELETION-WEIGHT C DEL IS IN UNITS OF ABS(SMIN). C-------------------------------------------------------------------- IF (LNORM_PROFILE .EQV. .TRUE.) THEN OPENWEIGHT_ANSWER='IGNORE' ELSE IF (LDIALOG .EQV. .TRUE.) THEN IF ((LPROF_1 .EQV. .TRUE.).OR.(LPROFILE_1 .EQV. .TRUE.) + .OR. (LPROFILE_2 .EQV. .TRUE.) .OR. + (LPROF_2 .EQV. .TRUE.)) THEN OPENWEIGHT_ANSWER='PROFILE' QUESTION=' DELETION-WEIGHT ? (gap-opening) /n'// + ' real number (like: 3.0) or /n'// + ' PROFILE : get them from the profile' ELSE WRITE(OPENWEIGHT_ANSWER,'(F7.2)')OPEN_1 QUESTION=' DELETION-WEIGHT ? (gap-opening) /n'// + ' real number (like: 3.0)' ENDIF CALL GETCHAR(MAXLENSTRING,OPENWEIGHT_ANSWER,QUESTION) ENDIF ENDIF CALL LOWTOUP(OPENWEIGHT_ANSWER,MAXLENSTRING) IF (INDEX(OPENWEIGHT_ANSWER,'PROFILE') .EQ. 0 .AND. + INDEX(OPENWEIGHT_ANSWER,'IGNORE') .EQ. 0 ) THEN CALL STRPOS(OPENWEIGHT_ANSWER,IBEG,IEND) CALL READ_REAL_FROM_STRING(OPENWEIGHT_ANSWER(IBEG:IEND), + OPEN_1) IF ((LPROF_1 .EQV. .TRUE.) .OR. (LPROFILE_1 .EQV. .TRUE.) + .OR. (LPROFILE_2 .EQV. .TRUE.) .OR. + (LPROF_2 .EQV. .TRUE.)) THEN WRITE(LOGSTRING,'(A)') + '**** WARNING: detect profile for first sequence and/or '// + 'profile(s) for /n'// + ' comparison sequence, but gap-open is not '// + 'set to PROFILE, /n'// + ' will overWRITE PROFILE gap-opening ' CALL LOG_FILE(KLOG,LOGSTRING,1) ENDIF ENDIF WRITE(LOGSTRING,'(A,A)')'gap_open: ',openweight_answer CALL LOG_FILE(KLOG,LOGSTRING,0) C-------------------------------------------------------------------- C GAP-ELONGATION DELETION-WEIGHT C-------------------------------------------------------------------- IF (LNORM_PROFILE .EQV. .TRUE.) THEN ELONGWEIGHT_ANSWER='IGNORE' ELSE IF (LDIALOG .EQV. .TRUE.) THEN IF ((LPROF_1 .EQV. .TRUE.).OR.(LPROFILE_1 .EQV. .TRUE.) + .OR. (LPROFILE_2 .EQV. .TRUE.) .OR. + (LPROF_2 .EQV. .TRUE.)) THEN ELONGWEIGHT_ANSWER='PROFILE' QUESTION=' DELETION-WEIGHT ? (gap-elongation) /n'// + ' real number (like: 0.1) or /n'// + ' PROFILE : get them from the profile' ELSE QUESTION=' DELETION-WEIGHT ? (gap-elongation) /n'// + ' real number (like: 0.1)' WRITE(ELONGWEIGHT_ANSWER,'(F7.2)')ELONG_1 ENDIF CALL GETCHAR(MAXLENSTRING,ELONGWEIGHT_ANSWER,QUESTION) ENDIF ENDIF CALL LOWTOUP(ELONGWEIGHT_ANSWER,MAXLENSTRING) IF (INDEX(ELONGWEIGHT_ANSWER,'PROFILE') .EQ. 0 .AND. + INDEX(ELONGWEIGHT_ANSWER,'IGNORE') .EQ. 0) THEN CALL STRPOS(ELONGWEIGHT_ANSWER,IBEG,IEND) CALL READ_REAL_FROM_STRING(ELONGWEIGHT_ANSWER(IBEG:IEND), + ELONG_1) IF ((LPROF_1 .EQV. .TRUE.) .OR. (LPROFILE_1 .EQV. .TRUE.) + .OR. (LPROFILE_2 .EQV. .TRUE.) .OR. + (LPROF_2 .EQV. .TRUE.)) THEN WRITE(LOGSTRING,'(A)') + '**** WARNING: detect profile for first sequence and/or '// + 'profile(s) for /n'// + ' comparison sequence, but gap-elongation is '// + 'not set to PROFILE, /n'// + ' will overWRITE PROFILE gap-elongation ' CALL LOG_FILE(KLOG,LOGSTRING,1) ENDIF ENDIF WRITE(LOGSTRING,'(A,A)')'gap_elongation: ',elongweight_answer CALL LOG_FILE(KLOG,LOGSTRING,0) C-------------------------------------------------------------------- C CONSERVATION WEIGHTS FOR SEQUENCE 1 C-------------------------------------------------------------------- LASK=.TRUE. IF (LDIALOG .EQV. .TRUE.) THEN IF (LSWISSBASE .EQV. .TRUE.) THEN IF (LPROFILE_1 .EQV. .TRUE.) THEN WEIGHT1_ANSWER='PROFILE' QUESTION=' use conservation weights for first '// + 'sequence? /n'// + 'NO : disable conservation weights /n'// + 'PROFILE : use weights from a MAXHOM-PROFILE' ELSE WEIGHT1_ANSWER='NO' LASK=.FALSE. QUESTION=' conservation weights for 1. sequence :'// + ' DISABLED (need PROFILE or PRE-FILTER)' CALL STRPOS(QUESTION,IBEG,IEND) WRITE(6,*)QUESTION(IBEG:IEND) ENDIF ELSE IF (LPROFILE_1 .EQV. .TRUE.) THEN WEIGHT1_ANSWER='PROFILE' QUESTION=' use conservation weights for first '// + 'sequence? /n'// + 'NO : disable conservation weights /n'// + 'YES : derive them from the sequence '// + 'alignments /n'// + 'PROFILE : use weights from a MAXHOM-PROFILE' ELSE WEIGHT1_ANSWER='NO' QUESTION=' use conservation weights for first '// + ' sequence? /n'// + 'NO : disable conservation weights /n'// + 'YES : derive them from the sequence alignments' ENDIF ENDIF IF (LASK .EQV. .TRUE.) THEN CALL GETCHAR(MAXLENSTRING,WEIGHT1_ANSWER,QUESTION) ENDIF ENDIF WRITE(LOGSTRING,'(A,A)')'conservation-weights for seq 1: ', + WEIGHT1_ANSWER CALL LOG_FILE(KLOG,LOGSTRING,0) CALL LOWTOUP(WEIGHT1_ANSWER,MAXLENSTRING) IF (INDEX(WEIGHT1_ANSWER,'Y') .NE. 0) THEN LCONSERV_1=.TRUE. LPASS2=.TRUE. LCONSIMPORT=.FALSE. ELSE IF (INDEX(WEIGHT1_ANSWER,'NO') .NE. 0) THEN LCONSERV_1=.FALSE. LPASS2=.FALSE. LCONSIMPORT=.FALSE. ELSE IF (INDEX(WEIGHT1_ANSWER,'PROFILE') .NE. 0) THEN IF (LPROFILE_1 .EQV. .TRUE.) THEN LCONSIMPORT=.TRUE. LCONSERV_1=.TRUE. LPASS2=.FALSE. ELSE WRITE(LOGSTRING,'(A)') + '*** WARNING: no profile to import '// + 'weights for sequence 1 ; set weights to 1.0' CALL LOG_FILE(KLOG,LOGSTRING,1) ENDIF ENDIF c IF ((lconserv_1 .eqv. .true.) .and. (.not. lconsimport)) THEN c IF ((lswissbase .eqv. .true.) .or. c + (lnrdbbase .eqv. .true.)) THEN c WRITE(logstring,'(a)')' /n **** HEEEE COOL DOWN /n'// c + 'You want calculate the conservation-weights during a '// c + 'run against a DATABASE /n'// c + 'That would be a two pass run over the DATABASE /n'// c + 'Better get first your weights with a simple HSSP-run /n'// c + '(select FASTA option) and read them afterwards via /n'// c + 'the profile option ' c call log_file(klog,logstring,1) c stop c endif c endif C-------------------------------------------------------------------- C CONSERVATION WEIGHTS FOR SEQUENCE 2 C-------------------------------------------------------------------- LASK=.TRUE. IF (LDIALOG .EQV. .TRUE.) THEN IF ((LPROF_2 .EQV. .TRUE.).OR. + (LPROFILE_2.EQV. .TRUE.)) THEN WEIGHT2_ANSWER='PROFILE' QUESTION= + ' use conservation weights for second sequence ? /n'// + 'NO : disable conservation weights /n'// + 'PROFILE : use weights from a MAXHOM-PROFILE' ELSE WEIGHT2_ANSWER='NO' LASK=.FALSE. QUESTION=' conservation weights for 2. sequence : '// + ' *** disabled (NEED PROFILE ) ***' CALL STRPOS(QUESTION,IBEG,IEND) WRITE(6,*)QUESTION(IBEG:IEND) ENDIF IF (LASK .EQV. .TRUE.) THEN CALL GETCHAR(MAXLENSTRING,WEIGHT2_ANSWER,QUESTION) ENDIF ENDIF WRITE(LOGSTRING,'(A,A)')'conservation-weights for seq 2: ', + WEIGHT2_ANSWER CALL LOG_FILE(KLOG,LOGSTRING,0) CALL LOWTOUP(WEIGHT2_ANSWER,MAXLENSTRING) IF (INDEX(WEIGHT2_ANSWER,'NO').NE.0) THEN LCONSERV_2=.FALSE. ELSE LCONSERV_2=.TRUE. ENDIF C-------------------------------------------------------------------- C 3 way alignment C-------------------------------------------------------------------- LASK=.TRUE. IF (LDIALOG .EQV. .TRUE.) THEN IF (LPROFILE_1 .EQV. .FALSE.) THEN WAY3_ANSWER='NO' QUESTION=' derive a profile and do a profile '// + 'alignment afterwards ? /n'// + 'NO : single scan procedure/n'// + 'YES : do profile alignment' ELSE WAY3_ANSWER='NO' LASK=.FALSE. QUESTION=' 3-way alignment disabled :'// + ' ( already got a PROFILE )' CALL STRPOS(QUESTION,IBEG,IEND) WRITE(6,*)QUESTION(IBEG:IEND) ENDIF IF (LASK .EQV. .TRUE.) THEN CALL GETCHAR(MAXLENSTRING,WAY3_ANSWER,QUESTION) ENDIF ENDIF WRITE(LOGSTRING,'(A,A)')' Way3-alignment: ',way3_answer CALL LOG_FILE(KLOG,LOGSTRING,0) CALL LOWTOUP(WAY3_ANSWER,MAXLENSTRING) IF (INDEX(WAY3_ANSWER,'Y') .NE. 0) THEN L3WAY=.TRUE. ELSE IF (INDEX(WAY3_ANSWER,'NO') .NE. 0) THEN L3WAY=.FALSE. ENDIF C-------------------------------------------------------------------- C INDELs in secondary structure of sequence 1 C-------------------------------------------------------------------- LASK=.TRUE. IF (LDIALOG .EQV. .TRUE.) THEN IF (LDSSP_1 .OR. LPROFILE_1 .OR. LISTOFSEQ_1) THEN QUESTION='allow insertions/deletions in secondary '// + 'structure segments of SEQUENCE 1? [YES/NO] ' ELSE INDEL_ANSWER_1='YES' LASK=.FALSE. QUESTION='INDEL in secondary structure of SEQUENCE 1:'// + '**** disabled (need DSSP-file or PROFILES) ****' CALL STRPOS(QUESTION,IBEG,IEND) WRITE(6,*)QUESTION(IBEG:IEND) ENDIF IF (LASK ) THEN CALL GETCHAR(MAXLENSTRING,INDEL_ANSWER_1,QUESTION) ENDIF ENDIF CALL LOWTOUP(INDEL_ANSWER_1,MAXLENSTRING) IF (INDEX(INDEL_ANSWER_1,'Y') .NE. 0) THEN LINSERT_1=.TRUE. ELSE IF (INDEX(INDEL_ANSWER_1,'NO') .NE.0 ) THEN LINSERT_1=.FALSE. ELSE WRITE(6,*)'*** OPTION UNKNOWN ***' LINSERT_1=.TRUE. ENDIF IF (LINSERT_1) THEN WRITE(LOGSTRING,'(A)') + 'INDEL(S) in secondary structure of sequence 1 ALLOWED' ELSE WRITE(LOGSTRING,'(A)') + 'INDEL(s) in secondary structure of sequence 1 disallowed' ENDIF CALL LOG_FILE(KLOG,LOGSTRING,0) IF ( LPROFILE_1 .AND. LINSERT_1) THEN WRITE(LOGSTRING,'(A)')'** INFO: detect PROFILE for '// + 'sequence 1 and /n'// + ' INDEL in secondary stucture segments '// + 'of SEQUENCE 1 allowed. /n'// + ' If the PROFILE has high GAPOPEN values '// + 'for secondary /n'// + ' structure segments, this will lead to '// + '"gap-free" /n'// + ' secondary structure segments' CALL LOG_FILE(KLOG,LOGSTRING,1) WRITE(6,*) ENDIF C-------------------------------------------------------------------- C INDELs in secondary structure of sequence 2 C-------------------------------------------------------------------- LASK=.TRUE. IF (LDIALOG) THEN IF (LPROFILE_2 .OR. LDSSP_2 .OR. LISTOFSEQ_2) THEN QUESTION='allow insertions/deletions in secondary '// + 'structure segments of SEQUENCE 2 ? [YES/NO] ' ELSE INDEL_ANSWER_2='YES' LASK=.FALSE. QUESTION='INDEL in secondary structure of SEQUENCE 2:'// + ' **** disabled (need DSSP-file or PROFILES) ****' CALL STRPOS(QUESTION,IBEG,IEND) WRITE(6,*)QUESTION(IBEG:IEND) ENDIF IF (LASK ) THEN CALL GETCHAR(MAXLENSTRING,INDEL_ANSWER_2,QUESTION) ENDIF ENDIF CALL LOWTOUP(INDEL_ANSWER_2,MAXLENSTRING) IF (INDEX(INDEL_ANSWER_2,'Y') .NE.0 ) THEN LINSERT_2=.TRUE. ELSE IF (INDEX(INDEL_ANSWER_2,'N') .NE.0 ) THEN LINSERT_2=.FALSE. ELSE WRITE(6,*)'*** OPTION UNKNOWN ***' LINSERT_2=.TRUE. ENDIF IF (LINSERT_2) THEN WRITE(LOGSTRING,'(A)') + 'INDEL(s) in secondary structure of SEQUENCE 2 allowed' ELSE WRITE(LOGSTRING,'(A)') + 'INDEL(s) in secondary structure of SEQUENCE 2 disallowed' ENDIF CALL LOG_FILE(KLOG,LOGSTRING,0) IF ( LPROFILE_2 .AND. LINSERT_2) THEN WRITE(LOGSTRING,'(A)')'** WARNING: detect PROFILE for '// + 'sequence 1 and/or sequence 2 and /n'// + ' INDEL in secondary stucture segments '// + 'allowed. /n'// + ' If the PROFILE has high GAPOPEN values '// + 'for secondary /n'// + ' structure segments, this will lead to '// + '"gap-free" /n'// + ' secondary structure segments' CALL LOG_FILE(KLOG,LOGSTRING,1) WRITE(6,*) ENDIF C-------------------------------------------------------------------- C suboptimal by backward SETMATRIX C-------------------------------------------------------------------- IF (LDIALOG) THEN CALL GETCHAR(MAXLENSTRING,BACKWARD_ANSWER, + ' suboptimal traces and reliability score ? ') ENDIF WRITE(LOGSTRING,'(A,A)') + 'suboptimal traces and reliability score: ',backward_answer CALL LOG_FILE(KLOG,LOGSTRING,0) CALL LOWTOUP(BACKWARD_ANSWER,MAXLENSTRING) IF (INDEX(BACKWARD_ANSWER,'Y') .NE. 0) THEN LBACKWARD=.TRUE. IF (MAXMAT .LT. 10) THEN WRITE(6,*)' *** FATAL ERROR: need big array for '// + 'matrix, but MAXMAT is too small: ',maxmat STOP ENDIF ENDIF C-------------------------------------------------------------------- C filter for suboptimals C-------------------------------------------------------------------- IF (LDIALOG) THEN LASK=.TRUE. IF (LBACKWARD) THEN QUESTION=' filter range for suboptimals ? [real number]' ELSE LASK=.FALSE. QUESTION=' *** SUBOPTIMAL TRACES: disabled' CALL STRPOS(QUESTION,IBEG,IEND) WRITE(6,*)QUESTION(IBEG:IEND) ENDIF IF (LASK ) THEN CALL GETCHAR(MAXLENSTRING,FILTER_ANSWER,QUESTION) ENDIF ENDIF WRITE(LOGSTRING,'(A,A)') + 'filter range for suboptimal traces: ',filter_answer CALL LOG_FILE(KLOG,LOGSTRING,0) CALL STRPOS(FILTER_ANSWER,IBEG,IEND) CALL READ_REAL_FROM_STRING(FILTER_ANSWER(IBEG:IEND), + FILTER_VAL) C-------------------------------------------------------------------- C number of alignments per protein C-------------------------------------------------------------------- IF (LDIALOG) THEN CALL GETCHAR(MAXLENSTRING,NBEST_ANSWER, + ' number of alignments per protein ?') ENDIF WRITE(LOGSTRING,'(A,A)') + 'number of alignments per protein: ',nbest_answer CALL LOG_FILE(KLOG,LOGSTRING,0) CALL STRPOS(NBEST_ANSWER,IBEG,IEND) CALL READ_INT_FROM_STRING(NBEST_ANSWER(IBEG:IEND),NBEST) C-------------------------------------------------------------------- C maximum number of reported alignments C-------------------------------------------------------------------- IF (LDIALOG) THEN NGLOBALHITS=MAXHITS WRITE(QUESTION,'(A,I6,A)')' maximum number of reported '// + 'alignments ? (maximum=',maxhits,')' CALL GETCHAR(MAXLENSTRING,NGLOBALHITS_ANSWER,QUESTION) ENDIF CALL STRPOS(NGLOBALHITS_ANSWER,IBEG,IEND) CALL READ_INT_FROM_STRING(NGLOBALHITS_ANSWER(IBEG:IEND), + NGLOBALHITS) IF (NGLOBALHITS .GT. MAXHITS) THEN WRITE(LOGSTRING,'(A)')'*** WARNING: NGLOBALHITS '// + '.GT. MAXHITS, NGLOBALHITS is set to maximum' CALL LOG_FILE(KLOG,LOGSTRING,1) NGLOBALHITS=MAXHITS ENDIF WRITE(LOGSTRING,'(A,I6)') + 'maximum number of reported alignments ',nglobalhits CALL LOG_FILE(KLOG,LOGSTRING,0) C-------------------------------------------------------------------- C threshold C-------------------------------------------------------------------- IF (LDIALOG) THEN IF (LSWISSBASE .OR. LNRDBBASE) THEN THRESHOLD_ANSWER='formula-5' ENDIF IF (LNEWCURVE) THEN CALL GETCHAR(MAXLENSTRING,THRESHOLD_ANSWER, + ' select alignments according to BR 1999 threshold ? /n'// + 'ALL : NO threshold /n'// + 'VALUE=x : absolute value ; x=real value /n'// + 'CUT-x : threshold is x percent of the maximal /n'// + ' possible value (like: cut-10 ) /n'// + '"filename" : import specification from file /n'// + 'formula : use original HSSP-threshold curve /n'// + 'formula+x : use HSSP-theshold value plus "x" percent /n'// + 'formula-x : use HSSP-theshold value minus "x" percent') ELSE CALL GETCHAR(MAXLENSTRING,THRESHOLD_ANSWER, + ' select alignments according to RS 1989 threshold ? /n'// + 'ALL : NO threshold /n'// + 'VALUE=x : absolute value ; x=real value /n'// + 'CUT-x : threshold is x percent of the maximal /n'// + ' possible value (like: cut-10 ) /n'// + '"filename" : import specification from file /n'// + 'formula : use original HSSP-threshold curve /n'// + 'formula+x : use HSSP-theshold value plus "x" percent /n'// + 'formula-x : use HSSP-theshold value minus "x" percent') ENDIF ENDIF WRITE(LOGSTRING,'(A,A)')'threshold: ',threshold_answer CALL LOG_FILE(KLOG,LOGSTRING,0) ISOSIGFILE=THRESHOLD_ANSWER CALL LOWTOUP(THRESHOLD_ANSWER,MAXLENSTRING) IF (INDEX(THRESHOLD_ANSWER,'FORMULA') .NE. 0) THEN LFORMULA=.TRUE. LTHRESHOLD=.TRUE. LALL=.FALSE. ELSE IF (INDEX(THRESHOLD_ANSWER,'ALL') .NE. 0) THEN LFORMULA=.FALSE. LTHRESHOLD=.FALSE. LALL=.TRUE. ELSE IF (INDEX(THRESHOLD_ANSWER,'CUT') .NE. 0) THEN LFORMULA=.FALSE. LTHRESHOLD=.FALSE. LALL=.FALSE. IBEG=INDEX(THRESHOLD_ANSWER,'-') IF (IBEG .NE. 0) THEN CALL STRPOS(THRESHOLD_ANSWER,I,IEND) CALL READ_INT_FROM_STRING( + THRESHOLD_ANSWER(IBEG+1:IEND),I) CUTVALUE1=FLOAT(I) WRITE(6,*)' CUTVALUE 1 IS SET TO: ',CUTVALUE1 ELSE WRITE(6,*)' NO VALUE SPECIFIED; SET TO 10' CUTVALUE1=10.0 ENDIF ELSE IF (INDEX(THRESHOLD_ANSWER,'VALUE') .NE. 0) THEN LFORMULA=.FALSE. LTHRESHOLD=.FALSE. LALL=.TRUE. IBEG=INDEX(THRESHOLD_ANSWER,'=')+1 CALL READ_REAL_FROM_STRING(THRESHOLD_ANSWER(IBEG:), + CUTVALUE2) WRITE(6,*)' CUTVALUE 2 IS SET TO: ',CUTVALUE2 ENDIF IF (LFORMULA) THEN I=INDEX(ISOSIGFILE,'+') J=INDEX(ISOSIGFILE,'-') IF (I .NE. 0) THEN CALL STRPOS(ISOSIGFILE,IBEG,IEND) CALL READ_INT_FROM_STRING(ISOSIGFILE(I+1:IEND),ISAFE) WRITE(6,'(A,I3,A)')' use formula value plus', + ISAFE,' percent' ELSE IF (J .NE. 0) THEN CALL STRPOS(ISOSIGFILE,IBEG,IEND) CALL READ_INT_FROM_STRING(ISOSIGFILE(J:IEND),ISAFE) WRITE(6,'(A,I3,A)')' use formula value ', + isafe, ' percent' ELSE ISAFE=0 ENDIF ELSE IF (LTHRESHOLD) THEN CALL GETHSSPCUT(KISO,MAXCUTOFFSTEPS,ISOSIGFILE,ISOLEN, + ISOIDE,NSTEP) ENDIF C-------------------------------------------------------------------- C sort mode C-------------------------------------------------------------------- IF (LDIALOG) THEN IF (LSWISSBASE .OR. LNRDBBASE) THEN SORTMODE_ANSWER='VALUE' ENDIF CALL GETCHAR(MAXLENSTRING,SORTMODE_ANSWER, + ' alignment sorting ? '// + '/n NO : no sorting (preserve order of a list)'// + '/n DISTANCE : distance in %identity from HSSP-curve'// + '/n VALUE : internal score'// + '/n SIGMA : internal score / sigma'// + '/n SIM/WSIM : similarity / weighted by conservation weight'// + '/n IDENTITY : %identity'// + '/n ZSCORE : Gribskov Z-score'// + '/n VALFORM : apply the HSSP-formula to values'// + '/n VALPER : internal score per residue') ENDIF CALL LOWTOUP(SORTMODE_ANSWER,MAXLENSTRING) IF (INDEX(SORTMODE_ANSWER,'DIST') .NE. 0) THEN CSORTMODE='DISTANCE' ELSE IF (INDEX(SORTMODE_ANSWER,'VALU') .NE. 0) THEN CSORTMODE='VALUE' ELSE IF (INDEX(SORTMODE_ANSWER,'WSIM') .NE. 0) THEN CSORTMODE='WSIM' ELSE IF (INDEX(SORTMODE_ANSWER,'SIM') .NE. 0) THEN CSORTMODE='SIM' ELSE IF (INDEX(SORTMODE_ANSWER,'SIGMA') .NE. 0) THEN CSORTMODE='SIGMA' ELSE IF (INDEX(SORTMODE_ANSWER,'IDE') .NE. 0) THEN CSORTMODE='IDENTITY' ELSE IF (INDEX(SORTMODE_ANSWER,'VALP') .NE. 0) THEN CSORTMODE='VALPER' ELSE IF (INDEX(SORTMODE_ANSWER,'VALF') .NE. 0) THEN CSORTMODE='VALFORM' ELSE IF (INDEX(SORTMODE_ANSWER,'ZSCO') .NE. 0) THEN CSORTMODE='ZSCORE' ELSE IF (INDEX(SORTMODE_ANSWER,'NO') .NE. 0) THEN CSORTMODE='NO' ELSE CSORTMODE='DISTANCE' WRITE(LOGSTRING,'(A,A)')'**** WARNING: SORT_MODE NOT '// + 'KNOWN ; will use: ',CSORTMODE CALL LOG_FILE(KLOG,LOGSTRING,1) ENDIF WRITE(LOGSTRING,'(A,A)')'sort mode: ',csortmode CALL LOG_FILE(KLOG,LOGSTRING,0) IF (.NOT. LCONSERV_1 .AND. CSORTMODE .EQ. 'WSIM' ) THEN WRITE(LOGSTRING,'(a)')'*** WARNING: conservation weights'// + 'option is off, but sort option is "WSIM" ; '// + 'will switch to sortmode "SIM"' CALL LOG_FILE(KLOG,LOGSTRING,1) ENDIF C-------------------------------------------------------------------- C option for HSSP output C-------------------------------------------------------------------- IF (LDIALOG) THEN CALL GETCHAR(LEN(HSSP_ANSWER),HSSP_ANSWER,' HSSP output'// + '/n NO : no HSSP-file output'// + '/n YES : produce HSSP-file with default name'// + '/n filename : produce HSSP-file with given name') ENDIF TEMPNAME=' ' TEMPNAME(1:)=HSSP_ANSWER CALL LOWTOUP(HSSP_ANSWER,MAXLENSTRING) IF ( (INDEX(HSSP_ANSWER,'/') .NE. 0 ) .OR. + (INDEX(HSSP_ANSWER,'.') .NE. 0 ) ) THEN HSSP_ANSWER=' ' HSSP_ANSWER=TEMPNAME WRITE(LOGSTRING,'(A,A)')'HSSP-output: ',hssp_answer LHSSP=.TRUE. ELSE IF ( INDEX(HSSP_ANSWER,'YES') .NE. 0) THEN WRITE(LOGSTRING,'(A)')'HSSP-output: YES' LHSSP=.TRUE. ELSE IF ( INDEX(HSSP_ANSWER,'NO') .NE. 0) THEN WRITE(LOGSTRING,'(A)')'HSSP-output: NO' LHSSP=.FALSE. ELSE HSSP_ANSWER=' ' HSSP_ANSWER=TEMPNAME WRITE(LOGSTRING,'(A,A)')'HSSP-output: ',hssp_answer LHSSP=.TRUE. ENDIF CALL LOG_FILE(KLOG,LOGSTRING,0) C-------------------------------------------------------------------- LHSSP_LONG_ID=.FALSE. IF (LDIALOG) THEN CALL GETCHAR(MAXLENSTRING,HSSP_FORMAT_ANSWER, + ' HSSP file with long ID ? '// + '/n NO : normal HSSP-file output'// + '/n YES : HSSP-file with long Protein-identifiers') ENDIF CALL LOWTOUP(HSSP_FORMAT_ANSWER,MAXLENSTRING) IF ( INDEX(HSSP_FORMAT_ANSWER,'YES') .NE. 0) THEN WRITE(LOGSTRING,'(a)')'HSSP-long ID: YES' LHSSP_LONG_ID=.TRUE. HSSP_FORMAT_ANSWER='YES' ELSE IF ( INDEX(HSSP_FORMAT_ANSWER,'NO') .NE. 0) THEN WRITE(LOGSTRING,'(A)')'HSSP-long ID: NO' LHSSP_LONG_ID=.FALSE. ELSE WRITE(LOGSTRING,'(A,A)')'HSSP-long ID: ',hssp_format_answer LHSSP_LONG_ID=.TRUE. ENDIF CALL LOG_FILE(KLOG,LOGSTRING,0) C-------------------------------------------------------------------- IF (LDIALOG) THEN CALL GETCHAR(MAXLENSTRING,SAMESEQ_ANSWER, + ' show 100% identical hits ? ') ENDIF CALL LOWTOUP(SAMESEQ_ANSWER,MAXLENSTRING) IF (INDEX(SAMESEQ_ANSWER,'Y').NE.0) THEN LSHOW_SAMESEQ=.TRUE. ELSE LSHOW_SAMESEQ=.FALSE. ENDIF C-------------------------------------------------------------------- C GET GCG SIMILARITY MATRIX (HSSP-variability) C-------------------------------------------------------------------- C call getchar(MAXLENSTRING,metric_hssp_var, C + 'METRIC FILE (HSSP-VARIABILITY) ?') WRITE(LOGSTRING,'(A,A)') + 'HSSP-variability metric: ',METRIC_HSSP_VAR CALL LOG_FILE(KLOG,LOGSTRING,0) C-------------------------------------------------------------------- C get release notes of EMBL/SWISSPROT C-------------------------------------------------------------------- WRITE(LOGSTRING,'(A,A)')'SWISSPROT release notes: ',relnotes CALL LOG_FILE(KLOG,LOGSTRING,0) C-------------------------------------------------------------------- C compare 3-D-structures if both are known C-------------------------------------------------------------------- IF (LDIALOG) THEN LASK=.TRUE. IF (LISTOFSEQ_1 .OR. LDSSP_1 ) THEN QUESTION=' compare 3d-structures if both are known?/n'// + 'YES or NO' ELSE LASK=.FALSE. QUESTION=' compare 3d-structures :'// + ' *** disabled (NEED STRUCTURE INFORMATION) ***' CALL STRPOS(QUESTION,IBEG,IEND) WRITE(6,*)QUESTION(IBEG:IEND) ENDIF IF (LASK ) THEN CALL GETCHAR(MAXLENSTRING,COMPARE_ANSWER,QUESTION) ENDIF ENDIF CALL LOWTOUP(COMPARE_ANSWER,MAXLENSTRING) IF (INDEX(COMPARE_ANSWER,'Y') .NE. 0) THEN LCOMPSTR=.TRUE. WRITE(LOGSTRING,'(A)')'3D superposition: YES' ELSE WRITE(LOGSTRING,'(A)')'3D superposition: NO' ENDIF CALL LOG_FILE(KLOG,LOGSTRING,0) C-------------------------------------------------------------------- C path for Brookhaven-files in case of private files C-------------------------------------------------------------------- IF (LDIALOG) THEN IF (LCOMPSTR) THEN QUESTION=' give path for the Brookhaven files ' LASK=.TRUE. ELSE PDBPATH='OPTION DISABLED' LASK=.FALSE. PDBPATH_ANSWER='OPTION DISABLED' QUESTION=' path for Brookhaven files : ** disabled **' CALL STRPOS(QUESTION,IBEG,IEND) WRITE(6,*)QUESTION(IBEG:IEND) ENDIF IF (LASK ) THEN CALL GETCHAR(MAXLENSTRING,PDBPATH_ANSWER,QUESTION) ENDIF ENDIF IF (LCOMPSTR) THEN PDBPATH=PDBPATH_ANSWER WRITE(LOGSTRING,'(A,A)')'Brookhaven files are in: ',pdbpath CALL LOG_FILE(KLOG,LOGSTRING,0) ENDIF C-------------------------------------------------------------------- C OUTPUT FILES C-------------------------------------------------------------------- IF (LDIALOG) THEN QUESTION=' WRITE profile for first sequence ? /n'// + ' NO : disable profile output /n'// + ' YES : WRITE profile with default filename /n'// + ' "filename": WRITE profile in "filename"' CALL GETCHAR(MAXLENSTRING,PROFILEOUT_ANSWER,QUESTION) ENDIF PROFILEOUT=PROFILEOUT_ANSWER LWRITEPROFILE=.TRUE. CALL LOWTOUP(PROFILEOUT_ANSWER,MAXLENSTRING) CALL STRPOS(PROFILEOUT_ANSWER,IBEG,IEND) IF (PROFILEOUT_ANSWER(IBEG:IEND) .EQ. 'NO') THEN LWRITEPROFILE=.FALSE. PROFILEOUT='NO' ELSE IF (PROFILEOUT_ANSWER(IBEG:IEND) .EQ. 'YES') THEN LWRITEPROFILE=.TRUE. PROFILEOUT='YES' ENDIF IF ( (L3WAY .EQV. .TRUE.) .AND. + (LWRITEPROFILE .EQV. .FALSE.) ) THEN LWRITEPROFILE=.TRUE. PROFILEOUT='YES' WRITE(6,*)' you want to do an 3-way alignment or ?' WRITE(6,*)' profile-out option is set now to: true' ENDIF WRITE(LOGSTRING,'(A,A)')'profile output: ',profileout CALL LOG_FILE(KLOG,LOGSTRING,0) C-------------------------------------------------------------------- LSTRIP=.TRUE. LSTRIP_LONG=.FALSE. IF (LDIALOG) THEN IF (LSWISSBASE .OR. LNRDBBASE) THEN STRIPFILE_ANSWER='NO' ENDIF CALL GETCHAR(MAXLENSTRING,STRIPFILE_ANSWER, + 'STRIP file name ? [NO=no strip file] ') ENDIF IF ( INDEX (STRIPFILE_ANSWER,'long') .GT. 0) THEN LSTRIP_LONG=.TRUE. STRIPFILE_ANSWER='YES' ENDIF WRITE(LOGSTRING,'(A,A)')'strip output file: ',stripfile_answer CALL LOG_FILE(KLOG,LOGSTRING,0) STRIPFILE=STRIPFILE_ANSWER CALL LOWTOUP(STRIPFILE_ANSWER,MAXLENSTRING) CALL STRPOS(STRIPFILE_ANSWER,IBEG,IEND) IF (STRIPFILE_ANSWER(IBEG:IEND) .EQ. 'NO') THEN LSTRIP=.FALSE. ENDIF c-------------------------------------------------------------------- LONG_OUT=.TRUE. IF (LDIALOG) THEN IF (LSWISSBASE .OR. LNRDBBASE) THEN LONG_OUTPUT_ANSWER='NO' ENDIF CALL GETCHAR(MAXLENSTRING,LONG_OUTPUT_ANSWER, + 'long_output file name ? [NO=no output file] ') ENDIF WRITE(LOGSTRING,'(A,A)')'long output file: ', + long_output_answer CALL LOG_FILE(KLOG,LOGSTRING,0) LONGFILE=LONG_OUTPUT_ANSWER CALL LOWTOUP(LONG_OUTPUT_ANSWER,MAXLENSTRING) CALL STRPOS(LONG_OUTPUT_ANSWER,IBEG,IEND) IF (LONG_OUTPUT_ANSWER(IBEG:IEND) .EQ. 'NO') THEN LONG_OUT=.FALSE. ENDIF C-------------------------------------------------------------------- LTRACE=.TRUE. IF (LDIALOG) THEN CALL GETCHAR(MAXLENSTRING,PLOTFILE_ANSWER, + 'Trace plot file name ? [NO=no plot file] ') ENDIF WRITE(LOGSTRING,'(A,A)')'TRACE plot file: ',plotfile_answer CALL LOG_FILE(KLOG,LOGSTRING,0) CALL STRPOS(PLOTFILE_ANSWER,IBEG,IEND) PLOTFILE=PLOTFILE_ANSWER(IBEG:IEND) CALL LOWTOUP(PLOTFILE_ANSWER,MAXLENSTRING) IF (PLOTFILE_ANSWER(IBEG:IEND) .EQ. 'NO') THEN LTRACE=.FALSE. ELSE IF (PLOTFILE_ANSWER(IBEG:IEND) .EQ. 'YES') THEN LTRACE=.TRUE. ENDIF IF (COMMANDFILE_ANSWER .NE. 'NO' ) THEN CALL WRITE_MAXHOM_COM(CFILTER) STOP ENDIF c======================================================================= C INIT C======================================================================= IF (LISTOFSEQ_1) THEN WRITE(LOGSTRING,'(A,A)')'list of sequences for first '// + 'sequence: ',listfile_1 CALL LOG_FILE(KLOG,LOGSTRING,1) CALL OPEN_FILE(KLIS1,LISTFILE_1,'OLD,READONLY',LERROR) ENDIF CPARALLEL C if idproc eq host: ENDIF C====================================================================== 100 IF (IDPROC .EQ. ID_HOST) THEN WRITE(6,*)'============================================='// + '==========================' WRITE(6,*)'======================== START OF COMPARISON '// + '==========================' C total number of alignments for all sequences, no of alignments above C threshold (cons-weight) NALIGN=0 IALIGNOLD=0 LALIOVERFLOW=.FALSE. C record pointer for binary tempfile IRECORD=0 LDSSP_1=.FALSE. CSQ_1=' ' HEADER_1=' ' COMPND_1=' ' AUTHOR_1=' ' SOURCE_1=' ' ACCESSION_2=' ' PDBREF_2=' ' DO I=1,MAXSQ CRESID(I)=' ' LSTRUC_1(I)=0 LACC_1(I)=0 PDBNO_1(I)=0 BP1_1(I)=0 BP2_1(I)=0 ENDDO CSYMBOL=' ' CALL INIT_CHAR_ARRAY(1,MAXSQ,COLS_1,CSYMBOL) CALL INIT_CHAR_ARRAY(1,MAXSQ,CHAINID_1,CSYMBOL) CALL INIT_CHAR_ARRAY(1,MAXSQ,PREDSTR,CSYMBOL) CALL INIT_CHAR_ARRAY(1,MAXSQ,SHEETLABEL_1,CSYMBOL) CALL INIT_CHAR_ARRAY(1,MAXSQ,PREDSTRCORR,CSYMBOL) CSYMBOL='U' CALL INIT_CHAR_ARRAY(1,MAXSQ,STRUC_1,CSYMBOL) IF (LISTOFSEQ_1) THEN READ(KLIS1,'(A)',END=1000)FILESEQ CALL STRPOS(FILESEQ,IBEG,IEND) NAME_1=FILESEQ(:IEND) ENDIF C all chains wanted from DSSP data set KCHAIN=0 CHAINREMARK=' ' I=INDEX(NAME_1,'!')-1 IF (I .GT. 0) THEN NSELECT=1 IEND=LEN(NAME_1) DO J=IEND,I+1,-1 IF (NAME_1(J:J) .EQ. ',')NSELECT=NSELECT+1 ENDDO WRITE(6,*)' use only',nselect,' chain(s) ',name_1(i:) CHAINREMARK(1:)=NAME_1 ELSE I=LEN(NAME_1) ENDIF CALL GETPIDCODE(NAME_1(1:I),HSSPID_1) WRITE(LOGSTRING,'(A,A)')'sequence file 1: ',name_1 CALL LOG_FILE(KLOG,LOGSTRING,0) CALL CHECKFORMAT(KGETSEQ,NAME_1,SEQFORMAT,LERROR) LPROFILE_1=INDEX(SEQFORMAT,'PROFILE').NE.0 IF (INDEX(SEQFORMAT,'DSSP').NE.0 .OR. + INDEX(SEQFORMAT,'PROFILE-DSSP') .NE.0) THEN LDSSP_1=.TRUE. ENDIF NCHAIN=1 IF (LPROFILE_1) THEN CALL READPROFILE(KPROF,NAME_1,MAXSQ,NTRANS,TRANS,LDSSP_1, + N1,NCHAIN,HSSPID_1,HEADER_1,COMPND_1, + SOURCE_1,AUTHOR_1,XSMIN,XSMAX,XMAPLOW, + XMAPHIGH,PROFILEMETRIC,PDBNO_1,CHAINID_1, + CSQ_1_ARRAY,STRUC_1,NSURF_1,COLS_1, + SHEETLABEL_1,BP1_1,BP2_1,NOCC_1,GAPOPEN_1, + GAPELONG_1,CONSWEIGHT_1,SIMMETRIC_1, + MAXBOX,NBOX_1,PROFILEBOX_1) ELSE IF (LDSSP_1) THEN IF (CHAINREMARK .EQ. ' ') THEN CALL SELECT_UNIQUE_CHAIN(KGETSEQ,NAME_1,LINE) ENDIF CALL GETDSSPFORHSSP(KGETSEQ,NAME_1,MAXSQ,CHAINREMARK, + BRKID_1,HEADER_1,COMPND_1,SOURCE_1, + AUTHOR_1,N1,LRES,NCHAIN,KCHAIN,PDBNO_1, + CHAINID_1,CSQ_1_ARRAY,STRUC_1,COLS_1, + BP1_1,BP2_1,SHEETLABEL_1,NSURF_1) ELSE CALL GET_SEQ(KGETSEQ,NAME_1,TRANS,CTEMP,COMPND_1, + ACCESSION_1, + PDBREFLINE,PDBNO_1,N1,CSQ_1,STRUC_1_STRING, + NSURF_1,LTRUNCATED,LERROR) DO I=1,N1 CSQ_1_ARRAY(I)=CSQ_1(I:I) ENDDO IF (STRUC_1_STRING .EQ. ' ') THEN DO I=1,N1 STRUC_1(I)='U' ENDDO ELSE DO I=1,N1 STRUC_1(I)=STRUC_1_STRING(I:I) ENDDO ENDIF C CALL GETSEQ(KGETSEQ,MAXSQ,N1,CRESID,CSQ_1_ARRAY,STRUC_1, C + NSURF_1,LDSSP_1,NAME_1,COMPND_1,ACCESSION_1, c + pdbrefline,klog,trans,ntrans,kchain,nchain,ctemp) C convert cresid to pdb-number and chain identifier, C used in 3d superposition IF (LDSSP_1) THEN DO I=1,N1 READ(CRESID(I),'(I5,A)')PDBNO_1(I),CHAINID_1(I) ENDDO ENDIF ENDIF C ERROR DURING READ IF (N1.EQ.0) THEN WRITE(LOGSTRING,'(A,A)')'*** ERROR: READ ERROR FOR: ', + name_1 CALL LOG_FILE(KLOG,LOGSTRING,1) GOTO 900 ENDIF CALL SELECT_PDB_POINTER(KREF,DSSP_PATH,PDBREFLINE,PDBREF_1) NCHAIN_1=NCHAIN NCHAINUSED=1 DO I=1,N1 CSQ_1(I:I)=CSQ_1_ARRAY(I) IF (CSQ_1(I:I) .EQ. '!')NCHAINUSED=NCHAINUSED+1 ENDDO WRITE(6,*)' nchainused: ',nchainused C build the name for the profile-output IF (LWRITEPROFILE .AND. PROFILEOUT_ANSWER .EQ. 'YES' ) THEN CALL CONCAT_STRINGS(HSSPID_1,'.profile',profileout) ENDIF C build the name for the strip-output IF (LSTRIP .AND. STRIPFILE_ANSWER .EQ. 'YES' ) THEN CALL CONCAT_STRINGS(HSSPID_1,'.strip',stripfile) ENDIF C build the name for the long_output IF (LONG_OUT .AND. LONG_OUTPUT_ANSWER .EQ. 'YES' ) THEN CALL CONCAT_STRINGS(HSSPID_1,'.x',longfile) ENDIF C build the name for the dotplot-output IF (LTRACE .AND. PLOTFILE_ANSWER .EQ. 'YES' ) THEN CALL CONCAT_STRINGS(HSSPID_1,'.trace',plotfile) ENDIF IF (LTRACE) CALL DEL_OLDFILE(KPLOT,PLOTFILE) c init the conservation weights for sequence 1 IF (LCONSERV_1 .AND. .NOT. LCONSIMPORT ) THEN WRITE(6,*)' CALL SETCONSERVATION' CALL SETCONSERVATION(METRIC_HSSP_VAR) WRITE(LOGSTRING,'(a,a)') + 'metric for conservation weights :',metric_hssp_var CALL LOG_FILE(KLOG,LOGSTRING,0) ENDIF C overWRITE gap-open if wanted IF (OPENWEIGHT_ANSWER .NE. 'PROFILE' ) THEN DO I=1,MAXSQ GAPOPEN_1(I)=OPEN_1 ENDDO WRITE(6,*)' overWRITE gap-open penalty with: ',open_1 ENDIF c set gap-open to a high value in secondary structure segments IF (.NOT. LINSERT_1) THEN WRITE(6,*)' CALL PUNISHGAP' CALL PUNISH_GAP(N1,STRUC_1,'HE',PUNISH,GAPOPEN_1) ENDIF c overWRITE gap-elongation if wanted IF (ELONGWEIGHT_ANSWER .NE. 'PROFILE' ) THEN DO I=1,MAXSQ GAPELONG_1(I)=ELONG_1 ENDDO WRITE(6,*)' overWRITE gap-elongation penalty with: ', + elong_1 ENDIF c get the beginning and end of a profile, if boxes are specified IPROFBEG=1 IPROFEND=N1 IF (LPROFILE_1) THEN I=1 J=1 DO WHILE(SIMMETRIC_1(I,J) .EQ. 0) J=J+1 IF (J .GT. NTRANS) THEN J=1 I=I+1 IF (I .GT. N1) THEN WRITE(LOGSTRING,'(A)') + '*** ERROR: the complete profile is 0.0 ' CALL LOG_FILE(KLOG,LOGSTRING,1) ENDIF ENDIF ENDDO IPROFBEG=I I=N1 J=1 DO WHILE (SIMMETRIC_1(I,J).EQ.0) J=J+1 IF (J.GT.NTRANS) THEN J=1 I=I-1 ENDIF ENDDO IPROFEND=I IF ( (IPROFBEG .NE. 1) .OR. (IPROFEND .NE. N1) ) THEN WRITE(6,*)'INFO: start/end set to: ',iprofbeg,iprofend ENDIF ENDIF CTemp CALL REVERSEAASEQ (MAXSQ,N1,CSQ_1_array,STRUC_1,NSURF_1,LDSSP_1 ) C IF (LDSSP_1) THEN CALL LOWER_TO_CYS(CSQ_1,N1) ENDIF CALL SEQ_TO_INTEGER(CSQ_1,LSQ_1,N1,TRANSPOS) CALL GETCHAINBREAKS(N1,LSQ_1,STRUC_1,TRANS,NBREAK_1, + IBREAKPOS_1) CAUTION LSTRUC IS NOT JUST 3 STATES IF (LDSSP_1) THEN CALL STR_TO_INT(N1,STRUC_1,LSTRUC_1,STRTRANS) CALL STR_TO_CLASS(MAXSTRSTATES,STR_CLASSES,N1,STRUC_1, + STRCLASS_1,LSTRCLASS_1) CALL ACC_TO_INT(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_1,NIOSTATES_1,IORANGE, + N1,LSQ_1,LSTRCLASS_1,NSURF_1,LACC_1) ELSE I=INDEX(STRTRANS,'U') CALL INIT_INT_ARRAY(1,N1,LSTRUC_1,I) DO I=1,MAXSTRSTATES IF ( INDEX(STR_CLASSES(I),STRUC_1(1)) .NE. 0) THEN CALL INIT_INT_ARRAY(1,N1,LSTRCLASS_1,I) CSYMBOL=STR_CLASSES(I)(1:1) ENDIF ENDDO DO I=1,N1 STRCLASS_1(I:I)=CSYMBOL ENDDO CALL INIT_INT_ARRAY(1,N1,LACC_1,1) ENDIF IF (LNORM_PROFILE) THEN MAPLOW=0.0 MAPHIGH=0.0 ELSE IF (LPROFILE_1 .AND. SMIN_ANSWER .EQ. 'PROFILE' ) THEN SMIN=XSMIN SMAX=XSMAX MAPLOW=XMAPLOW MAPHIGH=XMAPHIGH ELSE IF (LPROFILE_1 .AND. SMIN_ANSWER .NE. 'IGNORE' ) THEN MAPLOW=XMAPLOW MAPHIGH=XMAPHIGH ENDIF ENDIF c======================================================================= c fill simmetric if no profiles IF (.NOT. LPROFILE_1 .AND. (LPROFILE_2 .OR. LPROF_2)) THEN WRITE(6,*)'no metric for sequence 1' ELSE IF (METRICFILE .NE. 'PROFILE' ) THEN CALL GETSIMMETRIC(NTRANS,TRANS,MAXSTRSTATES, + MAXIOSTATES,NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2, + NIOSTATES_2,CSTRSTATES,CIOSTATES,IORANGE, + KSIM,METRICFILE,SIMORG) CALL SCALEMETRIC(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + SIMORG,SMIN,SMAX,MAPLOW,MAPHIGH) CALL ACC_TO_INT(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_1,NIOSTATES_1,IORANGE, + N1,LSQ_1,LSTRCLASS_1,NSURF_1,LACC_1) IF (NSTRSTATES_2 .GT. 1 .OR. NIOSTATES_2 .GT. 1) THEN WRITE(6,*)' INFO: profile mode set to 6' PROFILEMODE=6 ELSE CALL FILLSIMMETRIC(MAXSQ,NTRANS,MAXSTRSTATES, + MAXIOSTATES,NSTRSTATES_1,NSTRSTATES_2, + CSTRSTATES,SIMORG,N1,LSQ_1,LSTRCLASS_1,LACC_1, + SIMMETRIC_1) IF (LNORM_PROFILE) THEN WRITE(6,*) + ' NORM_PROFILE for profile 1 not working yet' STOP ELSE WRITE(6,'(a,4(2x,f5.2)))')' CALL SCALE_PROFILE 1', + SMIN,SMAX,MAPLOW,MAPHIGH CALL SCALE_PROFILE_METRIC(MAXSQ,NTRANS,TRANS, + SIMMETRIC_1,SMIN,SMAX,MAPLOW,MAPHIGH) ENDIF ENDIF C scale profile_1 ELSE WRITE(6,*)' call scale_profile disabled' c WRITE(6,'(a,4(2x,f5.2)))')' CALL SCALE_PROFILE 1', c + smin,smax,maplow,maphigh c call scale_profile_metric(maxsq,ntrans,trans, c + simmetric_1,smin,smax,maplow,maphigh) ENDIF ENDIF C-------------------------------------------------------------------- IF (LONG_OUT) THEN CALL OPEN_FILE(KLONG,LONGFILE,'NEW,RECL=200',LERROR) C header to long output file WRITE(KLONG,'(A)') + '======================================'// + '============ MAXHOM-LONG =================='// + '===================================' CALL STRPOS(NAME_1,IBEG,IEND) WRITE(KLONG,'(A,A)')' test sequence : ', + name_1(ibeg:iend) IF (LISTOFSEQ_2) THEN WRITE(KLONG,'(A,A)')' list name : ', + listfile_2(1:50) ENDIF CALL STRPOS(NAME_2,IBEG,IEND) WRITE(klong,'(a,a)')' last name was : ', + name_2(ibeg:iend) WRITE(klong,'(a,i4)')' alignments : ',nbest WRITE(klong,'(a,a)')' sort-mode : ',csortmode WRITE(klong,'(a,a)')' weights 1 : ', + weight1_answer(1:40) WRITE(klong,'(a,a)')' weights 2 : ', + weight2_answer(1:40) WRITE(klong,'(a,f5.2)')' smin : ',smin WRITE(klong,'(a,f5.2)')' smax : ',smax WRITE(klong,'(a,f5.2)')' maplow : ',maplow WRITE(klong,'(a,f5.2)')' maphigh : ',maphigh WRITE(klong,'(a,a)')' gap_open : ', + openweight_answer(1:40) WRITE(klong,'(a,a)')' gap_elongation : ', + elongweight_answer (1:40) WRITE(klong,'(a,a)')' INDEL in sec-struc of SEQ 1: ', + indel_answer_1(1:40) WRITE(klong,'(a,a)')' INDEL in sec-struc of SEQ 2: ', + indel_answer_2(1:40) WRITE(klong,'(a,i4)')' NBEST alignments : ',nbest WRITE(klong,'(a,a)')' secondary structure alignment: ', + struc_align_answer(1:40) WRITE(klong,'(a)') + '========================================='// + '===================================================' ENDIF C-------------------------------------------------------------------- C HERE START SECOND PASS AFTER CALCULATING CONSERVATION WEIGHTS C-------------------------------------------------------------------- C initialize before comparisons: C total number of alignments for all sequences C record pointer for binary tempfile C number of alignments above threshold (cons-weight) C number of dssp files compared C C 200 IRECORD=0 IALIGNOLD=0 IDSSP=0 NRECORD=0 IALIGN_GOOD=0 NALIGN=0 IF (LSWISSBASE .OR. LNRDBBASE) THEN NFILE=0 CALL CONCAT_STRINGS(SPLIT_DB_PATH,SPLIT_DB_INDEX, + TEMPNAME) CALL CONCAT_STRING_INT( + 'OLD,DIRECT,FORMATTED,READONLY,RECL=',INDEXRECLEN,LINE) CALL OPEN_FILE(KINDEX,TEMPNAME,LINE,LERROR) READ(KINDEX,'(A6,I6,I12,I12)',REC=1)TEMPSTRING,NFILE, + NENTRIES,NAMINO_ACIDS IF (INDEX(TEMPSTRING,'BINARY') .NE. 0) THEN LBINARY=.TRUE. ELSE LBINARY=.FALSE. ENDIF c read(kindex,'(i6,i12,i12)',rec=1)nfile,nentries,namino_acids CLOSE(KINDEX) Caution cdebug c nfile=10 WRITE(6,'(A,A,I4,I12,I12)')TEMPSTRING(1:6),' nfile: ', + NFILE,NENTRIES,NAMINO_ACIDS IF ( NENTRIES .GT. MAXALIGNS) THEN WRITE(LOGSTRING,*)' *** FATAL ERROR: database '// + 'contains more entries than MAXALIGNS' CALL LOG_FILE(KLOG,LOGSTRING,1) STOP ENDIF ELSE IF (LISTOFSEQ_2) THEN CALL CONCAT_STRINGS('comparison is with a list of '// + 'sequences. /n Sequence names are in: ',listfile_2, + LOGSTRING) CALL LOG_FILE(KLOG,LOGSTRING,1) CALL OPEN_FILE(KLIS2,LISTFILE_2,'OLD,READONLY',LERROR) ENDIF LDSSP_2=.FALSE. N2IN=0 HEADER_2=' ' COMPND_2=' ' AUTHOR_2=' ' SOURCE_2=' ' CSYMBOL=' ' CSQ_2=' ' DO I=1,MAXSQ CRESID(I)=' ' PDBNO_2(I)=0 BP1_2(I)=0 BP2_2(I)=0 LACC_2(I)=0 LSTRUC_2(I)=0 CONSWEIGHT_2(I)=1.0 ENDDO CALL INIT_CHAR_ARRAY(1,MAXSQ,COLS_2,CSYMBOL) CALL INIT_CHAR_ARRAY(1,MAXSQ,CHAINID_2,CSYMBOL) CALL INIT_CHAR_ARRAY(1,MAXSQ,SHEETLABEL_2,CSYMBOL) CALL INIT_CHAR_ARRAY(1,MAXSQ,STRUC_2,CSYMBOL) CALL CONCAT_STRINGS(COREFILE,JOB_ID,CORETEMP) TEMPNAME=CORETEMP IF (COREPATH .NE. ' ') THEN CALL CONCAT_STRINGS(COREPATH,TEMPNAME,CORETEMP) ENDIF c corefile=coretemp WRITE(FILE_OPTION(1:),'(A,I6)') + 'UNFORMATTED,DIRECT,NEW,RECL=',maxrecordlen c WRITE(6,*)'master: ',coretemp(1:60) c call flush_unit(6) CALL OPEN_FILE(KCORE,CORETEMP,FILE_OPTION,LERROR) WRITE(6,*)'***********************************************' WRITE(6,*)' working...' WRITE(6,*)'***********************************************' C======================================================================= C PARALLEL VIA MESSAGE PASSING C======================================================================= IF (LPARALLEL) THEN IF (LSWISSBASE .OR. LNRDBBASE .OR. LISTOFSEQ_2 ) THEN CALL MAXHOM_PARALLEL_INTERFACE(LH1,LH2,NFILE,NALIGN, + NENTRIES,NAMINO_ACIDS) ELSE WRITE(6,*)' not implemented yet' STOP ENDIF CLOSE(KCORE) C====================================================================== C NOT PARALLEL: if only 1 processor C====================================================================== ELSE CALL GET_CPU_TIME('init phase:', + IDPROC,ITIME_OLD,ITIME_NEW,TOTAL_TIME,LOGSTRING) CALL LOG_FILE(KLOG,LOGSTRING,2) IF (LSWISSBASE) THEN DO IFILE=1,NFILE CALL OPEN_SW_DATA_FILE(KBASE,LBINARY,IFILE, + SPLIT_DB_DATA,SPLIT_DB_PATH,HOSTNAME) LENDFILE=.FALSE. DO WHILE (.NOT. LENDFILE) CALL GET_SWISS_ENTRY(MAXSQ,KBASE,LBINARY,N2IN, + NAME_2, + COMPND_2,ACCESSION_2,PDBREF_2,CSQ_2,LENDFILE) IF (.NOT. LENDFILE) THEN CALL DO_ALIGN(LH1,LH2,IDPROC, + NALIGN,NRECORD,SDEV) ENDIF ENDDO CLOSE(KBASE) WRITE(6,'(A,I4)')'files done: ',ifile ENDDO CALL GET_CPU_TIME('database scan phase:',idproc, + ITIME_OLD,ITIME_NEW,TOTAL_TIME,LOGSTRING) CALL LOG_FILE(KLOG,LOGSTRING,2) C======================================================================= ELSE IF (LFASTA_DB) THEN LHSSP_LONG_ID=.TRUE. CALL OPEN_FILE(KBASE,NAME2_ANSWER,'OLD,READONLY',LERROR) LENDFILE=.FALSE. DO WHILE (.NOT. LENDFILE) CALL GET_FASTA_DB_ENTRY(MAXSQ,KBASE,N2IN,NAME_2, + COMPND_2,ACCESSION_2,PDBREF_2,CSQ_2,LENDFILE) IF (.NOT. LENDFILE) THEN CALL DO_ALIGN(LH1,LH2,IDPROC,NALIGN,NRECORD,SDEV) ENDIF ENDDO CLOSE(KBASE) CALL GET_CPU_TIME('database scan phase:',idproc, + ITIME_OLD,ITIME_NEW,TOTAL_TIME,LOGSTRING) CALL LOG_FILE(KLOG,LOGSTRING,2) IF (LPASS2) THEN LPASS2=.FALSE. NAME_2=LISTFILE_2 LOGSTRING='*** START NOW THE SECOND PASS ***' CALL LOG_FILE(KLOG,LOGSTRING,1) GOTO 200 ENDIF C======================================================================= ELSE LENDFILE=.FALSE. DO WHILE (.NOT. LENDFILE) IF (LISTOFSEQ_2) THEN CALL READ_FILENAME(KLIS2,FILESEQ,LENDFILE,LERROR) IF (.NOT. LENDFILE .AND. .NOT. LERROR) THEN CALL STRPOS(FILESEQ,IBEG,IEND) WRITE(6,*)'file:',fileseq(ibeg:iend) NAME_2=FILESEQ(IBEG:IEND) ENDIF ENDIF IF (.NOT. LENDFILE) THEN IF (.NOT. LISTOFSEQ_2) LENDFILE=.TRUE. CALL CHECKFORMAT(KGETSEQ,NAME_2,SEQFORMAT,LERROR) IF (INDEX(SEQFORMAT,'PROFILE').NE.0) + LPROFILE_2=.TRUE. IF (INDEX(SEQFORMAT,'DSSP' ).NE.0) + LDSSP_2=.TRUE. IF (LPROFILE_2) THEN WRITE(LOGSTRING,'(A,A)')'read PROFILE 2: ', + NAME_2 CALL LOG_FILE(KLOG,LOGSTRING,1) CALL READPROFILE(KPROF,NAME_2,MAXSQ,NTRANS, + TRANS,LDSSP_2,N2IN,NCHAIN,HSSPID_2, + HEADER_2,COMPND_2,SOURCE_2,AUTHOR_2, + XSMIN,XSMAX,XMAPLOW,XMAPHIGH, + PROFILEMETRIC,PDBNO_2,CHAINID_2, + CSQ_2_ARRAY,STRUC_2,NSURF_2,COLS_2, + SHEETLABEL_2,BP1_2,BP2_2,NOCC_2, + GAPOPEN_2,GAPELONG_2,CONSWEIGHT_2, + SIMMETRIC_2,MAXBOX,NBOX_2,PROFILEBOX_2) DO I=1,N2IN CSQ_2(I:I)=CSQ_2_ARRAY(I) ENDDO caution C cstrstates,simorg and lsq_2 not known here C pass simorg and set lsq_2 IF (METRICFILE .NE. 'PROFILE') THEN WRITE(6,*)' option not possible, ask rs?' STOP c call fillsimmetric(maxsq,ntrans,maxstrstates, c + nstrstates_1,cstrstates,simorg, c + n2in,lsq_2,lstrclass_2,lacc_2,simmetric_2) ENDIF IF (SMIN_ANSWER .EQ. 'PROFILE') THEN SMIN=XSMIN SMAX=XSMAX MAPLOW=XMAPLOW MAPHIGH=XMAPHIGH ELSE IF (LPROFILE_2 .AND. SMIN_ANSWER .NE. + 'PROFILE') THEN MAPLOW=XMAPLOW MAPHIGH=XMAPHIGH ENDIF IF (OPENWEIGHT_ANSWER .NE. 'PROFILE') THEN DO I=1,MAXSQ GAPOPEN_2(I)=OPEN_1 ENDDO ENDIF IF (ELONGWEIGHT_ANSWER .NE. 'PROFILE') THEN DO I=1,MAXSQ GAPELONG_2(I)=ELONG_1 ENDDO ENDIF C reset conservation weights for sequence 2 if not wanted IF (.NOT. LCONSERV_2 ) THEN DO I=1,MAXSQ CONSWEIGHT_2(I)=1.0 ENDDO ENDIF IF (LNORM_PROFILE) THEN WRITE(6,*)'CALL NORM_PROFILE ' SMIN=0.0 SMAX=0.0 MAPLOW=0.0 MAPHIGH=0.0 CALL NORM_PROFILE(MAXSQ,NTRANS,TRANS,N2IN, + N1,LSQ_1,SIMMETRIC_2,PROFILE_EPSILON, + PROFILE_GAMMA,SMIN,SMAX,MAPLOW, + MAPHIGH,GAPOPEN_2,GAPELONG_2,SDEV) ELSE WRITE(6,*)' call scale_profile disabled' c WRITE(6,'(a,4(2x,f5.2)))')'CALL SCALE_PROFILE 2', c + smin,smax,maplow,maphigh c call scale_profile_metric(maxsq,ntrans,trans, c + simmetric_2,smin,smax,maplow,maphigh) ENDIF ELSE C all chains wanted from dssp data set CALL CHECKFORMAT(KGETSEQ,NAME_2,SEQFORMAT, + LERROR) IF (INDEX(SEQFORMAT,'DSSP') .NE. 0 .OR. + INDEX(SEQFORMAT,'PROFILE-DSSP') .NE.0) THEN LDSSP_2=.TRUE. ENDIF KCHAIN=0 TEMPNAME=' ' I=INDEX(NAME_2,'!') IF (I .GT. 0) THEN TEMPNAME(1:)=NAME_2(1:I-2) CTEMP(1:)=NAME_2(I+2:) ELSE TEMPNAME(1:)=NAME_2(1:) CTEMP=' ' ENDIF PDBREFLINE=' ' IF (LDSSP_2 .EQV. .FALSE.) THEN CALL GET_SEQ(KGETSEQ,TEMPNAME,TRANS,CTEMP, + COMPND_2,ACCESSION_2,PDBREFLINE,PDBNO_2, + N2IN,CSQ_2,STRUC_2_STRING,NSURF_2, + LTRUNCATED,LERROR) C convert cresid to pdb-number and chain identifier, used in 3d superposition C cresid from getseq is : C "1234AB" (number, alternate residue, chain identifier) C here skip alternate residue and append chain_id IF ( STRUC_2_STRING .EQ. ' ') THEN DO I=1,N2IN STRUC_2(I)='U' ENDDO ELSE DO I=1,N2IN STRUC_2(I)=STRUC_2_STRING(I:I) ENDDO ENDIF DO I=1,N2IN CSQ_2_ARRAY(I)=CSQ_2(I:I) READ(CRESID(I),'(I4,1X,A)') + PDBNO_2(I),CHAINID_2(I) ENDDO ELSE C ALL CHAINS WANTED FROM DSSP DATA SET K=0 CHAINREMARK=' ' TEMPNAME(1:)=NAME_2 I=INDEX(TEMPNAME,'!') IF (I .GT. 0) THEN KSELECT=1 IEND=LEN(TEMPNAME) DO J=IEND,I+1,-1 IF (TEMPNAME(J:J) .EQ. ',') + KSELECT=KSELECT+1 ENDDO c WRITE(6,*)' use ',kselect,' chain(s) ', c + tempname(i:) CHAINREMARK(1:)=TEMPNAME TEMPNAME(1:)=NAME_2(1:I-2) ELSE CALL SELECT_UNIQUE_CHAIN(KGETSEQ, + TEMPNAME,LINE) ENDIF J=1 CALL GETDSSPFORHSSP(KGETSEQ,TEMPNAME, + MAXSQ,CHAINREMARK, + BRKID_2,HEADER_2,COMPND_2,SOURCE_2, + AUTHOR_2,N2IN,I,J,K,PDBNO_2, + CHAINID_2,CSQ_2_ARRAY,STRUC_2,COLS_2, + BP1_2,BP2_2,SHEETLABEL_2,NSURF_2) DO I=1,N2IN CSQ_2(I:I)=CSQ_2_ARRAY(I) ENDDO c call getpidcode(name_2,pdbref_2) ENDIF CALL SELECT_PDB_POINTER(KREF,DSSP_PATH, + PDBREFLINE,PDBREF_2) ENDIF CALL DO_ALIGN(LH1,LH2,IDPROC,NALIGN,NRECORD,SDEV) ENDIF ENDDO IF (LPASS2) THEN LPASS2=.FALSE. REWIND(KLIS2) NAME_2=LISTFILE_2 LOGSTRING= + '******* START NOW THE SECOND PASS *********' CALL LOG_FILE(KLOG,LOGSTRING,1) GOTO 200 ENDIF IF (LISTOFSEQ_2) THEN CLOSE(KLIS2) ENDIF ENDIF CLOSE(KCORE) C END : NOT LPARALLEL ENDIF C======================================================================= C ONLY HOST IS DOING THE REST C======================================================================= NSELECT=NALIGN-(MIN(NALIGN,NGLOBALHITS))+1 C======================================================================= C QSORT for globally best alignments C======================================================================= CAUTION: also activate checkval in loop over selected ali below WRITE(6,*)' calculate ZSCORE' c call open_file(kdeb,'DEBUG.X','NEW',lerror) c call profilenormal(kdeb,len2_orig, c + alisortkey_global,nalign,z_order, c + zscore_temp,lerror) c close(kdeb) IF (NALIGN .GT. 10) THEN CALL MOMENT(ALISORTKEY,NALIGN,AVE,ADEV,SDEV,VAR,SKEW, + CURT) DO I=1,NALIGN ZSCORE_TEMP(I)=(ALISORTKEY(I) - AVE) / SDEV ENDDO ELSE WRITE(LOGSTRING,*)'NOT enough data for ZSCORE calculation' CALL LOG_FILE(KLOG,LOGSTRING,1) ENDIF IF (CSORTMODE .EQ. 'ZSCORE' ) THEN WRITE(6,*) 'ENTER GLOBAL QSORT for ZSCORE: SIZE',nalign CALL MAXHOM_QSORT(NALIGN,IRECPOI,IFILEPOI, + ZSCORE_TEMP) ELSE WRITE(6,*) 'ENTER GLOBAL QSORT: SIZE',nalign CALL MAXHOM_QSORT(NALIGN,IRECPOI,IFILEPOI, + ALISORTKEY) ENDIF C======================================================================= IF (NALIGN .GT. 0) THEN c IF (lcompstr) THEN c call open_file(kstruc,'struc.data','UNKNOWN,APPEND',lerror) c endif C======================================================================= C WRITE simple histogram of scores C======================================================================= c IF (lswissbase .or. lnrdbbase) THEN c call concat_strings(hsspid_1,'.histo',histofile) c call WRITE_histo(khisto,histofile,nalign,alisortkey) c endif C======================================================================= C alis stored in 1.....NALIGN C <.....all.......> C L ------> index of all ALXX C======================================================================= C if only one file C open datafile(s) for homogenous enviroment WRITE(FILE_OPTION(1:),'(A,I6)') + 'UNFORMATTED,DIRECT,OLD,RECL=',maxrecordlen CALL OPEN_FILE(KCORE,CORETEMP,FILE_OPTION,LERROR) IF (NWORKSET .GT. 0) THEN C================================================================== C send alignment requests to node C================================================================== c else LALI=NALIGN+1 IAL=0 WRITE(6,*)' send alignment requests...' DO WHILE(LALI .GE. (NALIGN - IALIGN_GOOD+2) ) LALI=LALI-1 IAL=IAL+1 IRECORD=IRECPOI(LALI) IF (IRECORD .LE. 0) THEN WRITE(6,*)' uuuppps, irecord .le. 0 ' LCONSIDER=.FALSE. ELSE IWORKER=IFILEPOI(LALI) IF (IWORKER .NE. ID_HOST) THEN IF (CSORTMODE .EQ. 'ZSCORE') THEN c zscore(ial)=zscore_temp(lali) CHECKVAL=ZSCORE_TEMP(LALI) ELSE c zscore(ial)=zscore_temp(ial) CHECKVAL=ALISORTKEY(LALI) ENDIF c WRITE(6,*)' send request ',iworker,irecord,ial CALL SEND_ALI_REQUEST(IWORKER,IRECORD,IAL, + CHECKVAL) ENDIF ENDIF ENDDO ENDIF C================================================================== C collect alignment from node (message tag is ali-number) C================================================================== LALI=NALIGN+1 IAL=0 ISEQPOS=1 ISTRPOS=1 IOPOS=1 INSPOS=1 LCONSIDER=.TRUE. LBUFFEROVERFLOW=.FALSE. WRITE(6,*)' collecting best alignments...' DO WHILE ( (LALI .GE. (NALIGN - IALIGN_GOOD+2)) .AND. + (LBUFFEROVERFLOW .EQV. .FALSE.) .AND. + (IAL+1 .LE. MAXHITS) ) LALI=LALI-1 IAL=IAL+1 IWORKER=IFILEPOI(LALI) IF (IWORKER .GE. 0) THEN IF (IWORKER .NE. ID_HOST) THEN c WRITE(6,*)' node alignment ',iworker,ial CALL GETALIGN_FROM_WORKER(IWORKER,IAL, + IFIR,LEN1,LENOCC,JFIR,JLAS,IDEL,NDEL, + VALUE,RMS,HOM,SIM,SDEV, + DISTANCE) ELSE IF (CSORTMODE .EQ. 'ZSCORE') THEN c zscore(ial)=zscore_temp(lali) CHECKVAL=ZSCORE_TEMP(LALI) ELSE c zscore(ial)=zscore_temp(ial) CHECKVAL=ALISORTKEY(LALI) ENDIF IRECORD=IRECPOI(LALI) c WRITE(6,*)' host alignment ',irecord,checkval CALL GETALIGN(KCORE,IRECORD,IFIR,LEN1,LENOCC,JFIR, + JLAS,IDEL,NDEL,VALUE, + RMS,HOM,SIM,SDEV,DISTANCE,CHECKVAL) ENDIF IF (CSORTMODE .EQ. 'ZSCORE') THEN ZSCORE(IAL)=ZSCORE_TEMP(LALI) ELSE ZSCORE(IAL)=ZSCORE_TEMP(IAL) ENDIF c store alignment pointers, identity, similarity.... C length of alignment in HSSP-output is number of occupied position C length without insertions deletions AL_IFIRST(IAL)=IFIR AL_ILAST(IAL)=IFIR+LEN1-1 AL_LEN(IAL)=LEN1 AL_RMS(IAL)=RMS AL_JFIRST(IAL)=JFIR+NSHIFTED AL_JLAST(IAL)=JLAS+NSHIFTED AL_VPERRES(IAL)=VALUE/LENOCC AL_VAL(IAL)=VALUE AL_SDEV(IAL)=SDEV AL_HOM(IAL)=HOM AL_SIM(IAL)=SIM AL_HOMLEN(IAL)=LENOCC AL_LSEQ_2(IAL)=N2IN AL_LGAP(IAL)=IDEL AL_NGAP(IAL)=NDEL AL_COMPOUND(IAL)=COMPND_2 AL_ACCESSION(IAL)=ACCESSION_2 AL_PDB_POINTER(IAL)=PDBREF_2 CALL GETPIDCODE(NAME_2,AL_EMBLPID(IAL)) TEMPNAME=NAME_2 I=INDEX(TEMPNAME,'!')-1 IF ( I .GT. 0) THEN CALL CONCAT_STRINGS(AL_EMBLPID(IAL), + TEMPNAME(I+2:),LINE) AL_EMBLPID(IAL)=LINE(1:LEN(AL_EMBLPID(IAL))) ENDIF C store alignments in buffer SEQBUFFER IF (ISEQPOS+LEN1+1 .LE. MAXSEQBUFFER) THEN DO K=1,LEN1 SEQBUFFER(ISEQPOS+K-1)=AL_2(K:K) ENDDO ISEQPOINTER(IAL)=ISEQPOS ISEQPOS=ISEQPOS+LEN1+1 SEQBUFFER(ISEQPOS)='/' ELSE WRITE(LOGSTRING,*)' MAXSEQBUFFER buffer overflow' CALL LOG_FILE(KLOG,LOGSTRING,1) LBUFFEROVERFLOW=.TRUE. ENDIF C store secondary structure and inside/outside in buffer STRBUFFER/CIOBUFFER ISTRPOINTER(IAL)=0 IOPOINTER(IAL)=0 IF (LDSSP_2) THEN IF (ISTRPOS+LEN1+1 .LE. MAXSTRBUFFER) THEN DO K=1,LEN1 STRBUFFER(ISTRPOS+K-1)=SAL_2(K:K) ENDDO ISTRPOINTER(IAL)=ISTRPOS ISTRPOS=ISTRPOS+LEN1+1 STRBUFFER(ISTRPOS)='/' ELSE WRITE(LOGSTRING,*) + ' MAXSTRBUFFER buffer overflow' CALL LOG_FILE(KLOG,LOGSTRING,1) LBUFFEROVERFLOW=.TRUE. ENDIF IF (IOPOS+LEN1+1 .LE. MAXIOBUFFER) THEN DO K=1,LEN1 WRITE(CIOBUFFER(IOPOS+K-1),'(A1)') + CIOSTATES(LACC_2(K):LACC_2(K)) IF (CIOSTATES(LACC_2(K):LACC_2(K)) .EQ. + ' ') THEN CIOBUFFER(IOPOS+K-1)='U' ELSE WRITE(CIOBUFFER(IOPOS+K-1),'(A1)') + CIOSTATES(LACC_2(K):LACC_2(K)) ENDIF ENDDO IOPOINTER(IAL)=IOPOS IOPOS=IOPOS+LEN1+1 CIOBUFFER(IOPOS)='/' ELSE WRITE(LOGSTRING,*)' MAXIOBUFFER buffer overflw' CALL LOG_FILE(KLOG,LOGSTRING,1) LBUFFEROVERFLOW=.TRUE. ENDIF ENDIF c accumulate predicted structure IF (LDSSP_2 ) THEN C convert to structure class, evaluate STRHOM and store in al_strhom IDSSP=IDSSP+1 IPOS=1 IAGR=0 DO I=IFIR,IFIR+LEN1-1 CALL STRUC_CLASS(MAXSTRSTATES,STR_CLASSES, + STRCLASS_1(I:I),CTEMP,ICLASS) CALL STRUC_CLASS(MAXSTRSTATES,STR_CLASSES, + SAL_2(IPOS:IPOS),CTEMP,JCLASS) IF (ICLASS .NE. 0 .AND. JCLASS .NE. 0) THEN STRSUM(JCLASS,I)= STRSUM(JCLASS,I) + VALUE IF (ICLASS .EQ. JCLASS ) IAGR=IAGR+1 ENDIF IPOS=IPOS+1 ENDDO C AL_STRHOM is fractional structure agreement for alignment L AL_STRHOM(IAL)=FLOAT(IAGR) / FLOAT(LEN1) ELSE AL_STRHOM(IAL)=-1.0 ENDIF C store insertions of seq2 in array INSBUFFER C Note: insertions are stored from trace in the following way: C C insertion 1. 2. 3. C <<<<==========aTIGHn====gHDFGt======eRTWQEp====<<<<< alignment C insseq: *aTIGHn*gHDFGt*eRTWQEp C here we loop from insertion number 3 to 1 and have to reverse C the string to get the right order C "aTIGHn" becomes now "nHGITa" C insbuffer: pRQWTRetGFDHgnHGITa C IF (IINS .GT. 0) THEN CALL STRPOS(INSSEQ,IBEG,IEND) IPOS=IEND IF (INSNUMBER + IINS .LE. MAXINS) THEN DO I=IINS,1,-1 IF (INSPOS+INSLEN_LOCAL(I)+2 .LE. + MAXINSBUFFER) THEN DO K=INSPOS+INSLEN_LOCAL(I)+1,INSPOS,-1 INSBUFFER(K)=INSSEQ(IPOS:IPOS) IPOS=IPOS-1 ENDDO INSNUMBER=INSNUMBER + 1 INSALI(INSNUMBER)= IAL INSPOINTER(INSNUMBER)=INSPOS INSBEG_1(INSNUMBER)=INSBEG_1_LOCAL(I) INSBEG_2(INSNUMBER)=INSBEG_2_LOCAL(I) INSLEN(INSNUMBER)=INSLEN_LOCAL(I) INSPOS=INSPOS+INSLEN_LOCAL(I)+2 IPOS =IPOS-1 ELSE WRITE(LOGSTRING,*) + ' BUFFER MAXINSBUFFER OVERFLOW' CALL LOG_FILE(KLOG,LOGSTRING,1) LBUFFEROVERFLOW=.TRUE. ENDIF ENDDO ELSE WRITE(6,*)' MAXINS overflow' LBUFFEROVERFLOW=.TRUE. ENDIF ENDIF c WRITE data for HSSP-PLOT c IF (lcompstr) THEN c WRITE(kstruc,'(1x,i4,4(2x,f7.2),2x,i4,2x,a,2x,a)') c + lenocc,hom*100.0,sim*100.0,al_strhom(ial)*100.0, c + rms,idel,hsspid_1,name_2(1:20) c endif C======================================================================= ENDIF C end loop over selected alignments ENDDO C close temporary files IF (NWORKSET .EQ. 0) THEN CLOSE(KCORE) ELSE MSGTYPE=6000 IRECORD=-1 CHECKVAL=0.0 DO ISET=1,NWORKSET CALL MP_INIT_SEND() CALL MP_PUT_INT4(MSGTYPE,LINK(ISET),IRECORD,N_ONE) CALL MP_PUT_INT4(MSGTYPE,LINK(ISET),IRECORD,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,LINK(ISET),CHECKVAL,N_ONE) CALL MP_SEND_DATA(MSGTYPE,LINK(ISET)) ENDDO ENDIF C======================================================================= IF (LBUFFEROVERFLOW) THEN CALL CONCAT_STRING_INT('WARNING: INTERNAL BUFFER '// + 'OVERFLOW /n SELECTED ALIGNMENTS: ',ial,logstring) CALL LOG_FILE(KLOG,LOGSTRING,1) ENDIF WRITE(LOGSTRING,*) + 'number of alignments above choosen threshold: ',ial CALL LOG_FILE(KLOG,LOGSTRING,1) C======================================================================= NALIGN=IAL c IF (lcompstr) THEN ; close(kstruc) ; endif C======================================================================= C PREDICT C======================================================================= LPREDICTION=(IDSSP .NE. 0) IF (LPREDICTION) THEN DO I=1,N1 STRMAX=0.0 PREDSTR(I)='-' DO K=1,MAXSTRSTATES IF (STRSUM(K,I) .GT. STRMAX) THEN STRMAX=STRSUM(K,I) PREDSTR(I)=CSTRSTATES(K:K) ENDIF ENDDO ENDDO IF (LDSSP_1 ) THEN CALL MARKALI(STRCLASS_1,PREDSTR,N1,PREDSTRCORR,'+') ENDIF ENDIF C======================================================================= C WRITE STRIP file C======================================================================= IF (LSTRIP) THEN C header to STRIPFILE CALL OPEN_FILE(KSTP,STRIPFILE,'NEW,RECL=10000',lerror) WRITE(KSTP,'(A)') + '======================================'// + '============ MAXHOM-STRIP =================='// + '===================================' CALL STRPOS(NAME_1,IBEG,IEND) WRITE(KSTP,'(A,A)')' test sequence : ', + name_1(ibeg:iend) IF (LISTOFSEQ_2) THEN WRITE(KSTP,'(A,A)')' list name : ', + listfile_2(1:50) ENDIF CALL STRPOS(NAME_2,IBEG,IEND) WRITE(kstp,'(a,a)')' last name was : ', + name_2(ibeg:iend) WRITE(kstp,'(a,i6)')' seq_length : ',n1 WRITE(kstp,'(a,i4)')' alignments : ',nalign WRITE(kstp,'(a,a)')' sort-mode : ',csortmode WRITE(kstp,'(a,a)')' weights 1 : ', + weight1_answer(1:40) WRITE(kstp,'(a,a)')' weights 2 : ', + weight2_answer(1:40) WRITE(kstp,'(a,f5.2)')' smin : ',smin WRITE(kstp,'(a,f5.2)')' smax : ',smax WRITE(kstp,'(a,f5.2)')' maplow : ',maplow WRITE(kstp,'(a,f5.2)')' maphigh : ',maphigh WRITE(kstp,'(a,f5.2)')' epsilon : ', + profile_epsilon WRITE(kstp,'(a,f5.2)')' gamma : ', + profile_gamma WRITE(kstp,'(a,a)')' gap_open : ', + openweight_answer(1:40) WRITE(kstp,'(a,a)')' gap_elongation : ', + elongweight_answer(1:40) WRITE(kstp,'(a,a)')' INDEL in sec-struc of SEQ 1: ', + indel_answer_1(1:40) WRITE(kstp,'(a,a)')' INDEL in sec-struc of SEQ 2: ', + indel_answer_2(1:40) WRITE(kstp,'(a,i4)')' NBEST alignments : ',nbest WRITE(kstp,'(a,a)')' secondary structure alignment: ', + struc_align_answer(1:40) C list of best alignments to stripfile WRITE(kstp,'(a)') + '======================================'// + '============= SUMMARY ========================='// + '==================================' WRITE(kstp,'(a)') + ' IAL VAL LEN IDEL NDEL ZSCORE '// + '%IDEN STRHOM LEN2 RMS SIGMA NAME' DO IAL=1,NALIGN CALL STRPOS(AL_COMPOUND(IAL),IBEG,IEND) WRITE(KSTP, + '(i4,1x,f7.2,3(1x,i4),3(2x,f6.2),i6,f6.2,f6.3,2(1x,a))') + IAL,AL_VAL(IAL),AL_HOMLEN(IAL),AL_LGAP(IAL), + AL_NGAP(IAL),ZSCORE(IAL),AL_HOM(IAL), + AL_STRHOM(IAL),AL_LSEQ_2(IAL),AL_RMS(IAL), + AL_SDEV(IAL),AL_EMBLPID(IAL), + AL_COMPOUND(IAL)(IBEG:IEND) ENDDO c loop over linelen IF (LSTRIP_LONG .EQV. .TRUE.) THEN LINELEN=10000 ELSE LINELEN=50 ENDIF IPOS=1 JPOS=MIN(IPOS+LINELEN,N1) DO WHILE (IPOS .LE. N1+1) WRITE(KSTP,'(A)') + '===================================='// + '============ ALIGNMENTS ========================'// + '===================================' WRITE(KSTP,*) C WRITE ruler, sequence, secondary structure ,predicted secondary structure C and agreement for sequence 1 WRITE(KSTP,'(I4,A,I4,15X,A)')IPOS,' -',JPOS, + '....:....1....:....2....:....3....:....4....:....5' WRITE(KSTP,'(6X,A,15X,A)') + HSSPID_1(1:4),CSQ_1(IPOS:JPOS) DO I=IPOS,JPOS CTMPCHAR_1(I:I)=STRUC_1(I) ENDDO WRITE(KSTP,'(25X,A)')CTMPCHAR_1(IPOS:JPOS) IF (LDSSP_1) THEN DO I=IPOS,JPOS IF (CIOSTATES(LACC_1(I):LACC_1(I)).EQ.' ') THEN CTMPCHAR_1(I:I)='u' ELSE WRITE(CTMPCHAR_1(I:I),'(A1)') + CIOSTATES(LACC_1(I):LACC_1(I)) ENDIF ENDDO WRITE(KSTP,'(25X,A)')CTMPCHAR_1(IPOS:JPOS) ENDIF IF (LPREDICTION) THEN DO I=IPOS,JPOS CTMPCHAR_1(I:I)=PREDSTR(I) ENDDO WRITE(KSTP,'(25X,A)')CTMPCHAR_1(IPOS:JPOS) DO I=IPOS,JPOS CTMPCHAR_1(I:I)=PREDSTRCORR(I) ENDDO WRITE(KSTP,'(25X,A)')CTMPCHAR_1(IPOS:JPOS) ENDIF WRITE(KSTP,*) C loop over alignments DO IAL=1,NALIGN c check if overlap at actual sequence 1 position LCONSIDER=.TRUE. ILAS=AL_IFIRST(IAL) + AL_LEN(IAL)-1 IF (ILAS .LT. IPOS .OR. + JPOS .LT. AL_IFIRST(IAL) ) THEN LCONSIDER=.FALSE. ENDIF C IF OVERLAP AND THRESHOLD OK FILL OUTPUT LINES IF (LCONSIDER) THEN C MARK SEQ-IDENTITIES IN EXTRA LINE ISEQPOS=ISEQPOINTER(IAL) STRIPLINE(1)=' ' J=AL_IFIRST(IAL) DO I=1,AL_LEN(IAL) CTEMP=SEQBUFFER(ISEQPOS) AL_2(I:I)=CTEMP CALL LOWTOUP(CTEMP,1) IF (CSQ_1(J:J) .GE. 'a' .AND. + CSQ_1(J:J) .LE. 'z') THEN CTEMP2='C' ELSE CTEMP2=CSQ_1(J:J) ENDIF IF (CTEMP2 .EQ. CTEMP) THEN STRIPLINE(1)(J:J)=CTEMP2 ENDIF ISEQPOS=ISEQPOS+1 J=J+1 ENDDO c WRITE alignend sequence and secondary-structure (if known) STRIPLINE(2)=' ' WRITE(STRIPLINE(2)(AL_IFIRST(IAL):ILAS),'(A)') + AL_2(1:AL_LEN(IAL)) CALL STRPOS(STRIPLINE(2),I,IEND) J=MIN(IEND,JPOS) WRITE(KSTP,'(25X,A)')STRIPLINE(1)(IPOS:J) WRITE(KSTP,'(I4,A,1X,A10,1X,F7.2,1X,A)') + IAL,'.',AL_EMBLPID(IAL),AL_VAL(IAL), + STRIPLINE(2)(IPOS:J) IF (ISTRPOINTER(IAL) .NE. 0) THEN STRIPLINE(3)=' ' ISTRPOS=ISTRPOINTER(IAL) DO I=1,AL_LEN(IAL) SAL_2(I:I)=STRBUFFER(ISTRPOS) ISTRPOS=ISTRPOS+1 ENDDO WRITE(STRIPLINE(3)(AL_IFIRST(IAL):ILAS), + '(A)')SAL_2(1:AL_LEN(IAL)) WRITE(KSTP,'(25X,A)')STRIPLINE(3)(IPOS:J) ENDIF IF (IOPOINTER(IAL) .NE. 0) THEN STRIPLINE(4)=' ' IOPOS=IOPOINTER(IAL) DO I=AL_IFIRST(IAL),ILAS WRITE(STRIPLINE(4)(I:I),'(A)') + CIOBUFFER(IOPOS) IOPOS=IOPOS+1 ENDDO WRITE(KSTP,'(25X,A)')STRIPLINE(4)(IPOS:J) ENDIF ENDIF ENDDO c next block IPOS=IPOS+LINELEN JPOS=MIN(JPOS+LINELEN,N1) WRITE(KSTP,*) ENDDO WRITE(KSTP,'(A)') + '=================================================='// + '=================================================='// + '===================' C lstrip ENDIF C======================================================================= C HSSP-OUTPUT C======================================================================= IF (LHSSP) THEN WRITE(6,*)'CALL HSSP' CALL HSSP(NALIGN,CHAINREMARK) IF ( (L3WAY .EQV. .TRUE.).AND.(L3WAYDONE .EQV. .FALSE.) + .AND. (LPROFILE_1 .EQV. .FALSE.) ) THEN WEIGHT_MODE='EIGEN' SIGMA=0.0 BETA=1.0 CALL PREP_PROFILE(NALIGN,N1,WEIGHT_MODE,SIGMA,BETA) WRITE(6,*)' call scale_profile' CALL SCALE_PROFILE_METRIC(MAXSQ,NTRANS,TRANS, + SIMMETRIC_1,SMIN,SMAX,MAPLOW,MAPHIGH) ENDIF ENDIF C WRITE MAXHOM-PROFILE IF (LWRITEPROFILE ) THEN c IF (lWRITEprofile .and. (l3waydone .eqv. .true.) ) THEN WRITE(6,*)' CALL WRITEPROFILE' CALL WRITEPROFILE(KPROF,PROFILEOUT,MAXSQ,N1,NCHAINUSED, + HSSPID_1,HEADER_1,COMPND_1,SOURCE_1, + AUTHOR_1,SMIN,SMAX,MAPLOW,MAPHIGH, + METRICFILE,PDBNO_1,CHAINID_1,CSQ_1_ARRAY, + STRUC_1,NSURF_1,COLS_1,SHEETLABEL_1,BP1_1, + BP2_1,NOCC_1,GAPOPEN_1,GAPELONG_1, + CONSWEIGHT_1,SIMMETRIC_1,MAXBOX,NBOX_1, + PROFILEBOX_1,LDSSP_1) CALL CONCAT_STRINGS(HSSPID_1,'.xprism3',tempname) CALL OPEN_FILE(KPROF,TEMPNAME,'NEW,RECL=300',LERROR) DO I=1,N1 WRITE(KPROF,'(20(F8.3))')(SIMMETRIC_1(I,J),J=1,20) ENDDO CLOSE(KPROF) CALL CONCAT_STRINGS(HSSPID_1,'.stf',tempname) CALL OPEN_FILE(KPROF,TEMPNAME,'NEW,RECL=300',LERROR) WRITE(KPROF,*)'NAME field' WRITE(KPROF,*)'RANK 2' WRITE(kprof,*)'DIMENSIONS ',n1,' ',20 WRITE(kprof,*)'BOUNDS ',1,n1,1,20 WRITE(kprof,*)'SCALAR' WRITE(kprof,*)'ORDER ROW' WRITE(kprof,*)'DATA' DO I=1,N1 WRITE(KPROF,'(20(F8.3))')(SIMMETRIC_1(I,J),J=1,20) ENDDO CLOSE(KPROF) ENDIF c no more, wind up WRITE(6,*)' start prediction and strip' IF (LPREDICTION .AND. LSTRIP) THEN c 1. evaluate prediction STATFILE='collage-stat.data' CALL OPEN_FILE(KSTAT,STATFILE,'UNKNOWN,APPEND',LERROR) DO I=1,N1 CTMPCHAR_ARRAY_1(I)=STRCLASS_1(I:I) ENDDO CALL EVALPRED(HSSPID_1(1:4),'COLLAGE ',PREDSTR, + CTMPCHAR_ARRAY_1,N1,LDSSP_1,KSTP,KSTAT) CLOSE (KSTAT) c evaluate alignments c correlation CALL CORRELATION(AL_VAL,AL_STRHOM,NALIGN,CVALSTR) CALL CORRELATION(AL_VPERRES,AL_STRHOM,NALIGN,CPERRES) WRITE(6,'(a,2f7.3)')'seq/str correlation ', + cvalstr,cperres WRITE(kstp,'(a,f7.3)')'CVALSTR - correlation: '// + 'seq hom/struc hom: ',cvalstr WRITE(kstp,'(a,f7.3)')' CPERRES - correlation: '// + 'seq hom per res/struc hom',cperres C histogram c$$$ istep=nint( ( 1.0 / float(maxhist) ) + 0.5 ) c$$$ do i=1,maxhist ; do j=1,maxhist ;lhist(i,j)=0 ;enddo;enddo c$$$ do ial=nalign,1,-1 c$$$ i=nint(al_vperres(ial)/istep) c$$$ j=nint(al_strhom(ial)/istep) c$$$ i=min(i,maxhist) ; i=max(i,1) ; j=min(j,maxhist) c$$$ j=max(j,1) ; lhist(i,j)=lhist(i,j)+1 c$$$ enddo c$$$ WRITE(kstp,*) c$$$ WRITE(kstp,*)nalign,' events in', c$$$ + ' histogram VALPERRES(left/right) vs. AL_STRHOM(up/down)' c$$$ ipos=1 c$$$ do i=1,maxhist c$$$ WRITE(ctmpchar_1(ipos:),'(i5)')i ; ipos=ipos+5 c$$$ enddo c$$$ call strpos(ctmpchar_1,i,j) c$$$ WRITE(kstp,'(5x,a)')ctmpchar_1(:j) c$$$ do i=maxhist,1,-1 c$$$ WRITE(ctmpchar_1(1:),'(i5)')i ;ipos=6 c$$$ do j=1,maxhist c$$$ WRITE(ctmpchar_1(ipos:),'(i5)')lhist(j,i) c$$$ ipos=ipos+5 c$$$ enddo c$$$ call strpos(ctmpchar_1,ibeg,iend) c$$$ WRITE(kstp,'(a)')ctmpchar_1(:iend) c$$$ enddo C prediction ENDIF IF (LSTRIP) CLOSE(KSTP) C======================================================================= C if no alignments WRITE hssp header and process next seq ELSE CALL CONCAT_STRINGS('*** WARNING: no alignments for: ', + name_1,logstring) CALL LOG_FILE(KLOG,LOGSTRING,1) CALL OPEN_FILE(KWARN,WARNFILE,'UNKNOWN,APPEND',LERROR) CALL LOG_FILE(KWARN,LOGSTRING,0) CLOSE(KWARN) IF (LHSSP) THEN CALL HSSP(NALIGN,CHAINREMARK) ENDIF ENDIF C clean up the local binary files c IF (nworkset .eq. 0) THEN c call concat_int_string(nworkset,corefile,coretemp) c endif c WRITE(6,*)' clean up: ',kcore,coretemp CALL DEL_OLDFILE(KCORE,CORETEMP) C======================================================================= C PARALLEL C if idproc not host: C 1.) receive start signal and data from host C 2.) call the DO_ALIGN routine C 3.) send results back to host C======================================================================= ELSE IF (LPARALLEL .EQV. .TRUE.) THEN CALL NODE_INTERFACE(LH1,LH2) c close(klog) CALL MP_LEAVE STOP C end idproc .eq. host ENDIF C======================================================================= CALL GET_CPU_TIME('time after output:',idproc, + ITIME_OLD,ITIME_NEW,TOTAL_TIME,LOGSTRING) CALL LOG_FILE(KLOG,LOGSTRING,2) LAGAIN=.FALSE. 900 IF ( (L3WAY .EQV. .TRUE.) .AND. (L3WAYDONE .EQV. .FALSE.) ) THEN L3WAYDONE=.TRUE. NAME_1=PROFILEOUT METRICFILE='PROFILE' LPROFILE_1=.TRUE. LCONSIMPORT=.TRUE. LCONSERV_1=.TRUE. LPASS2=.FALSE. SMIN_ANSWER='PROFILE' SMAX_ANSWER='PROFILE' OPENWEIGHT_ANSWER='PROFILE' ELONGWEIGHT_ANSWER='PROFILE' LAGAIN=.TRUE. ENDIF IF (LISTOFSEQ_1 .EQV. .TRUE.) THEN CALL DEL_OLDFILE(KCORE,COREFILE) WRITE(6,*)'****************************************' LOGSTRING='*** START RUN FOR NEXT TEST SEQUENCE ***' CALL LOG_FILE(KLOG,LOGSTRING,1) WRITE(6,*)'****************************************' LAGAIN=.TRUE. ENDIF IF (LAGAIN .EQV. .TRUE.) GOTO 100 c1000 IF (listofseq_1 .eqv. .true.) THEN 1000 IF (LPARALLEL .EQV. .TRUE.) THEN N1=-999 CALL SEND_DATA_TO_NODE ENDIF CLOSE(KLIS1) c endif C======================================================================= IF ( IDPROC .EQ. ID_HOST) THEN CALL GET_CPU_TIME('time finish:',idproc, + ITIME_OLD,ITIME_NEW,TOTAL_TIME,LOGSTRING) CALL LOG_FILE(KLOG,LOGSTRING,2) CLOSE(KLOG) IF (LPARALLEL .EQV. .TRUE.) THEN CALL MP_LEAVE ENDIF WRITE(6,*)'Juuuppdiduu: MAXHOM normal termination' STOP ENDIF END C END MAXHOM C....................................................................... C....................................................................... C SUB acchistogram C$$$ SUBROUTINE ACCHISTOGRAM(A,B,ASTEP,BSTEP,NA,NB,LHIST,NHIST,MAXHIST) c$$$C accumulates one doublet A,B into counts LHIST(IA,IB) c$$$C total number of doublets is NHIST c$$$C LHIST initialized at first call c$$$c implicit none c$$$ integer maxhist,na,nb,nhist c$$$ integer lhist(maxhist,maxhist) c$$$ real a,astep,b,bstep c$$$ c$$$C internal c$$$ integer mhist,ia,ib c$$$ mhist=0 c$$$ IF (na.gt.maxhist) THEN c$$$ WRITE(6,*)'*** WARN NA.GT.MAXHIST' c$$$ na=maxhist c$$$ endif c$$$ IF (nb.gt.maxhist) THEN c$$$ WRITE(6,*)'*** WARN NB.GT.MAXHIST' c$$$ nb=maxhist c$$$ endif c$$$ IF (mhist.eq.0) THEN c$$$ do ia=1,na , do ib=1,nb c$$$ lhist(ia,ib)=0 c$$$ enddo, enddo c$$$ endif c$$$ mhist=mhist+1 , nhist=mhist c$$$ ia=nint(a/(astep+0.5)) ; ib=nint(b/(bstep+0.5)) c$$$ IF (ia.gt.na)ia=na ; IF (ia.lt.1)ia=1 c$$$ IF (ib.gt.nb)ib=nb ; IF (ib.lt.1)ib=1 c$$$ lhist(ia,ib)=lhist(ia,ib)+1 c$$$ RETURN c$$$ END C END ACCHISTOGRAM C....................................................................... C....................................................................... C SUB ADD_SEQ_TO_SEQBUFFER SUBROUTINE ADD_SEQ_TO_SEQBUFFER(MAXALIGNS,MAXSEQBUFFER,ADDPOS, + NALIGN,NEWSEQ,NEWSEQSTART,NEWSEQSTOP,NEWSEQNAME, + SEQBUFFER,ISEQPOINTER,AL_IFIRST,AL_ILAST,AL_JFIRST, + AL_JLAST,AL_LEN,AL_NGAP,AL_LGAP,AL_LSEQ_2, + AL_PDB_POINTER,AL_HOM,AL_SIM,AL_EXCLUDEFLAG,ACCESSION, + AL_EMBLPID,AL_COMPOUND) IMPLICIT NONE C Import INTEGER MAXALIGNS,MAXSEQBUFFER,ADDPOS, + NEWSEQSTART,NEWSEQSTOP CHARACTER NEWSEQ(*),NEWSEQNAME*(*) C Import / Export INTEGER NALIGN,ISEQPOINTER(MAXALIGNS) CHARACTER SEQBUFFER(MAXSEQBUFFER) C attributes of alignend sequences INTEGER AL_IFIRST(MAXALIGNS), AL_ILAST(MAXALIGNS), + AL_JFIRST(MAXALIGNS),AL_JLAST(MAXALIGNS), + AL_LEN(MAXALIGNS),AL_NGAP(MAXALIGNS), + AL_LGAP(MAXALIGNS),AL_LSEQ_2(MAXALIGNS) CHARACTER*12 AL_PDB_POINTER(MAXALIGNS) CHARACTER*12 ACCESSION(MAXALIGNS) CHARACTER*40 AL_EMBLPID(MAXALIGNS) CHARACTER*200 AL_COMPOUND(MAXALIGNS) CHARACTER AL_EXCLUDEFLAG(MAXALIGNS) REAL AL_HOM(MAXALIGNS),AL_SIM(MAXALIGNS) C Export C Internal INTEGER IALIGN,IPOS,I,LEN,NEWSEQLEN *----------------------------------------------------------------------* NEWSEQLEN = NEWSEQSTOP-NEWSEQSTART+1 IF ( ADDPOS .GT. NALIGN+1 ) THEN WRITE(6,'(A,I4)') ' cannot add after position ', nalign+1 RETURN ELSE IF ( ADDPOS .EQ. NALIGN+1 ) THEN ISEQPOINTER(ADDPOS) = + ISEQPOINTER(NALIGN) + AL_ILAST(NALIGN) - + AL_IFIRST(NALIGN)+1 ELSE IF ( ADDPOS .LE. NALIGN ) THEN I= ISEQPOINTER(NALIGN)+AL_ILAST(NALIGN)-AL_IFIRST(NALIGN)+1 DO IPOS=I,ISEQPOINTER(ADDPOS),-1 IF ( IPOS+NEWSEQLEN+1 .LE. MAXSEQBUFFER ) THEN C shift by newseqlen+1, because a '/' is to be inserted after newseq SEQBUFFER(IPOS+NEWSEQLEN+1 ) = SEQBUFFER(IPOS) ELSE STOP 'MAXSEQBUFFER overflow in add_seq_to_seqbuffer !' ENDIF ENDDO C insert new member into arrays al_ifirst .. at position addpos C and push following members by one DO IALIGN = NALIGN,ADDPOS,-1 AL_IFIRST(IALIGN+1)=AL_IFIRST(IALIGN) AL_ILAST(IALIGN+1)=AL_ILAST(IALIGN) AL_JFIRST(IALIGN+1)=AL_JFIRST(IALIGN) AL_JLAST(IALIGN+1)=AL_JLAST(IALIGN) AL_LEN(IALIGN+1)=AL_LEN(IALIGN) AL_NGAP(IALIGN+1)=AL_NGAP(IALIGN) AL_LGAP(IALIGN+1)=AL_LGAP(IALIGN) AL_LSEQ_2(IALIGN+1)=AL_LSEQ_2(IALIGN) AL_PDB_POINTER(IALIGN+1)=AL_PDB_POINTER(IALIGN) ACCESSION(IALIGN+1)=ACCESSION(IALIGN) AL_EMBLPID(IALIGN+1)=AL_EMBLPID(IALIGN) AL_COMPOUND(IALIGN+1)=AL_COMPOUND(IALIGN) AL_EXCLUDEFLAG(IALIGN+1)=AL_EXCLUDEFLAG(IALIGN) AL_HOM(IALIGN+1)=AL_HOM(IALIGN) AL_SIM(IALIGN+1)=AL_SIM(IALIGN) ISEQPOINTER(IALIGN+1) = ISEQPOINTER(IALIGN)+NEWSEQLEN+1 ENDDO ENDIF C insert newseq into seqbuffer C addpos............last res, '/' C = +1 LEN = NEWSEQSTART DO IPOS = ISEQPOINTER(ADDPOS),ISEQPOINTER(ADDPOS)+NEWSEQLEN-1 SEQBUFFER(IPOS) = NEWSEQ(LEN) LEN = LEN + 1 ENDDO SEQBUFFER(IPOS) = '/' AL_IFIRST(ADDPOS)=1 AL_ILAST(ADDPOS)=NEWSEQLEN AL_JFIRST(ADDPOS)=1 AL_JLAST(ADDPOS)=NEWSEQLEN AL_LEN(ADDPOS)=NEWSEQLEN AL_NGAP(ADDPOS)=0 AL_LGAP(ADDPOS)=0 AL_LSEQ_2(ADDPOS)=NEWSEQLEN AL_PDB_POINTER(ADDPOS )= ' ' ACCESSION(ADDPOS) = ' ' AL_EMBLPID(ADDPOS)= NEWSEQNAME AL_COMPOUND(ADDPOS)= ' ' AL_EXCLUDEFLAG(ADDPOS)= ' ' AL_HOM(ADDPOS)=0.0 AL_SIM(ADDPOS)=0.0 NALIGN = NALIGN + 1 RETURN END C end ADD_SEQ_TO_SEQBUFFER C....................................................................... C...................................................................... C SUB CALC_PROFILE SUBROUTINE CALC_PROFILE(MAXSQ,MAXAA,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2, + NIOSTATES_2, + SCALE_FACTOR,LOG_BASE,SIGMA,BETA, + NRES,NALIGN, + AL_EXCLUDEFLAG,AL_IFIRST,AL_ILAST, + SEQBUFFER,ISEQPOINTER,NTRANS,TRANS, + SEQ_WEIGHT,OPEN_1,ELONG_1, + GAPOPEN_1, + GAPELONG_1,SIMORG,SIMMETRIC_1) INTEGER NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2,NIOSTATES_2 c common/strstates/ nstrstates_1,niostates_1,nstrstates_2, c + niostates_2 C import INTEGER MAXSQ,MAXAA,NRES,NALIGN,NTRANS REAL SCALE_FACTOR,LOG_BASE,SIGMA,BETA INTEGER AL_IFIRST(*),AL_ILAST(*),ISEQPOINTER(*) CHARACTER SEQBUFFER(*) CHARACTER TRANS*(*),AL_EXCLUDEFLAG(*) REAL SEQ_WEIGHT(*),OPEN_1,ELONG_1,GAPOPEN_1(*), + GAPELONG_1(*), + SIMORG(NTRANS,NTRANS,MAXSTRSTATES,MAXIOSTATES, + MAXSTRSTATES,MAXIOSTATES) C export REAL SIMMETRIC_1(MAXSQ,NTRANS) C internal INTEGER MAXTRANS PARAMETER (MAXTRANS= 26) INTEGER IRES,IALIGN,I,J REAL FREQUENCY(MAXTRANS), + BTOD,BTON,ZTOE,ZTOQ,XOCC,XINS,XDEL CHARACTER C1 REAL SIM_COPY(MAXTRANS,MAXTRANS) REAL INV(MAXTRANS,MAXTRANS) INTEGER INDX(MAXTRANS) REAL PROB_I(MAXTRANS) *----------------------------------------------------------------------* C================ CAUTION: pass the following values from outside CAUTION only for BLOSUM c scale_factor = 0.5 c log_base = 2.0 c scale_factor = 1.0 c log_base = 10.0 c beta=1.0 ; sigma=1.0 C 'B' and 'Z' are assigned as well to the acid as to the amide form C with respect to their frequency in EMBL/SWISSPROT 21.0 BTOD=0.524 BTON=0.445 ZTOE=0.626 ZTOQ=0.407 ILEN=LEN(TRANS) WRITE(6,*)'calc_profile' C check if MAXHOM tries do more than we implemented here :-) IF (NTRANS .GT. MAXTRANS) THEN WRITE(6,*)' WARNING: NTRANS GT MAXTRANS' WRITE(6,*)' update routine: calc_profile !!!' STOP ENDIF IF (NSTRSTATES_1 .GT. 1 .OR. NIOSTATES_1 .GT. 1 .OR. + NSTRSTATES_2 .GT. 1 .OR. NIOSTATES_2 .GT. 1) THEN WRITE(6,*)' WARNING: routine calc_profile not' WRITE(6,*)' working with STR and/or I/O dependent' WRITE(6,*)' metrices, update routine !!!' STOP ENDIF C copy "simorg" in "sim_copy" so "simorg" will be unchanged ! DO I=1,NTRANS DO J=1,NTRANS SIM_COPY(I,J)= SIMORG(I,J,1,1,1,1) ENDDO ENDDO C scale metric if necessary DO I=1,MAXTRANS DO J=1,MAXTRANS SIM_COPY(I,J)=SIM_COPY(I,J) * SCALE_FACTOR ENDDO ENDDO C de-log the matrix to get the ( P(i,j) / ( P(i) * P(j) ) ) DO I=1,MAXTRANS DO J=1,MAXTRANS SIM_COPY(I,J)= LOG_BASE ** SIM_COPY(I,J) ENDDO ENDDO C build diagonal matrix DO I=1,NTRANS DO J=1,NTRANS INV(I,J)=0.0 ENDDO INV(I,I) =1.0 ENDDO C invert matrix C NOTE: sim_copy gets changed CALL LUDCMP(SIM_COPY,MAXAA,MAXTRANS,INDX,D) DO I=1,MAXAA CALL LUBKSB(SIM_COPY,MAXAA,MAXTRANS,INDX,INV(1,I)) ENDDO C normalize to 1.0 to get the P(i) DO I=1,MAXAA SUM=0.0 DO J=1,MAXAA SUM= SUM + INV(I,J) ENDDO PROB_I(I)=SUM DO J=1,MAXAA INV(I,J)=INV(I,J) /SUM ENDDO C check SUM=0.0 DO J=1,MAXAA SUM= SUM + INV(I,J) ENDDO CALL CHECKREALEQUALITY(SUM,1.0,0.002,'sum','calc_profile') ENDDO C restore sim_copy (changed by matrix inverse) DO I=1,MAXTRANS DO J=1,MAXTRANS SIM_COPY(I,J)= SIMORG(I,J,1,1,1,1) ENDDO ENDDO C scale metric DO I=1,MAXTRANS DO J=1,MAXTRANS SIM_COPY(I,J)=SIM_COPY(I,J) * SCALE_FACTOR ENDDO ENDDO C de-log the matrix and multiply by P(i) to get the conditional probabilities: C ( P(i,j) | P(j) ) DO I=1,MAXTRANS DO J=1,MAXTRANS SIM_COPY(I,J)= ( LOG_BASE ** SIM_COPY(I,J) ) * PROB_I(I) ENDDO ENDDO C check sum rule DO J=1,MAXAA SIM=0.0 DO I=1,MAXAA SIM = SIM + SIM_COPY(I,J) ENDDO c WRITE(6,*)'sum P(i,j) | P(j): ',j,sim CALL CHECKREALEQUALITY(SIM,1.0,0.002,'sim','calc_profile') ENDDO C calculate sequence profile DO IRES=1,NRES XOCC=0.0 XINS=0.0 XDEL=0.0 DO I=1,MAXTRANS FREQUENCY(I)=0.0 ENDDO DO IALIGN=1,NALIGN IF (IRES .GE. AL_IFIRST(IALIGN) .AND. + IRES .LE. AL_ILAST(IALIGN) + .AND. AL_EXCLUDEFLAG(IALIGN) .EQ. ' ') THEN C1=SEQBUFFER( ISEQPOINTER(IALIGN) + + IRES-AL_IFIRST(IALIGN) ) C if lower case character: insertions IF (C1 .GE. 'a' .AND. C1 .LE. 'z') THEN C1=CHAR( ICHAR(C1)-32 ) XINS=XINS + SEQ_WEIGHT(IALIGN) ENDIF IF (C1 .NE. '.') THEN XOCC=XOCC + SEQ_WEIGHT(IALIGN) IF (INDEX('BZ',C1).EQ.0) THEN I=INDEX(TRANS,C1) IF (I .LE. 0 .OR. I .GT. ILEN) THEN WRITE(6,*)' GETFREQUENCY: UNKNOWN RES : ',C1 ELSE FREQUENCY(I)=FREQUENCY(I) + SEQ_WEIGHT(IALIGN) ENDIF ELSE IF (C1 .EQ. 'B') THEN WRITE(6,*)' GETFREQUENCY: convert B' I=INDEX(TRANS,'D') J=INDEX(TRANS,'N') FREQUENCY(I)=FREQUENCY(I)+(BTOD*SEQ_WEIGHT(IALIGN)) FREQUENCY(J)=FREQUENCY(J)+(BTON*SEQ_WEIGHT(IALIGN)) ELSE IF (C1 .EQ. 'Z') THEN WRITE(6,*)' GETFREQUENCY: convert Z' I=INDEX(TRANS,'E') J=INDEX(TRANS,'Q') FREQUENCY(I)=FREQUENCY(I)+(ZTOE*SEQ_WEIGHT(IALIGN)) FREQUENCY(J)=FREQUENCY(J)+(ZTOQ*SEQ_WEIGHT(IALIGN)) ENDIF ELSE C if '.' : deletion XDEL=XDEL+ SEQ_WEIGHT(IALIGN) ENDIF ENDIF ENDDO C====================== C profile SUM= 0.0 DO I=1,MAXAA SUM = SUM + FREQUENCY(I) ENDDO IF (SUM .NE. 0.0) THEN DO I=1,MAXAA FREQUENCY(I)= FREQUENCY(I) / SUM ENDDO C check sum rule for frequencies X=0.0 DO I=1,MAXAA X = X + FREQUENCY(I) ENDDO CALL CHECKREALEQUALITY(X,1.0,0.002,'freq','calc_profile') C smooth the profile C sigma: smooth dependent on the number of alignments C beta: mixing of the two models (expected <--> observed) SMOOTH= ( SUM / (SUM +SIGMA)) * BETA C do for each of the AA types in a row DO I=1,MAXAA SIM=0.0 C sum up the conditional probabilities DO J=1,MAXAA SIM = SIM + ( FREQUENCY(J) * SIM_COPY(I,J) ) ENDDO C add the observed frequencies and smooth SIMMETRIC_1(IRES,I)=( (1-SMOOTH) * SIM) + + (SMOOTH * FREQUENCY(I) ) C divide by the expected probability SIMMETRIC_1(IRES,I)=SIMMETRIC_1(IRES,I)/PROB_I(I) c simmetric_1(ires,i)= frequency(i) /prob_i(i) C log-odd IF (SIMMETRIC_1(IRES,I) .LE. 10E-3) THEN SIMMETRIC_1(IRES,I)=10E-3 ENDIF SIMMETRIC_1(IRES,I)=LOG10 ( SIMMETRIC_1(IRES,I) ) c WRITE(6,*)ires,trans(i:i),sum,frequency(i), c + sim,smooth,simmetric_1(ires,i) C gap-weights GAPOPEN_1(IRES) =OPEN_1 / (1.0 + ((XINS+XDEL)/SUM)) GAPELONG_1(IRES)=ELONG_1 / (1.0 + ((XINS+XDEL)/SUM)) ENDDO ELSE WRITE(6,*)'CALC_PROFILE: position not occupied !' C1=SEQBUFFER( ISEQPOINTER(1)+IRES-AL_IFIRST(1) ) WRITE(6,*)' sequence symbol of first sequence: ',c1 WRITE(6,*)' set profile row to 0.0' DO I=1,MAXAA SIMMETRIC_1(IRES,I)=0.0 ENDDO GAPOPEN_1(IRES) = 0.0 GAPELONG_1(IRES)= 0.0 ENDIF ENDDO C set value for chain breaks etc... to 0.0 C later there are refilled in MAXHOM (like "!" = -200.0) IX=INDEX(TRANS,'X') IB=INDEX(TRANS,'B') IZ=INDEX(TRANS,'Z') I1=INDEX(TRANS,'!') I2=INDEX(TRANS,'-') DO IRES=1,NRES SIMMETRIC_1(IRES,IX)=0.0 SIMMETRIC_1(IRES,IB)=0.0 SIMMETRIC_1(IRES,IZ)=0.0 SIMMETRIC_1(IRES,I1)=0.0 SIMMETRIC_1(IRES,I2)=0.0 ENDDO RETURN END C END CALC_PROFILE C...................................................................... C...................................................................... C SUB COPY_FIELD SUBROUTINE COPY_FIELD(CIN,COUT,IFIELD,MAXFIELD) C IMPLICIT NONE C---- local parameters INTEGER MAXFIELDLOC,MAXFIELDLENLOC PARAMETER (MAXFIELDLOC= 15) PARAMETER (MAXFIELDLENLOC= 200) C---- import CHARACTER*(MAXFIELDLENLOC) + CIN(MAXFIELDLOC) CHARACTER*200 COUT INTEGER IFIELD,MAXFIELD C internal C---- local variables INTEGER IBEG,IEND *----------------------------------------------------------------------* IF (IFIELD+1 .GT. MAXFIELD) THEN CALL STRPOS(CIN(IFIELD),IBEG,IEND) WRITE(6,*)'**** NO VALUE GIVEN FOR: ',CIN(IFIELD)(IBEG:IEND) ELSE CALL STRPOS(CIN(IFIELD+1),IBEG,IEND) IF (IEND .GE. 1) THEN COUT=CIN(IFIELD+1)(IBEG:IEND) IFIELD=IFIELD+2 ENDIF ENDIF RETURN END C END COPY_FIELD C...................................................................... C...................................................................... C SUB CORRELATION SUBROUTINE CORRELATION(SET1,SET2,NSIZE,RCORR) C classical correlation between sets of values set1(1..nsize) and C set2(1..nsize) C import c implicit none REAL SET1(*),SET2(*) INTEGER NSIZE C export REAL RCORR C internal REAL SET1SUM,SET2SUM,SET1AVRG,SET2AVRG,TOTALSUM INTEGER J IF (NSIZE .LT. 1) THEN WRITE(6,*)'*** CORRELATION: nsize=0 ' RETURN ENDIF SET1SUM=0.0 SET2SUM=0.0 DO J=1,NSIZE SET1SUM = SET1SUM + SET1(J) SET2SUM = SET2SUM + SET2(J) ENDDO SET1AVRG=SET1SUM/NSIZE SET2AVRG=SET2SUM/NSIZE SET1SUM=0.0 SET2SUM=0.0 TOTALSUM=0.0 cdebug c WRITE(6,*) 'set1avrg,set2avrg', set1avrg,set2avrg DO J=1,NSIZE TOTALSUM = TOTALSUM + (SET1(J) -SET1AVRG)*(SET2(J)-SET2AVRG) SET1SUM = SET1SUM + (SET1(J) -SET1AVRG)**2 SET2SUM = SET2SUM + (SET2(J) -SET2AVRG)**2 ENDDO IF ( SET1SUM * SET2SUM .NE. 0.0) THEN RCORR=TOTALSUM / SQRT(SET1SUM * SET2SUM) ELSE RCORR=99.9 WRITE(6,*)'*** CORRELATION: sum = 0.0 ' ENDIF RETURN END C END CORRELATION C...................................................................... C...................................................................... C SUB GETALIGN_FROM_WORKER SUBROUTINE GETALIGN_FROM_WORKER(IWORKER,IMSGTAG,IFIR,LEN1, + LENOCC,JFIR,JLAS,IDEL,NDEL, + VALUE,RMS,HOM,SIM,SDEV,DISTANCE) IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' c input INTEGER IWORKER,IMSGTAG c output INTEGER IFIR,JFIR,JLAS,IDEL,NDEL,LEN1,LENOCC REAL VALUE,SIM,SDEV,HOM,RMS,DISTANCE C internal INTEGER ISIZE INTEGER ILCONSIDER,ILDSSP_2 C init ILCONSIDER=0 ILDSSP_2=0 LCONSIDER=.FALSE. LDSSP_2=.FALSE. C receive data MSGTYPE=IMSGTAG CALL MP_RECEIVE_DATA(MSGTYPE,LINK(IWORKER)) CALL MP_GET_INT4(MSGTYPE,IWORKER,ILCONSIDER,N_ONE) CALL MP_GET_INT4(MSGTYPE,IWORKER,ILDSSP_2,N_ONE) ISIZE=LEN(NAME_2) CALL MP_GET_STRING(MSGTYPE,IWORKER,NAME_2,ISIZE) ISIZE=LEN(COMPND_2) CALL MP_GET_STRING(MSGTYPE,IWORKER,COMPND_2,ISIZE) ISIZE=LEN(ACCESSION_2) CALL MP_GET_STRING(MSGTYPE,IWORKER,ACCESSION_2,ISIZE) ISIZE=LEN(PDBREF_2) CALL MP_GET_STRING(MSGTYPE,IWORKER,PDBREF_2,ISIZE) CALL MP_GET_REAL4(MSGTYPE,IWORKER,VALUE,N_ONE) CALL MP_GET_INT4(MSGTYPE,IWORKER,IFIR,N_ONE) CALL MP_GET_INT4(MSGTYPE,IWORKER,LEN1,N_ONE) CALL MP_GET_INT4(MSGTYPE,IWORKER,LENOCC,N_ONE) CALL MP_GET_INT4(MSGTYPE,IWORKER,JFIR,N_ONE) CALL MP_GET_INT4(MSGTYPE,IWORKER,JLAS,N_ONE) CALL MP_GET_INT4(MSGTYPE,IWORKER,N2IN,N_ONE) CALL MP_GET_INT4(MSGTYPE,IWORKER,IDEL,N_ONE) CALL MP_GET_INT4(MSGTYPE,IWORKER,NDEL,N_ONE) CALL MP_GET_INT4(MSGTYPE,IWORKER,NSHIFTED,N_ONE) CALL MP_GET_REAL4(MSGTYPE,IWORKER,RMS,N_ONE) CALL MP_GET_REAL4(MSGTYPE,IWORKER,HOM,N_ONE) CALL MP_GET_REAL4(MSGTYPE,IWORKER,SIM,N_ONE) CALL MP_GET_REAL4(MSGTYPE,IWORKER,SDEV,N_ONE) CALL MP_GET_REAL4(MSGTYPE,IWORKER,DISTANCE,N_ONE) ISIZE=LEN(AL_2) CALL MP_GET_STRING(MSGTYPE,IWORKER,AL_2,ISIZE) ISIZE=LEN(SAL_2) CALL MP_GET_STRING(MSGTYPE,IWORKER,SAL_2,ISIZE) CALL MP_GET_INT4(MSGTYPE,IWORKER,IINS,N_ONE) IF (IINS .GT. 0) THEN CALL MP_GET_INT4_ARRAY(MSGTYPE,IWORKER,INSLEN_LOCAL,IINS) CALL MP_GET_INT4_ARRAY(MSGTYPE,IWORKER,INSBEG_1_LOCAL,IINS) CALL MP_GET_INT4_ARRAY(MSGTYPE,IWORKER,INSBEG_2_LOCAL,IINS) ISIZE=LEN(INSSEQ) CALL MP_GET_STRING(MSGTYPE,IWORKER,INSSEQ,ISIZE) ENDIF IF ( ILCONSIDER .EQ. 1) LCONSIDER=.TRUE. IF ( ILDSSP_2 .EQ. 1 ) LDSSP_2=.TRUE. RETURN END C END GETALIGN_FROM_WORKER C...................................................................... C....................................................................... C SUB GETSTRHOM C SUBROUTINE GETSTRHOM(S1,S2,N,HOM) c implicit none c c integer n c real hom c character*(*) s1,s2 C internal c character*1 temp1,temp2 c integer iagr,i c c IF (n .eq. 0) THEN c WRITE(6,*)'*** N=0 IN GETSTRHOM' c RETURN c endif c iagr=0 c do i=1,n c temp1=s1(i:i) , call lowtoup(temp1,1) c temp2=s2(i:i) , call lowtoup(temp2,1) c IF (temp1 .eq. temp2) THEN , iagr=iagr+1, endif c enddo c hom=float(iagr)/float(n) c c RETURN , END C END GETSTRHOM C....................................................................... C....................................................................... C SUB HELP_TEXT SUBROUTINE HELP_TEXT WRITE(6,*)' ****** YOU WILL NEVER GET THIS ... *****' RETURN END C END HELP_TEXT C....................................................................... C...................................................................... C SUB HSSP SUBROUTINE HSSP(NALIGN,CHAINREMARK) C---- C---- WRITE *.HSSP files using alignment-data RS 1988/89 C---- IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C---- import INTEGER NALIGN, + NPARALINE,NRES,LRES,NENTRIES,NRESIDUE, + I,J,ISTART,ISTOP CHARACTER*(*) CHAINREMARK C---- internal CHARACTER CTEMP*200,HSSPFILE*200,CDATE*9 CHARACTER*200 HSSPLINE,DATABASE,CPARAMETER(10) c character*1 csq_1_array(maxsq) REAL RMIN,RMAX,RELEASE C---- added br 99.01 INTEGER NALIGN_FILTER REAL DISTANCE C---- ------------------------------------------------------------------ C---- initialize C---- ------------------------------------------------------------------ DO I=1,NALIGN AL_EXCLUDEFLAG(I)=' ' ENDDO DO J=1,MAXSQ DO I=1,MAXPROFAA AL_SEQPROF(J,I)=0 ENDDO ENDDO DO J=1,MAXSQ AL_VARIABILITY(J)=0 AL_ENTROPY(J)=0 NOCC_1(J)=0 AL_NDELETION(J)=0 AL_NINS(J)=0 ENDDO C---- C---- WRITE info into string C---- C HSSP release note WRITE(HSSPLINE,'(A)')'HSSP HOMOLOGY DERIVED SECONDARY'// + ' STRUCTURE OF PROTEINS , VERSION 1.0 1991' C get swissprot release CALL SWISSPROTRELEASE(KREL,RELNOTES,RELEASE,NENTRIES,NRESIDUE) WRITE(DATABASE,'(A,F4.1,A,I6,A)')'SEQBASE RELEASE ',RELEASE, + ' OF EMBL/SWISS-PROT WITH ',NENTRIES,' SEQUENCES' c get actual date CDATE=' ' CALL GETDATE (CDATE) C get GCG-metric file and scale between 0.0 and 1.0 RMIN=0.0 RMAX=1.0 CALL GETSIMMETRIC(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2, + NIOSTATES_2,CSTRSTATES,CIOSTATES, + IORANGE,KSIM,METRIC_HSSP_VAR,SIMORG) IF (NSTRSTATES_1 .NE. 1 .OR. NIOSTATES_1 .NE. 1) THEN WRITE(6,*)'**** ERROR: NSTRSTATES_1 OR NIOSTATES_1 .GT. 1' WRITE(6,*)'CHANGE CALC_VAR ROUTINE' ENDIF CALL SCALEMETRIC(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + SIMORG,RMIN,RMAX,0.0,0.0) C WRITE alignment parameter in CPARAMETER (passed to HSSPHEADER) NPARALINE=1 WRITE(CPARAMETER(NPARALINE),'(A,F4.1,A,F4.1)') + ' SMIN: ',SMIN,' SMAX: ',SMAX NPARALINE=NPARALINE+1 IF (OPENWEIGHT_ANSWER .EQ. 'PROFILE' ) THEN WRITE(CPARAMETER(NPARALINE),'(A,A)') + ' gap-open: profile',' gap-elongation: profile' ELSE WRITE(CPARAMETER(NPARALINE),'(A,F4.1,A,F4.1)') + ' gap-open: ',gapopen_1(1),' gap-elongation: ',gapelong_1(1) ENDIF IF (LPROFILE_1 .OR. LPROFILE_2) THEN NPARALINE=NPARALINE+1 WRITE(CPARAMETER(NPARALINE),'(A,A)') + ' profile from : ',name_1(1:100) ENDIF NPARALINE=NPARALINE+1 IF (LCONSERV_1 .OR. LCONSERV_2 .OR. LCONSIMPORT) THEN WRITE(CPARAMETER(NPARALINE),'(A)') + ' conservation weights: YES' ELSE WRITE(CPARAMETER(NPARALINE),'(A)') + ' conservation weights: NO' ENDIF NPARALINE=NPARALINE+1 IF (LINSERT_1) THEN WRITE(CPARAMETER(NPARALINE),'(A)') + 'InDels in secondary structure allowed: YES' ELSE WRITE(CPARAMETER(NPARALINE),'(A)') + 'InDels in secondary structure allowed: NO' ENDIF NPARALINE=NPARALINE+1 CALL CONCAT_STRINGS(' alignments sorted according to : ', + CSORTMODE,CPARAMETER(NPARALINE) ) IF (LHSSP_LONG_ID .EQV. .TRUE.) THEN NPARALINE=NPARALINE+1 CALL CONCAT_STRINGS(' LONG-ID : ', + HSSP_FORMAT_ANSWER,CPARAMETER(NPARALINE) ) ENDIF IF (HSSP_ANSWER .EQ. 'YES') THEN CALL CONCAT_STRINGS(HSSPID_1,'.hssp',HSSPFILE) ELSE HSSPFILE=HSSP_ANSWER ENDIF BRKID_1=HSSPID_1(1:4) C IF (.NOT. LDSSP_1 ) THEN NRES=N1 LRES=N1 HEADER_1=NAME_1(1:40) COMPND_1=' ' SOURCE_1=' ' AUTHOR_1=' ' ELSE NRES=N1 LRES=NRES-NCHAINUSED+1 ENDIF DO I=1,N1 CSQ_1_ARRAY(I)=CSQ_1(I:I) ENDDO CALL CALC_VAR(NALIGN,NRES,CSQ_1_ARRAY,AL_HOM, + AL_IFIRST,AL_ILAST,ISEQPOINTER, + SEQBUFFER,AL_EXCLUDEFLAG,MAXSTRSTATES,MAXIOSTATES, + NTRANS,TRANS,SIMORG,AL_VARIABILITY) CALL CALC_PROF(MAXSQ,MAXPROFAA,NRES,CSQ_1_ARRAY,NALIGN, + AL_EXCLUDEFLAG,AL_HOM,AL_IFIRST, + AL_ILAST,SEQBUFFER,ISEQPOINTER,TRANS,AL_SEQPROF, + NOCC_1,AL_NDELETION,AL_NINS,AL_ENTROPY,AL_RELENT) IF (CHAINREMARK .NE. ' ') THEN CTEMP=' ' I=INDEX(CHAINREMARK,'!') IF (I .NE. 0) THEN WRITE(CTEMP,'(A)')CHAINREMARK(I+2:) ENDIF CHAINREMARK=' ' CALL STRPOS(CTEMP,ISTART,ISTOP) WRITE(CHAINREMARK,'(A)')CTEMP(1:ISTOP) ENDIF C---- C---- finally WRITE to file (HSSPFILE) C---- C---- get number of alignments above threshold (new br 99.01) NALIGN_FILTER=0 DO I=1,NALIGN C---- new switch: parameter set in maxhom.param IF (LNEWCURVE) THEN CALL CHECKHSSPCUT99(AL_HOMLEN(I),AL_HOM(I)*100, + ISOLEN,ISOIDE,NSTEP,LFORMULA,LALL,ISAFE, + LCONSIDER,DISTANCE) ELSE CALL CHECKHSSPCUT(AL_HOMLEN(I),AL_HOM(I)*100, + ISOLEN,ISOIDE,NSTEP,LFORMULA,LALL,ISAFE, + LCONSIDER,DISTANCE) ENDIF IF ( LCONSIDER ) THEN IF ( AL_EXCLUDEFLAG(I) .EQ. ' ') THEN NALIGN_FILTER=NALIGN_FILTER+1 ENDIF ELSE AL_EXCLUDEFLAG(I)='*' ENDIF ENDDO C---- WRITE header (no table) CALL HSSPHEADER(KHSSP,HSSPFILE,HSSPLINE,HSSPID_1,CDATE,DATABASE, + CPARAMETER,NPARALINE,ISOSIGFILE,ISAFE,LFORMULA, + HEADER_1,COMPND_1,SOURCE_1,AUTHOR_1,LRES, + NCHAIN_1,NCHAINUSED,CHAINREMARK,NALIGN_FILTER) C---- WRITE data (table header and alignments) CALL WRITE_HSSP(KHSSP,MAXSQ,NALIGN,NRES,AL_EMBLPID, + AL_PDB_POINTER,AL_ACCESSION,AL_HOM,AL_SIM, + AL_IFIRST,AL_ILAST,AL_JFIRST,AL_JLAST,AL_HOMLEN, + AL_NGAP,AL_LGAP,AL_LSEQ_2,AL_COMPOUND, + ISEQPOINTER,SEQBUFFER,PDBNO_1,CHAINID_1, + CSQ_1_ARRAY,STRUC_1,COLS_1,BP1_1,BP2_1, + SHEETLABEL_1,NSURF_1,NOCC_1,AL_VARIABILITY, + AL_SEQPROF,AL_NDELETION,AL_NINS,AL_ENTROPY, + AL_RELENT,CONSWEIGHT_1,INSNUMBER,INSALI, + INSPOINTER,INSLEN,INSBEG_1,INSBEG_2,INSBUFFER, + ISOLEN,ISOIDE,NSTEP,LFORMULA,LALL,ISAFE, + AL_EXCLUDEFLAG,LCONSERV_1,LHSSP_LONG_ID) RETURN END C END HSSP C...................................................................... C...................................................................... C SUB INTERFACE SUBROUTINE INTERFACE C MAXFIELD = number of fields in line C MAXFIELDLEN = length of field in bytes C---- include parameter files INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C---- local parameters INTEGER MAXFIELD,MAXFIELDLEN PARAMETER (MAXFIELD= 15) PARAMETER (MAXFIELDLEN= 200) c output CHARACTER*(MAXFIELDLEN) + MACROLINE,CFIELD(MAXFIELD), + CSTRING(MAXFIELD),CALFANUMERIC(MAXFIELD), + CALFAMIXED(MAXFIELD),CWORD(MAXFIELD) INTEGER NFIELD,NSTRING,NALFANUMERIC,NNUMBER,NREAL, + NINTEGER,NPOSITIVE,NNEGATIVE,NWORD,NALFAMIXED + IINTEGER(MAXFIELD),IPOSITIVE(MAXFIELD), + INEGATIVE(MAXFIELD) REAL XNUMBER(MAXFIELD), XREAL(MAXFIELD) c pointers to beg and end of each field INTEGER IFIELD_POS(2,MAXFIELD) c common/interpret1/ macroline, cfield, cstring, calfanumeric, c + calfamixed,cword c common/interpret2/ nfield, nstring, nalfanumeric,nnumber,nreal, c + ninteger, npositive, nnegative, nword, c + nalfamixed, iinteger,ipositive, inegative, c + ifield_pos c common/interpret3/ xnumber, xreal C internal INTEGER ID,ILEN,IBEG,IEND CHARACTER*500 INPUTLINE CHARACTER*20 KEYWORD INTEGER IUNIT LOGICAL LEXIST,LERROR CHARACTER*500 TEMP_FILE C KEYWORDS are: C COMmandfile C BATCH C PID: C SEQuence1 C SEQuence2 C PROFile C NORM_profile C METric C conservation-WEIGht1 C sonservation-WEIGht2 C WAY3align C INDEL in secstruc C STRUC_align C RELIABILITY C FILTer_range C THREShold C SORTmode C SUPERpos C HSSP_output C OUT_FORMAT_HSSP C SAME_SEQ C PDB_path C PROFile_OUTput C LONG_OUTput C STRIP_output C DOT_plot_output C SMIN C SMAX C gapOPEN C gapELONG C MAXalign IUNIT=99 CURRENT_DIR='.' CALL GET_CURRENT_DIR(CURRENT_DIR) LBATCH=.FALSE. LDIALOG=.FALSE. LRUN=.FALSE. WRITE(6,*)' ' WRITE(6,*)'============================ MAXHOM DIALOG ======='// + '===================' WRITE(6,*)' HELP : get short help text ' WRITE(6,*)' DIALOG : answer questions step by step ' WRITE(6,*)' STATUS : show current settings ' WRITE(6,*)' RUN/START/GO : start program ' WRITE(6,*)' QUIT : stop the program ' WRITE(6,*)' "KEYWORD VALUE" : set options ' LEXIST=.FALSE. TEMP_FILE='maxhom.input' INQUIRE(FILE=TEMP_FILE,EXIST=LEXIST) IF (LEXIST .EQV. .TRUE.) THEN CALL OPEN_FILE(IUNIT,TEMP_FILE,'old',LERROR) WRITE(6,*)' =========================================' WRITE(6,*)' INFO: found local input file; will use it' WRITE(6,*)' =========================================' ENDIF KEYWORD=' ' NFIELD=1 DO WHILE (KEYWORD .NE. 'RUN' .OR. KEYWORD .NE. 'DIALOG' + .OR. KEYWORD .NE. 'GO' .OR. KEYWORD .NE. 'START') IF (.NOT. LBATCH) THEN WRITE(6,*)' ' WRITE(6,*)'=== option: HELP, DIALOG, STATUS, RUN '// + 'or "KEYWORD + value" === >' ENDIF INPUTLINE=' ' ID=1 IF (LEXIST .EQV. .TRUE.) THEN READ(IUNIT,'(A)')INPUTLINE ELSE READ(*,'(A)')INPUTLINE ENDIF IF (INPUTLINE .EQ. ' ') THEN CFIELD(ID)='STATUS' NFIELD=1 ELSE CALL INTERPRET_LINE(INPUTLINE,MAXFIELD, + MACROLINE, CFIELD, CSTRING, CALFANUMERIC, + CALFAMIXED,CWORD,NFIELD,NSTRING,NALFANUMERIC, + NNUMBER, NREAL, NINTEGER,NPOSITIVE, NNEGATIVE, + NWORD, NALFAMIXED,IINTEGER,IPOSITIVE, + INEGATIVE,XNUMBER, XREAL,IFIELD_POS) ENDIF DO WHILE(ID .LE. NFIELD) IF (MACROLINE(ID:ID) .NE. 'R' .AND. + MACROLINE(ID:ID) .NE. 'P' .AND. + MACROLINE(ID:ID) .NE. 'N') THEN ILEN=LEN(CFIELD(ID)) CALL LOWTOUP(CFIELD(ID),ILEN) CALL STRPOS(CFIELD(ID),IBEG,IEND) IF (CFIELD(ID)(IBEG:IEND) .EQ. 'RUN' .OR. + CFIELD(ID)(IBEG:IEND) .EQ. 'GO' .OR. + CFIELD(ID)(IBEG:IEND) .EQ. 'START') THEN LRUN=.TRUE. RETURN ELSE IF (CFIELD(ID)(IBEG:IEND) .EQ. 'DIALOG') THEN LDIALOG=.TRUE. RETURN ELSE IF (CFIELD(ID)(IBEG:IEND) .EQ.'HELP') THEN CALL HELP_TEXT ELSE IF (CFIELD(ID)(IBEG:IEND) .EQ.'QUIT') THEN STOP ELSE IF (INDEX(CFIELD(ID),'COM').NE.0) THEN CALL LOWTOUP(CFIELD(ID+1),ILEN) CALL COPY_FIELD(CFIELD,COMMANDFILE_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'BATCH').NE.0) THEN LBATCH=.TRUE. ELSE IF (INDEX(CFIELD(ID),'PID:').NE.0) THEN CALL LOWTOUP(CFIELD(ID+1),ILEN) CALL COPY_FIELD(CFIELD,JOB_ID,ID,NFIELD) c CALL STRPOS(COREFILE,IBEG,IEND) c CALL STRPOS(JOB_ID,ISTART,ISTOP) c COREFILE(1:)=COREFILE(IBEG:IEND)// c + JOB_ID(ISTART:ISTOP) ELSE IF (INDEX(CFIELD(ID),'SEQ').NE.0 .AND. + INDEX(CFIELD(ID),'1').NE.0 ) THEN CALL COPY_FIELD(CFIELD,NAME1_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'SEQ').NE.0 .AND. + INDEX(CFIELD(ID),'2').NE.0 ) THEN CALL COPY_FIELD(CFIELD,NAME2_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'MET').NE.0) THEN CALL COPY_FIELD(CFIELD,METRIC_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'WEIG').NE.0 .AND. + INDEX(CFIELD(ID),'1').NE.0 ) THEN CALL LOWTOUP(CFIELD(ID+1),ILEN) CALL COPY_FIELD(CFIELD,WEIGHT1_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'WEIG').NE.0 .AND. + INDEX(CFIELD(ID),'2').NE.0 ) THEN CALL LOWTOUP(CFIELD(ID+1),ILEN) CALL COPY_FIELD(CFIELD,WEIGHT2_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'INDEL_1').NE.0) THEN CALL LOWTOUP(CFIELD(ID+1),ILEN) CALL COPY_FIELD(CFIELD,INDEL_ANSWER_1,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'INDEL_2').NE.0) THEN CALL LOWTOUP(CFIELD(ID+1),ILEN) CALL COPY_FIELD(CFIELD,INDEL_ANSWER_2,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'WAY3').NE.0) THEN CALL LOWTOUP(CFIELD(ID+1),ILEN) CALL COPY_FIELD(CFIELD,WAY3_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'SAME').NE.0) THEN CALL LOWTOUP(CFIELD(ID+1),ILEN) CALL COPY_FIELD(CFIELD,SAMESEQ_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'THRES').NE.0) THEN CALL COPY_FIELD(CFIELD,THRESHOLD_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'NORM').NE.0) THEN CALL COPY_FIELD(CFIELD,NORM_PROFILE_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'MEAN').NE.0) THEN CALL COPY_FIELD(CFIELD,PROFILE_EPSILON_ANSWER,ID, + NFIELD) ELSE IF (INDEX(CFIELD(ID),'FACTOR').NE.0) THEN CALL COPY_FIELD(CFIELD,PROFILE_GAMMA_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'SORT').NE.0) THEN CALL LOWTOUP(CFIELD(ID+1),ILEN) CALL COPY_FIELD(CFIELD,SORTMODE_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'OUT_FORMAT_HSSP').NE.0) THEN CALL COPY_FIELD(CFIELD,HSSP_FORMAT_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'HSSP ').NE.0) THEN c CALL LOWTOUP(CFIELD(ID+1),ILEN) CALL COPY_FIELD(CFIELD,HSSP_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'SUPER').NE.0) THEN CALL LOWTOUP(CFIELD(ID+1),ILEN) CALL COPY_FIELD(CFIELD,COMPARE_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'STRUC').NE.0) THEN CALL LOWTOUP(CFIELD(ID+1),ILEN) CALL COPY_FIELD(CFIELD,STRUC_ALIGN_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'PDB').NE.0 .OR. + INDEX(CFIELD(ID),'BRK').NE.0 ) THEN CALL COPY_FIELD(CFIELD,PDBPATH_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'DSSP') .NE.0 ) THEN CALL COPY_FIELD(CFIELD,DSSP_PATH,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'PROF') .NE.0 .AND. + INDEX(CFIELD(ID),'OUT') .NE.0 ) THEN CALL COPY_FIELD(CFIELD,PROFILEOUT_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'PROF').NE.0) THEN CALL COPY_FIELD(CFIELD,PROFILE_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'STRIP').NE.0) THEN CALL COPY_FIELD(CFIELD,STRIPFILE_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'LONG_OUT').NE.0) THEN CALL COPY_FIELD(CFIELD,LONG_OUTPUT_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'DOT').NE.0) THEN CALL COPY_FIELD(CFIELD,PLOTFILE_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'SMIN').NE.0) THEN CALL COPY_FIELD(CFIELD,SMIN_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'SMAX').NE.0) THEN CALL COPY_FIELD(CFIELD,SMAX_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'OPEN').NE.0) THEN CALL COPY_FIELD(CFIELD,OPENWEIGHT_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'ELONG').NE.0) THEN CALL COPY_FIELD(CFIELD,ELONGWEIGHT_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'NBEST').NE.0) THEN CALL COPY_FIELD(CFIELD,NBEST_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'MAX').NE.0) THEN CALL COPY_FIELD(CFIELD,NGLOBALHITS_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'RELI').NE.0) THEN CALL LOWTOUP(CFIELD(ID+1),ILEN) CALL COPY_FIELD(CFIELD,BACKWARD_ANSWER,ID,NFIELD) ELSE IF (INDEX(CFIELD(ID),'FILT').NE.0) THEN CALL COPY_FIELD(CFIELD,FILTER_ANSWER,ID,NFIELD) ELSE IF (CFIELD(ID) .EQ. 'STATUS') THEN CALL STRPOS(COMMANDFILE_ANSWER,IBEG,IEND) WRITE(6,*)' COMMANDFILE : ', + COMMANDFILE_ANSWER(IBEG:IEND),' [YES/NO]' CALL STRPOS(NAME1_ANSWER,IBEG,IEND) WRITE(6,*)' SEQUENCE_1 : ', + NAME1_ANSWER(IBEG:IEND), + ' [filename or list of filenames]' CALL STRPOS(NAME2_ANSWER,IBEG,IEND) WRITE(6,*)' SEQUENCE_2 : ', + NAME2_ANSWER(IBEG:IEND), + ' [filename or list of filenames]' CALL STRPOS(METRIC_ANSWER,IBEG,IEND) WRITE(6,*)' METRIC : ', + METRIC_ANSWER(IBEG:IEND),' [LACHLAN/GCG/filena]' CALL STRPOS(SMIN_ANSWER,IBEG,IEND) WRITE(6,*)' SMIN : ', + SMIN_ANSWER(IBEG:IEND), + ' [real number/0.0/PROFILE]' CALL STRPOS(SMAX_ANSWER,IBEG,IEND) WRITE(6,*)' SMAX : ', + SMAX_ANSWER(IBEG:IEND), + ' [real number/0.0/PROFILE]' CALL STRPOS(OPENWEIGHT_ANSWER,IBEG,IEND) WRITE(6,*)' GAP_OPEN : ', + OPENWEIGHT_ANSWER(IBEG:IEND), + ' [real number/PROFILE]' CALL STRPOS(ELONGWEIGHT_ANSWER,IBEG,IEND) WRITE(6,*)' GAP_ELONGATION : ', + ELONGWEIGHT_ANSWER(IBEG:IEND), + ' [real number/PROFILE]' CALL STRPOS(WEIGHT1_ANSWER,IBEG,IEND) WRITE(6,*)' WEIGHT_1 : ', + WEIGHT1_ANSWER(IBEG:IEND),' [YES/NO]' CALL STRPOS(WEIGHT2_ANSWER,IBEG,IEND) WRITE(6,*)' WEIGHT_2 : ', + WEIGHT2_ANSWER(IBEG:IEND),' [YES/NO]' CALL STRPOS(NORM_PROFILE_ANSWER,IBEG,IEND) WRITE(6,*)' NORMALIZE PROFILE : ', + NORM_PROFILE_ANSWER(IBEG:IEND),' [YES/NO]' CALL STRPOS(PROFILE_EPSILON_ANSWER,IBEG,IEND) WRITE(6,*)' MEAN PROFILE : ', + PROFILE_EPSILON_ANSWER(IBEG:IEND),' [real number]' CALL STRPOS(PROFILE_GAMMA_ANSWER,IBEG,IEND) WRITE(6,*)' FACTOR GAP-WEIGHTS : ', + PROFILE_GAMMA_ANSWER(IBEG:IEND),' [real number]' CALL STRPOS(BACKWARD_ANSWER,IBEG,IEND) WRITE(6,*)' RELIABILITY SCORE : ', + BACKWARD_ANSWER(IBEG:IEND),' [YES/NO]' CALL STRPOS(FILTER_ANSWER,IBEG,IEND) WRITE(6,*)' FILTER_RANGE : ', + FILTER_ANSWER(IBEG:IEND),' [number]' CALL STRPOS(THRESHOLD_ANSWER,IBEG,IEND) WRITE(6,*)' THRESHOLD : ', + THRESHOLD_ANSWER(IBEG:IEND), + ' [FORMULA(+-x)/ALL/filename]' CALL STRPOS(SORTMODE_ANSWER,IBEG,IEND) WRITE(6,*)' SORT_MODE : ', + SORTMODE_ANSWER(IBEG:IEND), + ' [DISTANCE,VALUE,SIM/WSIM/IDENTITY/VALPER]' CALL STRPOS(PROFILE_ANSWER,IBEG,IEND) WRITE(6,*)' 2_PROFILE_OPTION : ', + PROFILE_ANSWER(IBEG:IEND),' [FULL/MEMBER/MAX/IGNORE]' CALL STRPOS(INDEL_ANSWER_1,IBEG,IEND) WRITE(6,*)' INDEL_IN_SEC_STRUC_1: ', + INDEL_ANSWER_1(IBEG:IEND),' [YES/NO]' CALL STRPOS(INDEL_ANSWER_2,IBEG,IEND) WRITE(6,*)' INDEL_IN_SEC_STRUC_2: ', + INDEL_ANSWER_2(IBEG:IEND),' [YES/NO]' CALL STRPOS(COMPARE_ANSWER,IBEG,IEND) WRITE(6,*)' SUPERPOS_IN_3-D : ', + COMPARE_ANSWER(IBEG:IEND),' [YES/NO]' CALL STRPOS(HSSP_ANSWER,IBEG,IEND) WRITE(6,*)' HSSP_OUTPUT : ', + HSSP_ANSWER(IBEG:IEND),' [YES/NO]' WRITE(6,*)' OUT_FORMAT_HSSP : ', + HSSP_FORMAT_ANSWER(IBEG:IEND),' [YES/NO]' CALL STRPOS(SAMESEQ_ANSWER,IBEG,IEND) WRITE(6,*)' SAME_SEQ_SHOW : ', + SAMESEQ_ANSWER(IBEG:IEND),' [YES/NO]' CALL STRPOS(PROFILEOUT_ANSWER,IBEG,IEND) WRITE(6,*)' PROFILE_OUTPUT : ', + PROFILEOUT_ANSWER(IBEG:IEND),' [YES/NO]' CALL STRPOS(STRIPFILE_ANSWER,IBEG,IEND) WRITE(6,*)' STRIP_OUTPUT : ', + STRIPFILE_ANSWER(IBEG:IEND),' [YES/NO/filename]' CALL STRPOS(LONG_OUTPUT_ANSWER,IBEG,IEND) WRITE(6,*)' LONG_OUTPUT : ', + LONG_OUTPUT_ANSWER(IBEG:IEND), + ' [YES/NO/filename]' CALL STRPOS(PLOTFILE_ANSWER,IBEG,IEND) WRITE(6,*)' DOT_PLOT_OUTPUT : ', + PLOTFILE_ANSWER(IBEG:IEND),' [YES/NO/filename]' CALL STRPOS(NBEST_ANSWER,IBEG,IEND) WRITE(6,*)' NBEST : ', + NBEST_ANSWER(IBEG:IEND),' [integer number]' CALL STRPOS(NGLOBALHITS_ANSWER,IBEG,IEND) WRITE(6,*)' MAXALIGN : ', + NGLOBALHITS_ANSWER(IBEG:IEND),' [integer number]' CALL STRPOS(PDBPATH_ANSWER,IBEG,IEND) WRITE(6,*)' PDB_PATH : ', + PDBPATH_ANSWER(IBEG:IEND),' [directory path]' CALL STRPOS(DSSP_PATH,IBEG,IEND) WRITE(6,*)' DSSP_PATH : ', + DSSP_PATH(IBEG:IEND),' [directory path]' ELSE CALL STRPOS(CFIELD(ID),IBEG,IEND) WRITE(6,*) + '***** OPTION UNKNOWN : ',CFIELD(ID)(IBEG:IEND) ENDIF ENDIF ID=ID+1 ENDDO ENDDO IF (LEXIST .EQV. .TRUE.) THEN CLOSE(IUNIT) ENDIF RETURN END C END INTERFACE C...................................................................... C...................................................................... C SUB LUDCMP SUBROUTINE LUDCMP(A,N,NP,INDX,D) PARAMETER (NMAX= 100) PARAMETER (TINY= 1.0E-20) DIMENSION A(NP,NP),INDX(N),VV(NMAX) C init D = 1.0 IMAX=0 C check dimension IF ( N .GT. NMAX) THEN WRITE(6,*)'ERROR: dimesnion overflow in LUDCMP' STOP ENDIF DO I=1,N AAMAX = 0.0 DO J=1,N IF (ABS(A(I,J)) .GT. AAMAX) AAMAX = ABS(A(I,J)) ENDDO IF (AAMAX .EQ. 0.0) THEN WRITE(6,*)'Singular matrix.' STOP ENDIF VV(I)=1.0 / AAMAX ENDDO DO J=1,N IF (J .GT. 1) THEN DO I=1,J-1 SUM=A(I,J) IF (I .GT. 1) THEN DO K=1,I-1 SUM = SUM - A(I,K) * A(K,J) ENDDO A(I,J)=SUM ENDIF ENDDO ENDIF AAMAX=0.0 DO I=J,N SUM=A(I,J) IF (J .GT. 1) THEN DO K=1,J-1 SUM = SUM - A(I,K) * A(K,J) ENDDO A(I,J)=SUM ENDIF DUM=VV(I) * ABS(SUM) IF (DUM .GE. AAMAX) THEN IMAX=I AAMAX=DUM ENDIF ENDDO IF (J .NE. IMAX) THEN DO K=1,N DUM = A(IMAX,K) A(IMAX,K) = A(J,K) A(J,K) = DUM ENDDO D=-D VV(IMAX)=VV(J) ENDIF INDX(J)=IMAX IF (J .NE. N) THEN IF (A(J,J) .EQ. 0.0)A(J,J)=TINY DUM=1.0 / A(J,J) DO I=J+1,N A(I,J) = A(I,J) * DUM ENDDO ENDIF ENDDO IF (A(N,N) .EQ. 0.0 )A(N,N)=TINY RETURN END C END LUDCMP C...................................................................... C...................................................................... C SUB LUBKSB SUBROUTINE LUBKSB(A,N,NP,INDX,B) DIMENSION A(NP,NP),INDX(N),B(N) II=0 DO I=1,N LL = INDX(I) SUM = B(LL) B(LL) = B(I) IF (II .NE. 0) THEN DO J=II,I-1 SUM = SUM - A(I,J) * B(J) ENDDO ELSE IF (SUM .NE. 0.0) THEN II=I ENDIF B(I)=SUM ENDDO DO I=N,1,-1 SUM=B(I) IF (I .LT. N) THEN DO J=I+1,N SUM = SUM - A(I,J) * B(J) ENDDO ENDIF B(I)=SUM/A(I,I) ENDDO RETURN END C END LUBKSB C...................................................................... C...................................................................... C SUB MAXHOM_PARALLEL_INTERFACE SUBROUTINE MAXHOM_PARALLEL_INTERFACE(LH1,LH2,NFILE,NALIGN, + NENTRIES,NAMINO_ACIDS) C import INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' c import REAL LH1(0:MAXMAT) INTEGER*2 LH2(0:MAXTRACE) c real lh(0:maxmat*2) INTEGER NFILE,NALIGN,NENTRIES,NAMINO_ACIDS C internal c integer iset,jset WRITE(6,*)' start: send init data' CALL SEND_DATA_TO_NODE CALL SEND_JOBS(LH1,LH2,NFILE,NALIGN,NENTRIES,NAMINO_ACIDS) WRITE(6,*)' call receive_results ' CALL RECEIVE_RESULTS_FROM_NODE(NALIGN) WRITE(6,*)' receive_results finished' CALL GET_CPU_TIME('time dbscan:',IDPROC, + ITIME_OLD,ITIME_NEW,TOTAL_TIME,LOGSTRING) CALL LOG_FILE(KLOG,LOGSTRING,2) RETURN END C END MAXHOM_PARALLEL_INTERFACE C...................................................................... C...................................................................... C SUB MAXHOM_QSORT SUBROUTINE MAXHOM_QSORT(N,IKEY,IFILE,V) C Non-recursive version of QUICKSORT algorithm, Wolfgang Kabsch, 1981. C The pair (V,IKEY) is ordered according to increasing values of V. C V(1) is the smallest value C 2**NSTACK values can be sorted if DIMENSION ISTACK(NSTACK,2). C ON 32 BIT MACHINE (NSTACK= nbit-2) c implicit none INTEGER N INTEGER IKEY(*),IFILE(*) REAL V(*) C internal INTEGER NSTACK PARAMETER (NSTACK= 30) INTEGER ISTACK(NSTACK,2) INTEGER J,K,IL,IR,JL,JR REAL VALK,X IF ( N .GT. 2**NSTACK ) THEN WRITE(6,*) 'QSORT OVERFLOW' STOP ENDIF J=1 ISTACK(1,1)=1 ISTACK(1,2)=N 10 JL=ISTACK(J,1) JR=ISTACK(J,2) J=J-1 20 IL=JL IR=JR K=(JL+JR)/2 C hack br: 1999-12 K can be 0! IF (K .LE. 0) K=K+1 VALK=V(K) 30 IF (V(IL) .GE. VALK) GOTO 40 IL=IL+1 GOTO 30 40 IF (V(IR) .LE. VALK) GOTO 50 IR=IR-1 GOTO 40 50 IF (IL .GT. IR) GOTO 60 c swap v and ikey: il<-->ir K=IKEY(IL) IKEY(IL)=IKEY(IR) IKEY(IR)=K C keep track of file pointer K=IFILE(IL) IFILE(IL)=IFILE(IR) IFILE(IR)=K X=V(IL) V(IL)=V(IR) V(IR)=X c end swap IL=IL+1 IR=IR-1 IF ( IL .LE. IR ) GOTO 30 60 IF ( (IR-JL) .GE. (JR-IL) ) GOTO 80 IF ( IL .GE. JR ) GOTO 70 J=J+1 IF ( J .GT. NSTACK ) THEN WRITE(6,*)J STOP' QSORT OVERFLOW' ENDIF ISTACK(J,1)=IL ISTACK(J,2)=JR 70 JR=IR GOTO 100 80 IF ( JL .GE. IR ) GOTO 90 J=J+1 IF ( J .GT. NSTACK ) THEN WRITE(6,*)J STOP' QSORT OVERFLOW' ENDIF ISTACK(J,1)=JL ISTACK(J,2)=IR 90 JL=IL 100 IF ( JL .LT. JR ) GOTO 20 IF ( J .GT. 0 ) GOTO 10 RETURN END C END MAXHOM_QSORT C...................................................................... C...................................................................... C SUB MOMENT SUBROUTINE MOMENT(DATA,N,AVE,ADEV,SDEV,VAR,SKEW,CURT) REAL DATA(*) S=0.0 DO J=1,N S=S+DATA(J) ENDDO AVE=S/N ADEV=0.0 VAR=0.0 SKEW=0.0 CURT=0.0 DO J=1,N S=DATA(J)-AVE ADEV=ADEV+ABS(S) P=S*S VAR=VAR+P P=P*S SKEW=SKEW+P P=P*S CURT=CURT+P ENDDO ADEV=ADEV/N VAR=VAR/(N-1) SDEV=SQRT(VAR) IF (VAR.NE.0.) THEN SKEW=SKEW/(N*SDEV**3) CURT=CURT/(N*VAR**2)-3. ELSE WRITE(6,*)'no skew or kurtosis when zero variance' ENDIF RETURN END C END MOMENT C...................................................................... C...................................................................... C SUB NORM_PROFILE SUBROUTINE NORM_PROFILE(MAXRES,NTRANS,TRANS,NRES_PROF,NRES_SEQ, + LSEQ,PROFILE,EPSILON,GAMMA,SMIN,SMAX,MAPLOW,MAPHIGH, + GAPOPEN,GAPELONG,SDEV) INTEGER MAXRES,NRES_PROF,NRES_SEQ,NTRANS,LSEQ(*) REAL PROFILE(MAXRES,NTRANS) REAL GAPOPEN(*),GAPELONG(*) REAL SMIN,SMAX,MAPLOW,MAPHIGH,EPSILON,GAMMA CHARACTER*(*) TRANS REAL AVE,SDEV REAL SUM,P INTEGER J,PRES LOGICAL LERROR C init SMAX=0.0 SMIN=0.0 SUM=0.0 SDEV=0.0 VAR=0.0 MAPLOW=0.0 MAPHIGH=0.0 NCOUNT= NRES_SEQ * NRES_PROF C get mean, sdev of profile-sequence pair DO IRES=1,NRES_SEQ DO PRES=1,NRES_PROF SUM = SUM + PROFILE(PRES,LSEQ(IRES)) ENDDO ENDDO AVE=SUM / NCOUNT DO IRES=1,NRES_SEQ DO PRES=1,NRES_PROF SUM = PROFILE(PRES,LSEQ(IRES)) - AVE P = SUM * SUM VAR = VAR + P ENDDO ENDDO VAR = VAR / (NCOUNT-1) SDEV = SQRT(VAR) C shift mean of the profile to epsilon * sdev SHIFT=AVE + ABS(EPSILON*SDEV) DO PRES=1,NRES_PROF DO JPOS=1,NTRANS PROFILE(PRES,JPOS)=PROFILE(PRES,JPOS) - SHIFT ENDDO ENDDO WRITE(6,*)' before: ave,sdev,shift ',ave,sdev,shift C do statistic SUM=0.0 SDEV=0.0 VAR=0.0 DO IRES=1,NRES_SEQ DO PRES=1,NRES_PROF SUM = SUM + PROFILE(PRES,LSEQ(IRES)) ENDDO ENDDO AVE=SUM / NCOUNT DO IRES=1,NRES_SEQ DO PRES=1,NRES_PROF SUM = PROFILE(PRES,LSEQ(IRES)) - AVE P = SUM * SUM VAR = VAR + P ENDDO ENDDO VAR = VAR / (NCOUNT-1) SDEV = SQRT(VAR) WRITE(6,*)' after: ave sdev ',ave,sdev C scale SMAX=-1.0E+10 SMIN=1.0E+10 DO PRES=1,NRES_PROF DO JPOS=1,NTRANS IF ( PROFILE(PRES,JPOS) .GT. SMAX ) SMAX=PROFILE(PRES,JPOS) IF ( PROFILE(PRES,JPOS) .LT. SMIN ) SMIN=PROFILE(PRES,JPOS) ENDDO ENDDO DO PRES=1,NRES_PROF GAPOPEN(PRES) = (SDEV * GAMMA) - EPSILON GAPELONG(PRES)= GAPOPEN(PRES) / 10.0 ENDDO WRITE(6,*)' smin/smax ',smin,smax WRITE(6,*)' open/elong ',gapopen(1),gapelong(1) c shi=smax c slo=smin c IF (maplow .eq. 0.0 .and. maphigh .eq. 0.0) THEN c WRITE(6,*)' scale between maplow/maphigh' c shi=abs(sdev) c slo=shi * -1.0 c endif c do pres=1,nres_prof c do jpos=1,ntrans c profile(pres,jpos)=(( profile(pres,jpos) -slo) / c + (shi-slo))*(smax-smin)+smin c enddo c enddo C======================================================================= C reset value for chain breaks etc... C add 'X' '!' and "-" J=INDEX(TRANS,'X') K=INDEX(TRANS,'!') L=INDEX(TRANS,'-') M=INDEX(TRANS,'.') IF (J .EQ. 0 .OR. K .EQ. 0 .OR. L .EQ. 0) THEN WRITE(6,*)'*** error: "X","!","-" or "." unknown in '// + 'norm_profile' ENDIF DO I=1,NRES_PROF PROFILE(I,J)=0.0 PROFILE(I,K)=0.0 PROFILE(I,L)=0.0 PROFILE(I,M)=0.0 ENDDO c======================================================================= c debug: WRITE matrix in output-file c======================================================================= CALL OPEN_FILE(99,'metric_debug.x','new,recl=500',LERROR) DO I=1,NRES_PROF WRITE(99,'(1X,25(F7.2))')(PROFILE(I,J),J=1,NTRANS) ENDDO CLOSE(99) c======================================================================= RETURN END C END NORM_PROFILE C...................................................................... C...................................................................... C SUB PREP_PROFILE SUBROUTINE PREP_PROFILE(NALIGN,NRES,WEIGHT_MODE,SIGMA,BETA) IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C import INTEGER NALIGN,NRES C used for smoothing the profile REAL SIGMA,BETA c character*(*) profileout CHARACTER WEIGHT_MODE*(*) C internal INTEGER MAXAA PARAMETER (MAXAA= 20) C================================================================== C used for de-log metrices REAL SCALE_FACTOR,LOG_BASE,WEIGHTS(MAXHITS) INTEGER I,J INTEGER LOWERPOS(NASCII) CHARACTER LOWER*26 C================================================================== C init C used to convert lower case characters from the DSSP-seq to 'C' (Cys) LOWER='abcdefghijklmnopqrstuvwxyz' CALL GETPOS(LOWER,LOWERPOS,NASCII) CALL GETPOS(TRANS,TRANSPOS,NASCII) WRITE(6,*)TRANS CALL GETSIMMETRIC(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2,NIOSTATES_2, + CSTRSTATES,CIOSTATES,IORANGE, + KSIM,METRICFILE,SIMORG) IF (LDSSP_1) THEN CALL LOWER_TO_CYS(CSQ_1,NRES) ENDIF CALL SEQ_TO_INTEGER(CSQ_1,LSQ_1,NRES,TRANSPOS) CALL STR_TO_CLASS(MAXSTRSTATES,STR_CLASSES,NRES,STRUC_1, + STRCLASS_1,LSTRUC_1) CALL ACC_TO_INT(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_1,NIOSTATES_1,IORANGE, + NRES,LSQ_1,LSTRUC_1,NSURF_1,LACC_1) CALL FILLSIMMETRIC(MAXSQ,NTRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_1,NSTRSTATES_2,CSTRSTATES,SIMORG,NRES,LSQ_1, + LSTRUC_1,LACC_1,SIMMETRIC_1) c IF (ldssp_1) THEN c do ires=1,nres c call getindex(csq_1_array(ires) ,lowerpos,i) c IF (i.ne.0)csq_1_array(ires)='C' c enddo c endif CALL ADD_SEQ_TO_SEQBUFFER(MAXHITS,MAXSEQBUFFER,1,NALIGN, + CSQ_1_ARRAY,1,NRES,HSSPID_1,SEQBUFFER,ISEQPOINTER, + AL_IFIRST,AL_ILAST,AL_JFIRST,AL_JLAST,AL_LEN, + AL_NGAP,AL_LGAP,AL_LSEQ_2,AL_PDB_POINTER, + AL_HOM,AL_SIM,AL_EXCLUDEFLAG,AL_ACCESSION, + AL_EMBLPID, + AL_COMPOUND) NBOX_1=1 PROFILEBOX_1(1,1)=1 PROFILEBOX_1(1,2)=NRES WRITE(6,*)'********** defaults *********************' CALL STRPOS(METRICFILE,I,J) WRITE(6,*)' metric : ',metricfile(i:j) WRITE(6,*)' smin : ',smin WRITE(6,*)' smax : ',smax WRITE(6,*)' maplow : ',maplow WRITE(6,*)' maphigh : ',maphigh WRITE(6,*)' open_1 : ',open_1 WRITE(6,*)' elong_1 : ',elong_1 WRITE(6,*)'*****************************************' CALL SINGLE_SEQ_WEIGHTS(NALIGN,SEQBUFFER, + ISEQPOINTER,AL_IFIRST,AL_ILAST, + WEIGHT_MODE,WEIGHTS) IF (INDEX(METRICFILE,'osum') .NE. 0) THEN SCALE_FACTOR=0.5 LOG_BASE=2.0 ELSE SCALE_FACTOR=1.0 LOG_BASE=10.0 ENDIF WRITE(6,*)'scale_factor log_base: ',scale_factor,log_base WRITE(6,*)'sigma beta: ',sigma,beta CALL CALC_PROFILE(MAXSQ,MAXAA,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2, + NIOSTATES_2, + SCALE_FACTOR,LOG_BASE,SIGMA,BETA, + NRES,NALIGN, + AL_EXCLUDEFLAG,AL_IFIRST,AL_ILAST, + SEQBUFFER,ISEQPOINTER,NTRANS,TRANS, + WEIGHTS,OPEN_1,ELONG_1,GAPOPEN_1, + GAPELONG_1,SIMORG,SIMMETRIC_1) RETURN END C END PREP_PROFILE C...................................................................... C...................................................................... C SUB READ_FILENAME SUBROUTINE READ_FILENAME(KUNIT,FILENAME,LENDFILE,LERROR) CHARACTER*(*) FILENAME INTEGER KUNIT LOGICAL LENDFILE,LERROR LENDFILE= .FALSE. LERROR= .FALSE. FILENAME= ' ' READ(KUNIT,'(A)',END=100,ERR=200)FILENAME RETURN 100 LENDFILE= .TRUE. RETURN 200 LERROR= .TRUE. RETURN END C END READ_FILENAME C...................................................................... C...................................................................... C SUB RECEIVE_RESULTS_FROM_NODE SUBROUTINE RECEIVE_RESULTS_FROM_NODE(NALIGN) C import INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C export INTEGER NALIGN C local for each node c integer irecpoi(maxaligns),ifilepoi(maxaligns) c real alisortkey(maxaligns),len2_orig(maxaligns) C internal INTEGER IWORKER,IALIGN,IBEG,IEND INTEGER IALIGN_GOOD_ALL C receive result from nodes and store in GLOBAL space ILINK=1 LOGSTRING=' ' IALIGN_GOOD_ALL = IALIGN_GOOD WRITE(6,*)' receive results: nalign nworker ', nalign, + NWORKER,IALIGN_GOOD CALL FLUSH_UNIT(6) c msgtype=idtop DO IWORKER=1,NWORKER MSGTYPE=4000 ILINK=IWORKER CALL MP_RECEIVE_DATA(MSGTYPE,LINK(ILINK)) CALL MP_GET_INT4(MSGTYPE,LINK(ILINK),IALIGN,N_ONE) CALL MP_GET_INT4(MSGTYPE,LINK(ILINK),IALIGN_GOOD, + N_ONE) IALIGN_GOOD_ALL = IALIGN_GOOD_ALL + IALIGN_GOOD WRITE(6,*)ILINK,' IALIGN/GOOD/GOOD_ALL', + IALIGN,IALIGN_GOOD,IALIGN_GOOD_ALL CALL FLUSH_UNIT(6) IF (IALIGN .GT. 0) THEN MSGTYPE=5000 IBEG=NALIGN IEND=NALIGN+IALIGN-1 IF (IEND .GT. MAXALIGNS) THEN WRITE(6,*)'FATAL ERROR: MAXALIGNS OVERFLOW, INCREASE !!' CALL FLUSH_UNIT(6) STOP ENDIF CALL MP_RECEIVE_DATA(MSGTYPE,LINK(ILINK)) CALL MP_GET_REAL4(MSGTYPE,LINK(ILINK), + ALISORTKEY(IBEG),IALIGN) CALL MP_GET_INT4(MSGTYPE,LINK(ILINK), + IRECPOI(IBEG),IALIGN) CALL MP_GET_INT4(MSGTYPE,LINK(ILINK), + IFILEPOI(IBEG),IALIGN) NALIGN=NALIGN+IALIGN WRITE(LOGSTRING,'(A,4(I6))')' pid / done : ', + IWORKER,IALIGN ELSE WRITE(LOGSTRING,'(A,I6,I6)')'nothing found: ', + IWORKER,IALIGN ENDIF CALL LOG_FILE(KLOG,LOGSTRING,0) CALL FLUSH_UNIT(6) ENDDO NALIGN=NALIGN-1 IALIGN_GOOD=IALIGN_GOOD_ALL WRITE(6,*)' total done : ',nalign,ialign_good CALL FLUSH_UNIT(6) RETURN END C END RECEIVE_RESULTS_FROM_NODE C...................................................................... C...................................................................... C SUB SEND_ALI_REQUEST SUBROUTINE SEND_ALI_REQUEST(IWORKER,IRECORD,IMSGTAG,CHECKVAL) IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' c input INTEGER IWORKER,IRECORD,IMSGTAG REAL CHECKVAL MSGTYPE=6000 CALL MP_INIT_SEND() CALL MP_PUT_INT4(MSGTYPE,IWORKER,IRECORD,N_ONE) CALL MP_PUT_INT4(MSGTYPE,IWORKER,IMSGTAG,N_ONE) CALL MP_PUT_REAL4(MSGTYPE,IWORKER,CHECKVAL,N_ONE) CALL MP_SEND_DATA(MSGTYPE,LINK(IWORKER)) RETURN END C END SEND_ALI_REQUEST C...................................................................... C...................................................................... C SUB SEND_JOBS C get "ready" signal from node and send "nfile" jobs SUBROUTINE SEND_JOBS(LH1,LH2,NFILE,NALIGN,NENTRIES, + NAMINO_ACIDS) C import IMPLICIT NONE INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' INTEGER NFILE,NENTRIES,NAMINO_ACIDS c import REAL LH1(0:MAXMAT) INTEGER*2 LH2(0:MAXTRACE) c real lh(0:maxmat*2) C internal INTEGER IFLAG c integer inix INTEGER IFILE,JFILE,ISET,ILINK,I,NALIGN,IALIGN,IFIRST_ROUND INTEGER NRECORD,IPOINTER INTEGER IDONE(MAXPROC) c integer ipos,nsplit,isplit c logical lendbase,ldb_read_one LOGICAL LERROR,LENDFILE CHARACTER*200 FILENAME C init FILENAME=' ' JFILE=0 ILINK=1 MSGTYPE=0 DO I=1,MAXPROC IDONE(I)=0 ENDDO NALIGN=1 IALIGN=0 IFILE=1 IFIRST_ROUND=0 NRECORD=0 ISET=0 IPOINTER=1 IALIGN_GOOD=0 c ldb_read_one=.false. c$$$ if (ldb_read_one .eqv. .true.) THEN c$$$ nbuffer_len = 6 + len(name_2) + len(compnd_2) + c$$$ + len(ACCESSION_2) + len(pdbref_2) c$$$ lfirst_scan=.false. c$$$ nsplit=namino_acids / (nworker +1) c$$$ c$$$ iset=0 ; isplit=0 c$$$ c$$$ do while( ifile .le. nfile) c$$$ lendbase=.false. c$$$ call open_sw_data_file(kbase,ifile,split_db_data, c$$$ + split_db_path) c$$$c WRITE(6,*)ifile,nseq_warm_start,isplit,nsplit c$$$c call flush_unit(6) c$$$ c$$$ do while(lendbase .eqv. .false.) c$$$ call get_swiss_entry(maxsq,kbase,lbinary,n2in,name_2,compnd_2, c$$$ + ACCESSION_2,pdbref_2,csq_2,lendbase) c$$$ c$$$ if (lendbase .eqv. .false.) THEN c$$$ IF ( (ipointer + nbuffer_len + n2in) .gt. c$$$ + maxdatabase_buffer) THEN c$$$ WRITE(6,*)' **** FATAL ERROR ****' c$$$ WRITE(6,*)' database_buffer overflow increase' c$$$ WRITE(6,*)' dimension of MAXDATABASE_BUFFER' c$$$ STOP c$$$ endif c$$$ WRITE(cbuffer_line(1:),'(i6,a,a,a,a)')n2in,name_2, c$$$ + compnd_2,ACCESSION_2,pdbref_2 c$$$ do ipos=1,nbuffer_len c$$$ cdatabase_buffer(ipointer)= c$$$ + cbuffer_line(ipos:ipos) c$$$ ipointer=ipointer+1 c$$$ enddo c$$$ do ipos=1,n2in c$$$ cdatabase_buffer(ipointer)=csq_2(ipos:ipos) c$$$ ipointer=ipointer+1 c$$$ enddo c$$$ isplit=isplit+n2in ; nseq_warm_start=nseq_warm_start+1 c$$$ if ( (isplit .ge. nsplit) .and. c$$$ + (iset .le. nworker) ) then c$$$ iset=iset+1 ; ipointer=ipointer-1 c$$$ WRITE(6,'(a,i6,i8,i10,i8)') c$$$ + 'internal buffer: ',iset,nseq_warm_start, c$$$ + ipointer,isplit c$$$ call flush_unit(6) c$$$ c$$$ msgtype=8000 ; ilink=iset c$$$ call mp_init_send() c$$$ call mp_put_int4(msgtype,ilink,ipointer,n_one) c$$$ call mp_put_int4(msgtype,ilink,nseq_warm_start, c$$$ + n_one) c$$$ call mp_send_data(msgtype,link(ilink)) c$$$ msgtype=9000 c$$$ call mp_init_send() c$$$ call mp_put_string_array(msgtype,ilink, c$$$ + cdatabase_buffer,ipointer) c$$$ call mp_send_data(msgtype,link(ilink)) c$$$ ipointer=1 ; nseq_warm_start=0 ; isplit=0 c$$$ endif c$$$ else c$$$ close(kbase) ; ifile=ifile+1 c$$$ endif c$$$ enddo c$$$ enddo c$$$ msgtype=10000 c$$$ call mp_init_send() c$$$ call mp_put_int4(msgtype,ilink,ipointer,n_one) c$$$ call mp_cast(nworker,msgtype,link(1)) c$$$ endif CALL GET_CPU_TIME('time init:',IDPROC, + ITIME_OLD,ITIME_NEW,TOTAL_TIME,LOGSTRING) CALL LOG_FILE(KLOG,LOGSTRING,2) IPOINTER=1 IF (LISTOFSEQ_2 .EQV. .FALSE.) THEN IF (LFIRST_SCAN .EQV. .TRUE.) THEN DO WHILE (IFILE .LE. NFILE ) MSGTYPE=2000 ILINK=-1 c call mp_receive_data(msgtype,ilink) c call mp_get_int4(msgtype,ilink,ilink,n_one) C first test for messages CALL MP_PROBE(MSGTYPE,IFLAG) C if no communication is necessary do some "real" work IF ( IFLAG.EQ.0 .AND. IFILE .GE. NWORKSET*MAXQUEUE) THEN WRITE(LOGSTRING,*)' file to host: ',ifile CALL LOG_FILE(KLOG,LOGSTRING,1) CALL HOST_INTERFACE(LH1,LH2,IFILE,FILENAME,IALIGN, + NRECORD,IPOINTER) IFILE=IFILE+1 IFIRST_ROUND=1 c we have to fill the work-queue ELSE MSGTYPE=2000 CALL MP_RECEIVE_DATA(MSGTYPE,ILINK) CALL MP_GET_INT4(MSGTYPE,ILINK,ILINK,N_ONE) CALL MP_INIT_SEND() C when we communicate the fist time, we fill the queue IF (IFIRST_ROUND .EQ. 0) THEN ISET=ISET+1 JFILE=ISET DO I=1,MAXQUEUE c WRITE(6,'(a,i4,a,i4)')' file: ',jfile,' to: ',ilink c call flush_unit(6) MSGTYPE=3000 CALL MP_PUT_INT4(MSGTYPE,ILINK,JFILE,N_ONE) JFILE=JFILE+NWORKSET IFILE=IFILE+1 ENDDO CALL MP_SEND_DATA(MSGTYPE,LINK(ILINK)) C send one file-pointer to refill the work-queue ELSE c WRITE(6,'(a,i4,a,i4)')' file: ',ifile,' to: ',ilink c call flush_unit(6) MSGTYPE=3000 CALL MP_PUT_INT4(MSGTYPE,ILINK,IFILE,N_ONE) IFILE=IFILE+1 CALL MP_SEND_DATA(MSGTYPE,LINK(ILINK)) ENDIF ENDIF ENDDO LFIRST_SCAN=.FALSE. C now tell everybody that the work is done ISET=0 DO WHILE (ISET .LT. NWORKSET) MSGTYPE=2000 ILINK=-1 CALL MP_RECEIVE_DATA(MSGTYPE,ILINK) CALL MP_GET_INT4(MSGTYPE,ILINK,ILINK,N_ONE) IF (IDONE(ILINK) .EQ. 0) THEN c WRITE(6,'(a,i4)')' last from: ',ilink ; call flush_unit(6) ISET=ISET+1 IDONE(ILINK)=1 MSGTYPE=3000 IFILE=-1 CALL MP_INIT_SEND() CALL MP_PUT_INT4(MSGTYPE,ILINK,IFILE,N_ONE) CALL MP_SEND_DATA(MSGTYPE,LINK(ILINK)) c else c WRITE(6,'(a,i4)')' collect dead message: ',ilink c call flush_unit(6) ENDIF ENDDO WRITE(LOGSTRING,'(a,i6,i8,i10)')'internal buffer: ', + IDPROC,NSEQ_WARM_START,IPOINTER CALL LOG_FILE(KLOG,LOGSTRING,1) ELSE CALL HOST_INTERFACE(LH1,LH2,IFILE,FILENAME,IALIGN, + NRECORD,IPOINTER) ENDIF ELSE C =================================================================== C list of filenames C =================================================================== IFILE=0 WRITE(6,*)' load work queue: ',ilink LENDFILE=.FALSE. LERROR=.FALSE. DO ILINK=1,NWORKSET DO I=1,MAXQUEUE_LIST IF ( (LENDFILE .EQV. .FALSE.) .AND. + (LERROR .EQV. .FALSE. ) ) THEN CALL READ_FILENAME(KLIS2,FILENAME,LENDFILE, + LERROR) ENDIF IF ( (LENDFILE .EQV. .TRUE.) .OR. + (LERROR .EQV. .TRUE. ) ) THEN FILENAME='STOP' ENDIF WRITE(6,'(A,A,A,I4)')'file: ',filename(1:50), + ' to: ',ILINK CALL FLUSH_UNIT(6) MSGTYPE=9000 CALL MP_INIT_SEND() CALL MP_PUT_STRING(MSGTYPE,ILINK,FILENAME, + LEN(FILENAME)) CALL MP_SEND_DATA(MSGTYPE,LINK(ILINK)) IF ( (LENDFILE .EQV. .TRUE.) .OR. + (LERROR .EQV. .TRUE. ) ) THEN GOTO 500 ENDIF IFILE=IFILE+1 ENDDO ENDDO DO WHILE (.TRUE. ) MSGTYPE=2000 ILINK=-1 C first test for messages CALL MP_PROBE(MSGTYPE,IFLAG) C if no communication is necessary do some "real" work c IF ( iflag .eq. 0 ) THEN c ifirst_round=1 c IF ( iflag .eq. 0 .and. c + ifile .ge. (nworkset * maxqueue_list) ) THEN c ifirst_round=1 c call read_filename(klis2,filename,lendfile,lerror) c IF (lendfile .eqv. .true. .or. lerror .eqv. .true.) THEN c filename='STOP' c goto 500 c endif c WRITE(logstring,*)' host is working on file: ',ifile c call log_file(klog,logstring,1) c call host_interface(lh1,lh2,ifile,filename,ialign, c + nrecord,ipointer) c ifile=ifile+1 c we have to fill the work-queue c else MSGTYPE=2000 CALL MP_RECEIVE_DATA(MSGTYPE,ILINK) CALL MP_GET_INT4(MSGTYPE,ILINK,ILINK,N_ONE) CALL MP_INIT_SEND() C send one file-pointer to refill the work-queue c if (ifirst_round .ne. 0) then CALL READ_FILENAME(KLIS2,FILENAME,LENDFILE,LERROR) IF ( (LENDFILE .EQV. .TRUE.) .OR. + ( LERROR .EQV. .TRUE.) ) THEN FILENAME='STOP' GOTO 500 ENDIF WRITE(6,'(A,I4)')FILENAME(1:50),ILINK CALL FLUSH_UNIT(6) MSGTYPE=3000 CALL MP_PUT_STRING(MSGTYPE,ILINK,FILENAME, + LEN(FILENAME)) CALL MP_SEND_DATA(MSGTYPE,LINK(ILINK)) IFILE=IFILE+1 C when we communicate the fist time, we fill the queue c else c endif c endif ENDDO c lfirst_scan=.false. C now tell everybody that the work is done 500 ISET=0 DO WHILE (ISET .LT. NWORKSET) MSGTYPE=2000 ILINK=-1 CALL MP_RECEIVE_DATA(MSGTYPE,ILINK) CALL MP_GET_INT4(MSGTYPE,ILINK,ILINK,N_ONE) IF (IDONE(ILINK) .EQ. 0) THEN WRITE(6,'(a,i4)')' last from: ',ilink CALL FLUSH_UNIT(6) ISET=ISET+1 IDONE(ILINK)=1 MSGTYPE=3000 FILENAME='STOP' CALL MP_INIT_SEND() CALL MP_PUT_STRING(MSGTYPE,ILINK,FILENAME, + LEN(FILENAME)) CALL MP_SEND_DATA(MSGTYPE,LINK(ILINK)) c else c WRITE(6,'(a,i4)')' collect dead message: ',ilink c call flush_unit(6) ENDIF ENDDO ENDIF NALIGN=NALIGN+IALIGN WRITE(LOGSTRING,*)' host processed: ',nseq_warm_start, + IALIGN_GOOD CALL LOG_FILE(KLOG,LOGSTRING,1) RETURN END C END SEND_JOBS C...................................................................... C...................................................................... C SUB SETCONSERVATION SUBROUTINE SETCONSERVATION(METRIC_FILENAME) C 1. set conservation weights to 1.0 C 2. rescale matrix for the 22 amino residues such that the sum over C the matrix is 0.0 (or near) C this matrix is used to calculate the conservation weights (SIMCONSERV) c implicit none INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' C import CHARACTER*(*) METRIC_FILENAME C internal INTEGER NACID PARAMETER (NACID= 22) INTEGER I,J REAL XLOW,XHIGH,XMAX,XMIN,XFACTOR,SUMMAT *----------------------------------------------------------------------* C DO I=1,MAXSQ CONSWEIGHT_1(I)=1.0 ENDDO LFIRSTWEIGHT=.TRUE. C get metric CALL GETSIMMETRIC(NTRANS,TRANS,MAXSTRSTATES,MAXIOSTATES, + NSTRSTATES_1,NIOSTATES_1,NSTRSTATES_2, + NIOSTATES_2,CSTRSTATES,CIOSTATES, + IORANGE,KSIM,METRIC_FILENAME,SIMORG) c rescale matrix that the sum over matrix is +- 0.0 XLOW=0.0 XHIGH=0.0 XMAX=1.0 XMIN=-1.0 XFACTOR=100.0 C (re)store original values in simconserv() 20 DO J=1,NTRANS DO I=1,NTRANS SIMCONSERV(I,J)=SIMORG(I,J,1,1,1,1) ENDDO ENDDO c scale with xmin/xmax CALL SCALEINTERVAL(SIMCONSERV,NTRANS**2,XMIN,XMAX,XLOW,XHIGH) C RESEt the values for 'X' '!' and '-' I=INDEX(TRANS,'X') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO I=INDEX(TRANS,'!') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO I=INDEX(TRANS,'-') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO I=INDEX(TRANS,'.') DO J=1,NTRANS SIMCONSERV(I,J)=0.0 SIMCONSERV(J,I)=0.0 ENDDO C calculate sum over matrix (22 amino acids) after scaling SUMMAT=0.0 DO I=1,NACID DO J=1,NACID SUMMAT=SUMMAT+SIMCONSERV(I,J) ENDDO ENDDO cd WRITE(6,*)' sum: ',summat,xmin c check sum=0.0 (+- 0.01) ; if not xmin=xmin/2 ; scale again IF (SUMMAT .GT. 0.01) THEN XMIN=XMIN+(XMIN/XFACTOR) ELSE IF (SUMMAT .LT. -0.01) THEN XMIN=XMIN-(XMIN/XFACTOR) ELSE WRITE(6,*)' SETCONSERVATION: sum over matrix: ',summat WRITE(6,*)' smin is : ',xmin c kdeb=45 c call open_file(kdeb,'DEBUG.X','NEW',lerror) c do i=1,ntrans c WRITE(kdeb,'(a,26(f5.2))')trans(i:i), c + (simconserv(i,j),j=1,ntrans) c enddo c WRITE(kdeb,*)'sum over matrix: ',summat c WRITE(kdeb,*)'min,max: ',xmin,xmax c close(kdeb) RETURN ENDIF GOTO 20 END C END SETCONSERVATION C...................................................................... C...................................................................... C SUB SINGLE_SEQ_WEIGHTS SUBROUTINE SINGLE_SEQ_WEIGHTS(NALIGN,SEQBUFFER, + ISEQPOINTER,AL_IFIRST,AL_ILAST,MODE,WEIGHTS) c c input: hssp alignments c w0 -- eigenvalue iteration weights x(i) c w1 -- squared eigenvectors x(i)**2 c w2 -- sum of distances w(i)=SUM(dist(i,j)) c w3 -- exponential weight w(i)=1/SUM(exp(-dist(i,j)/dmean)) c IMPLICIT NONE C import INTEGER NALIGN INTEGER AL_IFIRST(*),AL_ILAST(*),ISEQPOINTER(*) CHARACTER SEQBUFFER(*) CHARACTER MODE*(*) C export REAL WEIGHTS(*) c INTEGER MAXALIGNS_LOC,MAXSTEP PARAMETER (MAXALIGNS_LOC= 8765) C PARAMETER (MAXALIGNS_LOC= 12345) C PARAMETER (MAXALIGNS_LOC= 21987) PARAMETER (MAXSTEP= 100) REAL TOLERANCE PARAMETER (TOLERANCE= 0.00001) REAL DIST(MAXALIGNS_LOC,MAXALIGNS_LOC) c real sim_table(maxaligns,maxaligns) c integer maxaa c real sel_press,xpower,xtemp1,xtemp2 REAL WTEMP(MAXALIGNS_LOC) c real vtemp(maxaligns,maxaligns) c INTEGER STEP,LENGTH,NPOS,I,J,K,K0,K1,KPOS CHARACTER A1,A2 REAL X,S,DMEAN *----------------------------------------------------------------------* c maxaa=19 I=LEN(MODE) CALL LOWTOUP(MODE,I) IF (NALIGN .GT. MAXALIGNS_LOC) THEN WRITE(6,*)' maxaligns overflow in single_seq_weight NALIGN=', + NALIGN STOP ENDIF DO I=1,NALIGN WEIGHTS(I)=1.0 ENDDO IF (NALIGN .LE. 1) THEN WRITE(6,*)' SINGLE_SEQ_WEIGHT: no alignments !' RETURN ENDIF C calculate distance/identity table WRITE(6,*)' calculate distance table...' DO I=1,NALIGN DIST(I,I)=0.0 c sim_table(i,i)=1.0 DO J=I+1,NALIGN LENGTH=0 NPOS=0 K0=MAX(AL_IFIRST(I),AL_IFIRST(J)) K1=MIN(AL_ILAST(I),AL_ILAST(J)) KPOS=ISEQPOINTER(I) + K0 - AL_IFIRST(I) DO K=K0,K1 NPOS=NPOS+1 A1= SEQBUFFER(KPOS) A2= SEQBUFFER(KPOS) KPOS=KPOS+1 IF (A1.EQ.A2) LENGTH=LENGTH+1 IF (A1 .GE. 'a' .OR. A2 .GE. 'a') THEN IF (A1 .GE. 'a' ) THEN A1=CHAR( ICHAR(A1)-32 ) ENDIF IF (A2 .GE. 'a' ) THEN A2=CHAR( ICHAR(A2)-32 ) ENDIF IF (A1.EQ.A2) LENGTH=LENGTH+1 ENDIF c IF (a1 .ge. 'a' .and. a1 .le. 'z') THEN c a1=char( ichar(a1)-32 ) c endif c IF (a2 .ge. 'a' .and. a2 .le. 'z') THEN c a2=char( ichar(a2)-32 ) c endif END DO DIST(I,J)= 1- (FLOAT(LENGTH)/MAX(1.0,FLOAT(NPOS)) ) c sim_table(i,j)=float(length)/max(1.0,float(npos)) c dist(i,j)= 1.00 - sim_table(i,j) DIST(J,I)=DIST(I,J) c sim_table(j,i)=sim_table(i,j) END DO END DO c WRITE(6,*) ' distances: ' c do i=1,nalign c WRITE(6,'(26i3)') (nint(100*dist(j,i)),j=1,nalign) c end do c IF (INDEX(MODE,'MAT'). NE. 0 ) THEN WRITE(6,*)' weight mode MAT NOT active ' STOP c WRITE(6,*)' preparing identity matrix...' c sel_press=0.5 c xpower= 1.0 / (1.0 - sel_press + (1.0/maxaa) ) c xtemp1= 1.0 + (1.0 / (maxaa * (1.0 - sel_press) ) ) c xtemp2= 1.0 / (maxaa * (1-sel_press) ) c do i=1,nalign ; do j=1,nalign c sim_table(i,j) = ( sim_table(i,j) * xtemp1 - xtemp2 ) c IF (sim_table(i,j) .le. tolerance) THEN c WRITE(6,*)'set sim_table to tolerance ',i,j c sim_table(i,j) = tolerance c endif c sim_table(i,j) = sim_table(i,j) **xpower c enddo ; enddo c WRITE(6,*)' calculate singular value decomposition...' c call svdcmp(sim_table,nalign,nalign,maxaligns,maxaligns,wtemp, c + vtemp) c WRITE(6,*)' calculate matrix invers...' c do i=1,nalign c if (wtemp(i) .le. 0.0001) THEN c x=0.0 c else c x= 1/wtemp(i) c endif c do j=1,nalign c sim_table(i,j) = vtemp(i,j) * x * sim_table(i,j) c weights(i) = weights(i) + sim_table(i,j) c enddo c enddo c======================================================================= c calculate one-sequence weights from a distance matrix c step 0: w(k) = 1 / N * sum[dist(k,length)] c step i: w(k)(i) = 1 / NORM * sum[dist(k,l) * w(length)(i-1)] c iterate until sum[|w(k)(i)-w(k)(i-1)|] < tolerance c======================================================================= c eigenvector iteration c======================================================================= ELSE IF (INDEX(MODE,'EIGEN') .NE. 0 .OR. + INDEX(MODE,'SQUARE') .NE. 0) THEN DO I=1,NALIGN WTEMP(I)=1.0/NALIGN END DO STEP=0 10 STEP=STEP+1 X=0.0 DO I=1,NALIGN WEIGHTS(I)=0.0 DO J=1,NALIGN WEIGHTS(I) = WEIGHTS(I) + WTEMP(J) * DIST(I,J) END DO X=X+WEIGHTS(I) END DO S=0.0 DO I=1,NALIGN S = S +(WTEMP(I)-WEIGHTS(I)/X) * (WTEMP(I)-WEIGHTS(I)/X) WTEMP(I)=WEIGHTS(I)/X END DO S=SQRT(S/NALIGN) IF ((STEP .LT. MAXSTEP) .AND. (S .GT. TOLERANCE)) GOTO 10 WRITE(6,'(A,I5,A,F10.4)')' WEIGHTS AT STEP:', STEP, + ' DIFFERENCE: ',S WRITE(6,'(13F6.3)') (NALIGN*WTEMP(I),I=1,NALIGN) ENDIF c======================================================================= c weights(i)=wtemp(i)**2 c======================================================================= IF (INDEX(MODE,'SQUARE') .NE. 0) THEN S=0.0 DO I=1,NALIGN WEIGHTS(I)=WTEMP(I) * WTEMP(I) S=S+WEIGHTS(I) END DO DO I=1,NALIGN WEIGHTS(I)=WEIGHTS(I)/S END DO WRITE(6,*) ' squared weights ' WRITE(6,'(13F6.3)') (NALIGN*WEIGHTS(I),I=1,NALIGN) c======================================================================= c weights(i)=SUM(dist(i,j)) c======================================================================= ELSE IF (INDEX(MODE,'SUM') .NE. 0) THEN S=0.0 DO I=1,NALIGN WEIGHTS(I)=0.0 DO J=1,NALIGN WEIGHTS(I)=WEIGHTS(I) + DIST(I,J) END DO S=S+WEIGHTS(I) END DO DO I=1,NALIGN WEIGHTS(I)=WEIGHTS(I)/S END DO WRITE(6,*) ' summed distance weights ' WRITE(6,'(13F6.3)') (NALIGN*WEIGHTS(I),I=1,NALIGN) c======================================================================= c weights(i)=1/SUM(exp(-dist(i,j)/dmean)) c======================================================================= ELSE IF (INDEX(MODE,'EXP') .NE. 0) THEN S=0.0 DO I=1,NALIGN DO J=I+1,NALIGN S=S+DIST(I,J) END DO END DO DMEAN=S/NALIGN/(NALIGN-1)*2 DO I=1,NALIGN S=0.0 DO J=1,NALIGN S=S+EXP(-DIST(I,J)/DMEAN) END DO IF (S.GT.0.0) THEN WEIGHTS(I)=1/S ELSE WRITE(6,*) ' warning: s=0 in weights ' WEIGHTS(I)=1.0 END IF END DO c normalize to 1.0 S=0.0 DO I=1,NALIGN S=S+WEIGHTS(I) END DO DO I=1,NALIGN WEIGHTS(I)=WEIGHTS(I)/S END DO WRITE(6,*) ' exponential distance weights ' WRITE(6,'(13F6.3)') (NALIGN*WEIGHTS(I),I=1,NALIGN) ENDIF RETURN END C end single_seq_weight C...................................................................... C...................................................................... C SUB SVDCMP SUBROUTINE SVDCMP(A,M,N,MP,NP,W,V) INTEGER MAXALIGNS_LOC PARAMETER (MAXSTEP= 100) PARAMETER (MAXALIGNS_LOC= 8765) C PARAMETER (MAXALIGNS_LOC= 12345) C PARAMETER (MAXALIGNS_LOC= 21987) DIMENSION A(MP,NP),W(NP),V(NP,NP),RV1(MAXALIGNS_LOC) *----------------------------------------------------------------------* L=0 nm=0 G=0.0 SCALE=0.0 ANORM=0.0 IF (m .gt. nmax) THEN WRITE(6,*)'***ERROR: dim. overflow for RV1 in SVDCMP' STOP endif DO 25 I=1,N L=I+1 RV1(I)=SCALE*G G=0.0 S=0.0 SCALE=0.0 IF (I.LE.M) THEN DO 11 K=I,M SCALE=SCALE+ABS(A(K,I)) 11 CONTINUE IF (SCALE.NE.0.0) THEN DO 12 K=I,M A(K,I)=A(K,I)/SCALE S=S+A(K,I)*A(K,I) 12 CONTINUE F=A(I,I) G=-SIGN(SQRT(S),F) H=F*G-S A(I,I)=F-G IF (I.NE.N) THEN DO 15 J=L,N S=0.0 DO 13 K=I,M S=S+A(K,I)*A(K,J) 13 CONTINUE F=S/H DO 14 K=I,M A(K,J)=A(K,J)+F*A(K,I) 14 CONTINUE 15 CONTINUE ENDIF DO 16 K= I,M A(K,I)=SCALE*A(K,I) 16 CONTINUE ENDIF ENDIF W(I)=SCALE *G G=0.0 S=0.0 SCALE=0.0 IF ((I.LE.M).AND.(I.NE.N)) THEN DO 17 K=L,N SCALE=SCALE+ABS(A(I,K)) 17 CONTINUE IF (SCALE.NE.0.0) THEN DO 18 K=L,N A(I,K)=A(I,K)/SCALE S=S+A(I,K)*A(I,K) 18 CONTINUE F=A(I,L) G=-SIGN(SQRT(S),F) H=F*G-S A(I,L)=F-G DO 19 K=L,N RV1(K)=A(I,K)/H 19 CONTINUE IF (I.NE.M) THEN DO 23 J=L,M S=0.0 DO 21 K=L,N S=S+A(J,K)*A(I,K) 21 CONTINUE DO 22 K=L,N A(J,K)=A(J,K)+S*RV1(K) 22 CONTINUE 23 CONTINUE ENDIF DO 24 K=L,N A(I,K)=SCALE*A(I,K) 24 CONTINUE ENDIF ENDIF ANORM=MAX(ANORM,(ABS(W(I))+ABS(RV1(I)))) 25 CONTINUE DO 32 I=N,1,-1 IF (I.LT.N) THEN IF (G.NE.0.0) THEN DO 26 J=L,N V(J,I)=(A(I,J)/A(I,L))/G 26 CONTINUE DO 29 J=L,N S=0.0 DO 27 K=L,N S=S+A(I,K)*V(K,J) 27 CONTINUE DO 28 K=L,N V(K,J)=V(K,J)+S*V(K,I) 28 CONTINUE 29 CONTINUE ENDIF DO 31 J=L,N V(I,J)=0.0 V(J,I)=0.0 31 CONTINUE ENDIF V(I,I)=1.0 G=RV1(I) L=I 32 CONTINUE DO 39 I=N,1,-1 L=I+1 G=W(I) IF (I.LT.N) THEN DO 33 J=L,N A(I,J)=0.0 33 CONTINUE ENDIF IF (G.NE.0.0) THEN G=1.0/G IF (I.NE.N) THEN DO 36 J=L,N S=0.0 DO 34 K=L,M S=S+A(K,I)*A(K,J) 34 CONTINUE F=(S/A(I,I))*G DO 35 K=I,M A(K,J)=A(K,J)+F*A(K,I) 35 CONTINUE 36 CONTINUE ENDIF DO 37 J=I,M A(J,I)=A(J,I)*G 37 CONTINUE ELSE DO 38 J= I,M A(J,I)=0.0 38 CONTINUE ENDIF A(I,I)=A(I,I)+1.0 39 CONTINUE DO 49 K=N,1,-1 DO 48 ITS=1,30 DO 41 L=K,1,-1 NM=L-1 IF ((ABS(RV1(L))+ANORM).EQ.ANORM) GOTO 2 IF ((ABS(W(NM))+ANORM).EQ.ANORM) GOTO 1 41 CONTINUE 1 C=0.0 S=1.0 DO 43 I=L,K F=S*RV1(I) IF ((ABS(F)+ANORM).NE.ANORM) THEN G=W(I) H=SQRT(F*F+G*G) W(I)=H H=1.0/H C= (G*H) S=-(F*H) DO 42 J=1,M Y=A(J,NM) Z=A(J,I) A(J,NM)=(Y*C)+(Z*S) A(J,I)=-(Y*S)+(Z*C) 42 CONTINUE ENDIF 43 CONTINUE 2 Z=W(K) IF (L.EQ.K) THEN IF (Z.LT.0.0) THEN W(K)=-Z DO 44 J=1,N V(J,K)=-V(J,K) 44 CONTINUE ENDIF GOTO 3 ENDIF C IF (ITS.EQ.30) PAUSE 'No convergence in 30 iterations' IF (ITS.EQ.30) THEN WRITE(6,*) 'No convergence in 30 iterations' ENDIF X=W(L) NM=K-1 Y=W(NM) G=RV1(NM) H=RV1(K) F=((Y-Z)*(Y+Z)+(G-H)*(G+H))/(2.0*H*Y) G=SQRT(F*F+1.0) F=((X-Z)*(X+Z)+H*((Y/(F+SIGN(G,F)))-H))/X C=1.0 S=1.0 DO 47 J=L,NM I=J+1 G=RV1(I) Y=W(I) H=S*G G=C*G Z=SQRT(F*F+H*H) RV1(J)=Z C=F/Z S=H/Z F= (X*C)+(G*S) G=-(X*S)+(G*C) H=Y*S Y=Y*C DO 45 NM=1,N X=V(NM,J) Z=V(NM,I) V(NM,J)= (X*C)+(Z*S) V(NM,I)=-(X*S)+(Z*C) 45 CONTINUE Z=SQRT(F*F+H*H) W(J)=Z IF (Z.NE.0.0) THEN Z=1.0/Z C=F*Z S=H*Z ENDIF F= (C*G)+(S*Y) X=-(S*G)+(C*Y) DO 46 NM=1,M Y=A(NM,J) Z=A(NM,I) A(NM,J)= (Y*C)+(Z*S) A(NM,I)=-(Y*S)+(Z*C) 46 CONTINUE 47 CONTINUE RV1(L)=0.0 RV1(K)=F W(K)=X 48 CONTINUE 3 CONTINUE 49 CONTINUE RETURN END C END SVDCMP C...................................................................... C...................................................................... C SUB WRITE_HISTO SUBROUTINE WRITE_HISTO(KHISTO,HISTOFILE,NALIGN,SORTVAL) c implicit none C import INTEGER KHISTO,NALIGN REAL SORTVAL(*) CHARACTER*(*) HISTOFILE C internal c integer maxbin,maxlen c integer i,ibin,nbin(maxbin),maxpop,minpop,iend c character line*(maxlen),mark LOGICAL LERROR c mark='*' c do i=1,maxlen ; line(i:i)=mark ; enddo c do i=1,maxbin ; nbin(i)=0 ; enddo CALL OPEN_FILE(KHISTO,HISTOFILE,'NEW',LERROR) c do i=1,nalign c ibin=nint( sortval(i) / sortval(nalign) * maxbin) c IF (ibin .le. 0) THEN c ibin=1 c else IF (ibin .gt. maxbin) THEN c ibin=maxbin c endif c nbin(ibin)=nbin(ibin)+1 c enddo c c maxpop=-1 ; minpop=1000000 c do i=1,maxbin c IF (nbin(i) .gt. maxpop)maxpop=nbin(i) c IF (nbin(i) .lt. minpop)minpop=nbin(i) c enddo WRITE(KHISTO,'(A,I6)') ' number of scores: ',nalign WRITE(KHISTO,'(A,F7.2)')' minimal scores: ',sortval(1) WRITE(KHISTO,'(A,F7.2)')' maximum scores: ',sortval(nalign) WRITE(KHISTO,'(A)')'_________________________________________'// + '_____________________________________________' c do i=1,maxbin c iend=nint( (float(nbin(i)) / float(maxpop)) * float(maxlen) ) c IF (iend .gt.0 ) THEN c WRITE(khisto,'(i5,a,a)')nbin(i),' |',line(1:iend) c else c WRITE(khisto,'(i5,a)')nbin(i),' |' c endif c enddo c WRITE(khisto,*) WRITE(KHISTO,'(A)')' values: ' DO I=1,NALIGN WRITE(KHISTO,'(I5,2X,F7.2)')I,SORTVAL(I) ENDDO CLOSE(KHISTO) RETURN END C END WRITE_HISTO C...................................................................... C...................................................................... C SUB WRITE_MAXHOM_COM SUBROUTINE WRITE_MAXHOM_COM(CFILTER) c implicit none INCLUDE 'maxhom.param' INCLUDE 'maxhom.common' CHARACTER*(*) CFILTER c internal INTEGER OPTCUT,LENFILENAME,IBEG,IEND,I,J,JBEG,JEND CHARACTER COMMANDFILE*200,COMEXT*4,OUTLINE*200,LINE*35 CHARACTER COMMENTLINE*200 LOGICAL LERROR C init OUTLINE=' ' OPTCUT=110 COMMENTLINE='$!==========================================='// + '============================' COMEXT='.csh' LENFILENAME=INDEX(NAME1_ANSWER,'!')-1 IF (LENFILENAME .LE. 0) LENFILENAME=LEN(NAME1_ANSWER) CALL GETPIDCODE(NAME1_ANSWER(1:LENFILENAME),HSSPID_1) CALL LOWTOUP(COMMANDFILE_ANSWER,200) CALL STRPOS(HSSPID_1,IBEG,IEND) IF (CFILTER .EQ. ' ') THEN COMMANDFILE(1:)=HSSPID_1(IBEG:IEND)//'_maxhom'//comext ELSE COMMANDFILE(1:)=HSSPID_1(IBEG:IEND)//'_hssp'//comext ENDIF CALL OPEN_FILE(KCOM,COMMANDFILE,'NEW',LERROR) C======================================================================= C UNIX c-shell script C======================================================================= COMMENTLINE='#==========================================='// + '============================' WRITE(KCOM,'(A)')'#! /bin/csh' WRITE(KCOM,'(A)')COMMENTLINE IF (CFILTER .EQ. ' ') THEN WRITE(KCOM,'(A)')'# command file to run MAXHOM' ELSE WRITE(KCOM,'(A)')'# command file to run a PRE-FILTERED MAXHOM' ENDIF WRITE(KCOM,'(A)')'goto set_enviroment' WRITE(KCOM,'(A)')'start:' WRITE(KCOM,'(A)')COMMENTLINE WRITE(KCOM,'(A)')'# This .csh file WRITEs a temporary '// + 'command file ("MAXHOM_"process_id".temp")' WRITE(KCOM,'(A)')'# containing the answers to the MAXHOM '// + 'questions.' WRITE(KCOM,'(A)')COMMENTLINE C=================================================================== IF (CFILTER .NE.' ') THEN C get sequence 1 CALL STRPOS(NAME1_ANSWER,I,J) IF (LISTOFSEQ_1) THEN WRITE(KCOM,'(A)')'# LOOP OVER FILENAMES IN LIST' WRITE(KCOM,'(A)') + 'foreach filename ( "`cat '//NAME1_ANSWER(I:J)//'`" )' ELSE WRITE(KCOM,'(A)')'set filename = '//NAME1_ANSWER(I:J) ENDIF C get identifier WRITE(KCOM,'(A)')COMMENTLINE WRITE(KCOM,'(A)')'# GET IDENTIFIER' WRITE(KCOM,'(A)')COMMENTLINE WRITE(KCOM,'(A)')' set name1 = $filename:r' WRITE(KCOM,'(A)')' set name2 = $name1:t' C convertseq WRITE(KCOM,'(A)')COMMENTLINE WRITE(KCOM,'(A)')'# CONVERT ALL FORMATS TO FASTA' WRITE(KCOM,'(A)')' echo $filename > MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "F" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "N" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo $name2".y" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "N" >> MAXHOM_$$.temp' IF (CONVERTSEQ_EXE .NE. ' ') THEN CALL STRPOS(CONVERTSEQ_EXE,IBEG,IEND) WRITE(KCOM,'(A)')' echo "run convert_seq"' WRITE(KCOM,'(A,A,A)') + ' ',CONVERTSEQ_EXE(IBEG:IEND), + ' < MAXHOM_$$.temp >& /dev/null' ELSE STOP ' ERROR: CONVERTSEQ_EXE UNDEFINED ' ENDIF WRITE(KCOM,'(A)')' rm MAXHOM_$$.temp' C run FASTA IF (CFILTER .EQ. 'FASTA') THEN WRITE(KCOM,'(A)')' echo $name2".y" > MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "S" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "1" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "fasta.x_"$$ >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "2000" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo " " >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "yes" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo " " >> MAXHOM_$$.temp' IF (FASTA_EXE .NE. ' ') THEN CALL STRPOS(FASTA_EXE,IBEG,IEND) WRITE(KCOM,'(A,A,A)')' ',FASTA_EXE(IBEG:IEND), + ' -b 2000 -d 2000 -o < MAXHOM_$$.temp > fasta.x_$$' ELSE STOP ' ERROR: FASTA_EXE UNDEFINED ' ENDIF WRITE(KCOM,'(A)')' rm MAXHOM_$$.temp' WRITE(KCOM,'(A)')' rm $name2".y"' C get filter.list WRITE(KCOM,'(A)')COMMENTLINE WRITE(KCOM,'(A)')'# EXTRACT POSSIBL HITS FROM FASTA-OUTPUT' WRITE(KCOM,'(A)')' echo "fasta.x_"$$ > MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "filter.list_"$$ >> MAXHOM_$$.temp' I=ISAFE-5 IF (I.GT.0) THEN WRITE(OUTLINE,'(A,I2)')'FORMULA+',I ELSE IF (I.EQ.0) THEN WRITE(OUTLINE,'(A)')'FORMULA' ELSE WRITE(OUTLINE,'(A,I2)')'FORMULA-',ABS(I) ENDIF CALL STRPOS(OUTLINE,I,J) WRITE(KCOM,'(A)')' echo "'//OUTLINE(I:J)// + '" >> MAXHOM_$$.temp' WRITE(KCOM,'(A,I5,A)')' echo "',OPTCUT, + '" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "distance" >> MAXHOM_$$.temp' IF (FILTER_FASTA_EXE .NE. ' ') THEN CALL STRPOS(FILTER_FASTA_EXE,IBEG,IEND) WRITE(KCOM,'(A,A,A)')' ',FILTER_FASTA_EXE(IBEG:IEND), + ' < MAXHOM_$$.temp' ELSE STOP ' ERROR: FILTER_FASTA_EXE UNDEFINED ' ENDIF WRITE(KCOM,'(A)')' rm MAXHOM_$$.temp' WRITE(KCOM,'(A)')' rm fasta.x_$$' C rename output files if wanted IF (STRIPFILE_ANSWER.NE.'NO') THEN STRIPFILE_ANSWER='$name2"_strip.x"' ENDIF IF (long_output_ANSWER.NE.'NO') THEN LONG_OUTPUT_ANSWER='$name2"_long.x"' ENDIF IF (PLOTFILE_ANSWER.NE.'NO') THEN PLOTFILE_ANSWER='$name2"_trace.x"' ENDIF C run BLASTP ELSE IF (CFILTER .EQ. 'BLASTP') THEN IF (BLASTP_EXE .NE. ' ') THEN CALL STRPOS(BLASTP_EXE,IBEG,IEND) WRITE(KCOM,'(A)')' echo "run blastp"' WRITE(KCOM,'(A,A,A)')' ',BLASTP_EXE(IBEG:IEND), + ' swiss $name2".y" B=2000 > blast.x_$$' ELSE STOP ' ERROR: BLASTP_EXE UNDEFINED ' ENDIF WRITE(KCOM,'(A)')' rm MAXHOM_$$.temp' WRITE(KCOM,'(A)')' rm $name2".y"' WRITE(kcom,'(a)')commentline WRITE(KCOM,'(A)')'# EXTRACT HITS FROM BLASTP-OUTPUT' IF (FILTER_BLASTP_EXE .NE. ' ') THEN CALL STRPOS(FILTER_BLASTP_EXE,IBEG,IEND) CALL STRPOS(sw_current,jBEG,jEND) WRITE(KCOM,'(A,A,A,A,A)')' ', + FILTER_BLASTP_EXE(IBEG:IEND),' ', + sw_current(jbeg:jend), + ' < blast.x_$$ > filter.list_$$' WRITE(kcom,'(a)')' rm blast.x_$$' ELSE STOP ' ERROR: FILTER_BLASTP_EXE UNDEFINED ' ENDIF ENDIF ENDIF C call MAXHOM LINE='" >> MAXHOM_$$.temp ' WRITE(kcom,'(a)')commentline WRITE(KCOM,'(A)')'# -------- finally call MAXHOM -------' WRITE(kcom,'(a)')commentline WRITE(KCOM,'(A)')' echo "COMMAND NO" > MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "BATCH" >> MAXHOM_$$.temp' WRITE(KCOM,'(A)')' echo "PID: "$$ >> MAXHOM_$$.temp' IF (CFILTER .NE. ' ') THEN WRITE(KCOM,'(A)')' echo "SEQ_1 "$filename >> MAXHOM_$$.temp' WRITE(KCOM,'(A)') + ' echo "SEQ_2 filter.list_"$$ >> MAXHOM_$$.temp' ELSE CALL STRPOS(NAME1_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "SEQ_1 '//NAME1_ANSWER(I:J)//LINE CALL STRPOS(NAME2_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "SEQ_2 '//NAME2_ANSWER(I:J)//LINE ENDIF CALL STRPOS(PROFILE_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "2_PROFILES '// + PROFILE_ANSWER(I:J)//LINE CALL STRPOS(METRIC_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "METRIC '//METRIC_ANSWER(I:J)//LINE CALL STRPOS(NORM_PROFILE_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "NORM_PROFILE '// + NORM_PROFILE_ANSWER(I:J)//LINE CALL STRPOS(PROFILE_EPSILON_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "MEAN_PROFILE '// + PROFILE_EPSILON_ANSWER(I:J)//LINE CALL STRPOS(PROFILE_GAMMA_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "FACTOR_GAPS '// + PROFILE_GAMMA_ANSWER(I:J)//LINE CALL STRPOS(SMIN_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "SMIN '//SMIN_ANSWER(I:J)//LINE CALL STRPOS(SMAX_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "SMAX '//SMAX_ANSWER(I:J)//LINE CALL STRPOS(OPENWEIGHT_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "GAP_OPEN '// + OPENWEIGHT_ANSWER(I:J)//LINE CALL STRPOS(ELONGWEIGHT_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "GAP_ELONG '// + ELONGWEIGHT_ANSWER(I:J)//LINE CALL STRPOS(WEIGHT1_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "WEIGHT1 '//WEIGHT1_ANSWER(I:J)//LINE CALL STRPOS(WEIGHT2_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "WEIGHT2 '//WEIGHT2_ANSWER(I:J)//LINE CALL STRPOS(WAY3_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "WAY3_ALIGN '// + WAY3_ANSWER(I:J)//LINE CALL STRPOS(INDEL_ANSWER_1,I,J) WRITE(KCOM,'(A)')' echo "INDEL_1 '// + INDEL_ANSWER_1(I:J)//LINE CALL STRPOS(INDEL_ANSWER_2,I,J) WRITE(KCOM,'(A)')' echo "INDEL_2 '// + INDEL_ANSWER_2(I:J)//LINE CALL STRPOS(BACKWARD_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "RELIABILITY '// + BACKWARD_ANSWER(I:J)//LINE CALL STRPOS(FILTER_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "FILTER_RANGE '// + FILTER_ANSWER(I:J)//LINE CALL STRPOS(NBEST_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "NBEST '//NBEST_ANSWER(I:J)//LINE CALL STRPOS(NGLOBALHITS_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "MAXALIGN '// + NGLOBALHITS_ANSWER(I:J)//LINE CALL STRPOS(THRESHOLD_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "THRESHOLD '// + THRESHOLD_ANSWER(I:J)//LINE CALL STRPOS(SORTMODE_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "SORT '//SORTMODE_ANSWER(I:J)//LINE CALL STRPOS(HSSP_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "HSSP '//HSSP_ANSWER(I:J)//LINE CALL STRPOS(SAMESEQ_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "SAME_SEQ_SHOW '// + SAMESEQ_ANSWER(I:J)//LINE CALL STRPOS(COMPARE_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "SUPERPOS '//COMPARE_ANSWER(I:J)//LINE CALL STRPOS(PDBPATH_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "PDB_PATH '//PDBPATH_ANSWER(I:J)//LINE CALL STRPOS(PROFILEOUT_ANSWER,I,J) WRITE(KCOM,'(A)')' echo "PROFILE_OUT '// + PROFILEOUT_ANSWER(I:J)//LINE CALL STRPOS(STRIPFILE_ANSWER,I,J) IF (INDEX(STRIPFILE_ANSWER,'$name2').NE.0) THEN WRITE(KCOM,'(A)')' echo "STRIP_OUT "'// + STRIPFILE_ANSWER(I:J)// + ' >> MAXHOM_$$.temp' ELSE WRITE(KCOM,'(A)')' echo "STRIP_OUT '// + STRIPFILE_ANSWER(I:J)//LINE ENDIF CALL STRPOS(long_output_ANSWER,I,J) IF (INDEX(long_output_ANSWER,'$name2').ne.0) THEN WRITE(KCOM,'(A)')' echo "LONG_OUT "'// + long_output_ANSWER(I:J)// + ' >> MAXHOM_$$.temp' ELSE WRITE(KCOM,'(A)')' echo "LONG_OUT '// + long_output_ANSWER(I:J)//LINE ENDIF CALL STRPOS(PLOTFILE_ANSWER,I,J) IF (INDEX(PLOTFILE_ANSWER,'$name2').ne.0) THEN WRITE(KCOM,'(A)')' echo "DOT_PLOT "'//PLOTFILE_ANSWER(I:J)// + ' >> MAXHOM_$$.temp' ELSE WRITE(KCOM,'(A)')' echo "DOT_PLOT '// + PLOTFILE_ANSWER(I:J)//LINE WRITE(KCOM,'(A)')' echo "RUN" >> MAXHOM_$$.temp' ENDIF WRITE(KCOM,'(A)')' maxhom -nopar < MAXHOM_$$.temp' CALT WRITE(KCOM,'(A)')' $snice maxhom < MAXHOM_$$.temp' WRITE(KCOM,'(A)')' rm MAXHOM_$$.temp' CALL STRPOS(COREFILE,IBEG,IEND) WRITE(KCOM,'(A,A,A)')' rm ',COREFILE(IBEG:IEND),'$$' WRITE(KCOM,'(A)')' rm filter.list_$$' IF (CFILTER .NE. ' ' .AND. LISTOFSEQ_1) THEN WRITE(KCOM,'(A)')'end' ENDIF WRITE(KCOM,'(A)')'exit' WRITE(KCOM,'(A)')COMMENTLINE WRITE(KCOM,'(A)')'set_enviroment:' WRITE(KCOM,'(A)')'nohup' WRITE(KCOM,'(A)')'alias rm ''rm -f''' WRITE(KCOM,'(A)')'goto start' WRITE(KCOM,'(A)')COMMENTLINE CLOSE(KCOM) C====================================================================== WRITE(6,*)'****************************************************' CALL STRPOS(COMMANDFILE,IBEG,IEND) WRITE(6,*)' wrote command file to: ',commandfile(ibeg:iend) IF (CMACHINE .EQ. 'VMS' ) THEN WRITE(6,*)'now submit this command file to a batch queue' ELSE CALL CHANGE_MODE(COMMANDFILE,'+x',i) WRITE(6,*)'to execute this file type: ' IF (I .NE. 0) THEN WRITE(6,'(A,A)')'chmod +x ',COMMANDFILE(IBEG:IEND) ENDIF WRITE(6,'(2X,A,A)')COMMANDFILE(IBEG:IEND),' > /dev/null &' ENDIF WRITE(6,*)'****************************************************' RETURN END C END WRITE_MAXHOM_COM C...................................................................... profphd-utils-1.0.10/maxhom.param0000755015075101507510000000745312012371465016217 0ustar lkajanlkajan*----------------------------------------------------------------------* * *----------------------------------------------------------------------* C INTEGER MAXPROC,MINPROC,MAXQUEUE,MAXQUEUE_LIST PARAMETER (MAXPROC= 200) PARAMETER (MINPROC= 20) PARAMETER (MAXQUEUE= 4) PARAMETER (MAXQUEUE_LIST= 5) C---- maximal length of sequence INTEGER MAXSQ,MAXALSQ C G M K C PARAMETER (MAXSQ= 10000) PARAMETER (MAXSQ= 9999) C PARAMETER (MAXSQ= 19876) PARAMETER (MAXALSQ= (MAXSQ + (MAXSQ/10) ) ) C---- score MATRIX INTEGER MAXMAT C PARAMETER (MAXMAT= 6000000) PARAMETER (MAXMAT= 1) C---- trace-back matrix INTEGER MAXTRACE,MAXALIGNS,MAXHITS, + MAXDATABASE_BUFFER,MAXBUFFER_LINE,MAXINS, + MAXSEQBUFFER,MAXSTRBUFFER,MAXINSBUFFER, + MAXIOBUFFER,MAXINSBUFFER_LOCAL C G M K C PARAMETER (MAXTRACE= 3888888) PARAMETER (MAXTRACE= 6543210) PARAMETER (MAXALIGNS= 8765) C PARAMETER (MAXALIGNS= 12345) C PARAMETER (MAXALIGNS= 21987) PARAMETER (MAXHITS= 5000) C G M K PARAMETER (MAXDATABASE_BUFFER= 3888888) PARAMETER (MAXBUFFER_LINE= 1000) PARAMETER (MAXINS= 50000) PARAMETER (MAXSEQBUFFER= 3000000) PARAMETER (MAXSTRBUFFER= 4000000) PARAMETER (MAXIOBUFFER= 4000000) PARAMETER (MAXINSBUFFER= 4321432) C---- caution on some machine the maximal strin length is 32000 PARAMETER (MAXINSBUFFER_LOCAL= 50000) C================================================================== INTEGER MAXSTRSTATES,MAXIOSTATES, + NTRANS,NSIZE,INDEXRECLEN,MAXRECORDLEN,NASCII, + MAXCUTOFFSTEPS,MAXBREAK,MAXBOX,MAXHIST,MAXPROFAA, + MAXLENSTRING C G M K PARAMETER (MAXSTRSTATES= 4) PARAMETER (MAXIOSTATES= 4) PARAMETER (NTRANS= 26) PARAMETER (NSIZE= 12) PARAMETER (INDEXRECLEN=2*NSIZE+8+8) PARAMETER (MAXRECORDLEN= 200) PARAMETER (NASCII= 256) PARAMETER (MAXCUTOFFSTEPS= 100) PARAMETER (MAXBREAK= 50) PARAMETER (MAXBOX= 10) PARAMETER (MAXHIST= 10) PARAMETER (MAXPROFAA= 20) C br: 2000-05 PARAMETER (MAXLENSTRING= 200) c INPUT/OUTPUT UNITS INTEGER KGETSEQ,KLIS1,KLIS2,KSIM,KISO,KREL,KBRK, + KBASE,KINDEX,KDEF,KCORE,KREF,KSTP, + KPLOT,KSTAT,KWARN,KSTRUC,KTAB,KCONS, + KPROF,KLOG,KHSSP,KCOM,KHISTO,KLONG,KDEB PARAMETER (KGETSEQ=10, KLIS1=11, KLIS2= 12, KSIM= 13, KISO= 14) PARAMETER (KREL= 15, KBRK= 16, KBASE= 17, KINDEX=18, KDEF= 19) PARAMETER (KREF= 20, KSTP= 21, KPLOT= 22, KSTAT= 23, KWARN=24) PARAMETER (KSTRUC= 25, KTAB= 26, KCONS= 27, KPROF= 28, KLOG= 29) PARAMETER (KHSSP= 30, KCOM= 31, KHISTO=32, KDEB= 33, KLONG=34) PARAMETER (KCORE= 35) C---- br 2003-08: switch for new (1999) HSSP curve LOGICAL LNEWCURVE PARAMETER (LNEWCURVE= .TRUE.) C PARAMETER (LNEWCURVE= .FALSE.) profphd-utils-1.0.10/metr2st_make.f0000644015075101507510000003173112012371464016440 0ustar lkajanlkajan*----------------------------------------------------------------------* * * * FORTRAN code for program CONVERT_SEQ * * conversion of sequence and alignment formats * * * *----------------------------------------------------------------------* * * * Authors: * * * * Reinhard Schneider May, 1994 version 1.0 * * LION http://www.lion-ag/ * * D-69120 Heidelberg schneider@lion-ag.de * * * * & * * * * Burkhard Rost May, 1994 version 1.1 * * Oct, 1998 version 2.0 * * EMBL/LION http://www.embl-heidelberg.de/~rost/ * * D-69012 Heidelberg rost@embl-heidelberg.de * * * *----------------------------------------------------------------------* * * * General note: - uses library lib-maxhom.f * * * * * *----------------------------------------------------------------------* PROGRAM MAKE_METRIC * C quick hack to produce structure/IO dependent metric used in MaxHom * * * * ************************************************************************ IMPLICIT NONE C---- C---- parameters C---- INTEGER NTRANS,NSTRMAX,NIOMAX,KOUT,KIN PARAMETER (NTRANS=26,NSTRMAX=3,NIOMAX=3,KOUT=10,KIN=11) INTEGER NSTR1,NSTR2,NIO1,NIO2,IL,IH,IB,IO,IE,N1,N2,N3,N4 C---- C---- C---- CHARACTER CTRANS*(NTRANS),CSTRUC*3,CIO*3 REAL STRVAL(NSTRMAX,NSTRMAX),IOVAL(NIOMAX,NIOMAX), + VALUE(NTRANS) INTEGER IORANGE(NIOMAX,NIOMAX) REAL STRIOVAL(1:(NSTRMAX*NIOMAX),1:(NSTRMAX*NIOMAX)) CHARACTER*5 CTEMP,CH INTEGER ITEMP(3,3),IT1,IT2,IHVEC(1:9),FAC_STR,FAC_SEQ, + ISTR1,IO1,I,ISTR2,IO2,J,MTRANS,NUMARGUMENTS CHARACTER*80 FILE_METRIC_SEQ,FILE_METRIC_OUT,FILE_METRIC_IN LOGICAL LREAD_METRIC_IN,LHELP REAL SIMSEQ(NTRANS,NTRANS,NSTRMAX,NIOMAX,NSTRMAX,NIOMAX), + SIMSTR C-------------------------------------------------- C---- C---- init defaults C---- FILE_METRIC_OUT='Make_metric_new.output' FILE_METRIC_SEQ= + '/home/rost/pub/topits/mat/Maxhom_McLachlan.metric' FILE_METRIC_IN= 'Make_metric_new.input' ctrans='VLIMFWYGAPSTCHRKQENDBZX!-.' C read metric from file? LREAD_METRIC_IN=.FALSE. LREAD_METRIC_IN=.TRUE. * * C-------------------------------------------------- C---- requesting input files: C---- McLachlan (seq. metric) + file_in; file_out C-------------------------------------------------- * * CALL GET_ARG_NUMBER(NUMARGUMENTS) IF (NUMARGUMENTS.GT.0) THEN CALL GET_ARGUMENT(1,FILE_METRIC_IN) END IF IF (NUMARGUMENTS.GT.1) THEN CALL GET_ARGUMENT(2,FILE_METRIC_OUT) END IF IF (NUMARGUMENTS.GT.2) THEN CALL GET_ARGUMENT(3,FILE_METRIC_SEQ) END IF IF (NUMARGUMENTS.EQ.0) THEN WRITE(6,'(T2,A)')'---' WRITE(6,'(T2,A3,T10,A)')'----', + 'you can provide three input arguments:' WRITE(6,'(T2,A3,T10,A,T35,A)')'----','1: input metric def=', + FILE_METRIC_IN WRITE(6,'(T2,A3,T10,A,T35,A)')'----','2: output metric def=', + FILE_METRIC_OUT WRITE(6,'(T2,A3,T10,A,T35,A)')'----','3: sequence metric def=', + FILE_METRIC_SEQ WRITE(6,'(T2,A)')'---' END IF * * * * C percentage of match_struc / match_seq: C (fac_str) * match_str + (10-fac_str) * match_seq FAC_STR= 10 C number of secondary structure and IO states NSTR1= 3 NSTR2= 3 NIO1= 2 NIO2= 2 C define secondary structure states: E=beta , H=helix, L= NOT E or H C cstruc='ELH' CSTRUC='ELH' IE=INDEX(CSTRUC,'E') IL=INDEX(CSTRUC,'L') IH=INDEX(CSTRUC,'H') C define match/mismatch between secondary structures states C "ie,ie" = 3.0 ==> match between two residue in a beta strand C give a value of 3.0 STRVAL(IE,IE)= 4.0 STRVAL(IE,IL)= -1.0 STRVAL(IE,IH)= -4.0 STRVAL(IL,IE)= -1.0 STRVAL(IL,IL)= 1.0 STRVAL(IL,IH)= -1.0 STRVAL(IH,IE)= -4.0 STRVAL(IH,IL)= -1.0 STRVAL(IH,IH)= 4.0 C define IO states: B=buried , I=inside, O=ouside C cio='BIO' CIO='BO' IB=INDEX(CIO,'B') C ii=index(cio,'I') IO=INDEX(CIO,'O') C define match/mismatch between IO states C "ib,ib" = 3.0 ==> match between a buried residue with a buried residue C give a value of 3.0 IOVAL(IB,IB)= 4.0 C IOVAL(IB,II)= -2.0 IOVAL(IB,IO)= -2.0 C IOVAL(II,IB)= -2.0 C IOVAL(II,II)= 1.0 C IOVAL(II,IO)= -2.0 IOVAL(IO,IB)= -2.0 C IOVAL(IO,II)= -2.0 IOVAL(IO,IO)= 1.0 C define IO classes C "ie,ib" = beta-strand + buried are residues with a %accessibility C surface area below or equal to 5 percent (depndend on residue type) IORANGE(IE,IB)= 15 C IORANGE(IE,II)= 16 IORANGE(IE,IO)= 100 IORANGE(IL,IB)= 15 C IORANGE(IL,II)= 16 IORANGE(IL,IO)= 100 IORANGE(IH,IB)= 15 C IORANGE(IH,II)= 16 IORANGE(IH,IO)= 100 C-------------------------------------------------- C---- generate the sequence match matrix C-------------------------------------------------- C---- C---- CALL GETSIMMETRIC(NTRANS,CTRANS,NSTRMAX,NIOMAX,N1,N2,N3,N4, + CTEMP,CTEMP,ITEMP,99,FILE_METRIC_SEQ,SIMSEQ) C ----------------- C-------------------------------------------------- C define match/mismatch between secondary structures and in/out C Eb Ee Lb Le Hb He C Eb 9 2 1 -6 -1 -8 C Ee 7 -6 1 -8 -3 C Lb 6 -1 1 -6 C Le 4 -6 1 C Hb 9 2 C He 7 C C---- C E->x STRIOVAL(1,1)= 9.0 STRIOVAL(1,2)= 2.0 STRIOVAL(1,3)= 1.0 STRIOVAL(1,4)= -6.0 STRIOVAL(1,5)= -1.0 STRIOVAL(1,6)= -8.0 STRIOVAL(2,2)= 7.0 C---- C L->x strioval(3,3)= 6.0 strioval(3,4)= -1.0 strioval(4,4)= 4.0 C---- C symmetrie intuition STRIOVAL(2,3)=STRIOVAL(1,4) STRIOVAL(2,4)=STRIOVAL(1,3) STRIOVAL(2,5)=STRIOVAL(1,6) STRIOVAL(2,6)=STRIOVAL(1,5) STRIOVAL(3,5)=STRIOVAL(1,3) STRIOVAL(3,6)=STRIOVAL(1,4) STRIOVAL(4,5)=STRIOVAL(1,4) STRIOVAL(4,6)=STRIOVAL(1,3) STRIOVAL(5,5)=STRIOVAL(1,1) STRIOVAL(5,6)=STRIOVAL(1,2) STRIOVAL(6,6)=STRIOVAL(2,2) C---- C symmetrie intuition DO IT1=1,6 DO IT2=1,(IT1-1) STRIOVAL(IT1,IT2)=STRIOVAL(IT2,IT1) END DO END DO C-------------------------------------------------- C read in factors and metric file? C-------------------------------------------------- IF (LREAD_METRIC_IN) THEN OPEN(KIN,FILE=FILE_METRIC_IN,STATUS='UNKNOWN',RECL=150) write(6,*)'--- read metric in' READ(KIN,'(A)',END=22114)CH write(6,'(A,T20,A)')'---',CH DO IT1=1,NSTR1*NIO1 READ(KIN,'(A3,6I8)',END=22114)CH, + (IHVEC(IT2),IT2=1,(NSTR2*NIO2)) write(6,'(A5,I2,A1,A3,T20,6I6)')'read ',IT1,':',CH, + (IHVEC(IT2),IT2=1,(NSTR2*NIO2)) DO IT2=1,(NSTR2*NIO2) STRIOVAL(IT1,IT2)=IHVEC(IT2)/10. END DO END DO LHELP=.TRUE. DO WHILE (LHELP) CH=' ' READ(KIN,'(A)',END=22114)CH IF ( (CH(1:1).EQ.' ').OR.(CH(1:1).EQ.'*') ) THEN CONTINUE ELSEIF ( (CH(1:1).NE.'F').AND. + (CH(1:1).NE.'f') ) THEN write(6,*)'x.x ch=',ch(1:5) LHELP=.FALSE. ELSE BACKSPACE(KIN) READ(KIN,'(A8,I2)',END=22114)CH,FAC_STR END IF END DO 22114 CONTINUE CLOSE(KIN) END IF c recompute factors FAC_SEQ=10-FAC_STR C---------------------------------------------------------------------- C now summarise the setting C---------------------------------------------------------------------- c write out onto printer WRITE(6,*)'--------------------------------------------------' WRITE(6,'(T5,A)')' ' IF (LREAD_METRIC_IN) THEN WRITE(6,'(T5,A,T40,A)')'file containing metric asf.:', + FILE_METRIC_IN WRITE(6,'(T5,A,T40,A)')'output will be in:',FILE_METRIC_OUT WRITE(6,'(T5,A)')'match = n*match_struc + (10-n)*match_seq ' WRITE(6,'(T5,A,T20,I5)')'where n = ',FAC_STR WRITE(6,'(T5,A)')' ' END IF WRITE(6,*)' ' WRITE(6,*)'matrix generated:' WRITE(6,'(T10,6(A2,A1,A1,A2))') + ' ',cstruc(1:1),cio(1:1),' ',' ',cstruc(1:1),cio(2:2),' ', + ' ',cstruc(2:2),cio(1:1),' ',' ',cstruc(2:2),cio(2:2),' ', + ' ',cstruc(3:3),cio(1:1),' ',' ',cstruc(3:3),cio(2:2),' ' DO ISTR1=1,NSTR1 DO IO1=1,NIO1 IT1=(ISTR1-1)*NIO1+IO1 WRITE(6,'(A1,A1,T10,6F6.2)') + CSTRUC(ISTR1:ISTR1),CIO(IO1:IO1), + (STRIOVAL(IT1,IT2),IT2=1,(NSTR2*NIO2)) END DO END DO WRITE(6,*)' ' WRITE(6,*)'--------------------------------------------------' C---------------------------------------------------------------------- C now write into file C---------------------------------------------------------------------- OPEN(KOUT,FILE=FILE_METRIC_OUT,STATUS='UNKNOWN',RECL=250) C write header info WRITE(KOUT,'(A)')'#========================================='// + '=========================================='// + '=========================================='// + '===============' WRITE(KOUT,'(A,I3)')'STRUCTURE-STATES_1:',NSTR1 WRITE(KOUT,'(A,I3)')'STRUCTURE-STATES_2:',NSTR2 WRITE(KOUT,'(A,I3)')'I/O-STATES_1:',NIO1 WRITE(KOUT,'(A,I3)')'I/O-STATES_2:',NIO2 WRITE(KOUT,'(A)')'DSSP-STRUCTURE I/O %ACC-RANGE '// + '(<= less or equal)' C write ranges for accessibility classes DO ISTR1=1,NSTR1 DO IO1=1,NIO1 WRITE(KOUT,'(4X,A,13X,A,7X,I3)')CSTRUC(ISTR1:ISTR1), + CIO(IO1:IO1),IORANGE(ISTR1,IO1) ENDDO ENDDO C write seperator lines WRITE(KOUT,'(A)')'#========================================='// + '=========================================='// + '=========================================='// + '===============' WRITE(KOUT,'(A)')'AA STR I/O V L I M F '// + 'W Y G A P S T '// + 'C H R K Q E N '// + 'D B Z' C write metric MTRANS=22 DO I=1,MTRANS DO ISTR1=1,NSTR1 DO IO1=1,NIO1 DO ISTR2=1,NSTR2 DO IO2=1,NIO2 C--- old version: combine HEL be C SIMSTR= C + strval(istr1,istr2) + ioval(io1,io2) C--- new version: 6x6 matrix SIMSTR=STRIOVAL( ((ISTR1-1)*NIO1+IO1), + ((ISTR2-1)*NIO2+IO2) ) DO J=1,22 C F*struc + (10-F)*seq VALUE(J) = (FAC_STR/10. * SIMSTR) + + (FAC_SEQ/10. * SIMSEQ(I,J,1,1,1,1)) IF (VALUE(J) .GE. 10.0) VALUE(J)=9.99 IF (VALUE(J) .LE. -10.0) VALUE(J)=-9.99 ENDDO WRITE(KOUT,'(1X,A,2X,A,A,1X,A,A,1X,22(F5.2,1X))') + CTRANS(I:I),CSTRUC(ISTR1:ISTR1), + CSTRUC(ISTR2:ISTR2),CIO(IO1:IO1), + CIO(IO2:IO2), + (VALUE(J),J=1,22) ENDDO ENDDO ENDDO ENDDO ENDDO END profphd-utils-1.0.10/profphd-utils.spec0000644015075101507510000000172712012371464017352 0ustar lkajanlkajanSummary: profphd helper utilities Name: profphd-utils Version: 1.0.8 Release: 1 License: GPL Group: Applications/Science Source: ftp://rostlab.org/%{name}/%{name}-%{version}.tar.gz URL: http://rostlab.org/ BuildRoot: %{_tmppath}/%{name}-%{version}-root BuildRequires: gcc-gfortran %description The package provides the following binary utilities: convert_seq, filter_hssp %prep %setup -q %build make DESTDIR=%{buildroot} prefix=/usr AM_FFLAGS="-O2 -fbounds-check -Wuninitialized -Wall -Wno-unused" %install rm -rf $RPM_BUILD_ROOT make DESTDIR=%{buildroot} prefix=/usr install %clean rm -rf $RPM_BUILD_ROOT %files %defattr(-,root,root) %doc AUTHORS %doc COPYING %doc ReadMe %doc ReadMe-linux %{_mandir}/*/* %{_bindir}/* %changelog * Fri Jan 13 2012 Laszlo Kajan - 1.0.8-1 - new upstream * Tue Jun 21 2011 Laszlo Kajan - 1.0.6-2 - spec now in dist root * Sat Jun 18 2011 Laszlo Kajan - 1.0.6-1 - First rpm package