From:  IN%"POSTMASTER@EMBL.BITNET" "General PostMaster" 7-FEB-1990 17:31:04.36
To:    HARPER@cc.Helsinki.FI
CC:
Subj:  Automatic response to : GET SOFTWARE:BIOBIT.2

Received: from jnet-daemon by cc.Helsinki.FI; Wed, 7 Feb 90 17:30 EET DST
Received: From EMBL(NETSERV) by FINUHB with Jnet id 6487 for HARPER@FINUH; Wed, 7 Feb 90 17:30 O
Date: Wed, 07 Feb 90 16:04:11
From: EMBL Network File Server
Subject: Automatic response to : GET SOFTWARE:BIOBIT.2
To: HARPER@cc.Helsinki.FI
Reply-to: General PostMaster
Organisation: European Molecular Biology Laboratory
Postal-address: Meyerhofstrasse 1, 6900 Heidelberg, W. Germany

BIOBIT No 2

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-= INDEPENDENT NEWSLETTER PRODUCED BY HELSINKI UNIVERSITY =-
<< EDITED BY ROBERT HARPER >>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

If you are reading this edition of BIOBIT, did you ever stop and wonder how it came to you... magic, isn't it? Well, back in the dim past of biological networking (say half a year ago) there were three separate sources of information: BIONET, BIOTECH and SEQNET. These have now been amalgamated into a "super" net called BIOSCI which takes care of the distribution of all the postings that you receive in your mail box. But BIOSCI is only the PUBLIC face of what is happening on the networks... the tip of the iceberg. There is a lot of activity that remains unnoticed because it is hidden from the Internet/EARN/BITNET/NETNORTH user. There is more going on in the world than what you happen to read in your mailbox.
This version of BIOBIT has a look under the surface, and reports on what happens behind the scenes at BIONET. There is an article by Dave Kristofferson from BIONET and then an interview which looks at some aspects of how BIONET functions. The main question to arise from the interview is: why is there no European equivalent of BIONET? There is a proverb which says that a fool can ask more questions than seven wise men can answer... so here are my questions for the European community to answer.

1) Is there a need for a central molecular biology computing resource in Europe?
2) How would it be financed?
3) Where would it be situated?
4) Who would be responsible for its running and upkeep?
5) Who would be able to use it, and at what cost?

If you have answers to ANY of these questions then you can send them by e-MAIL to HARPER@FINFUN or BIOBIT@COM.KPO.FI if you want to keep them private, and if you want to discuss the issue upfront then reply to this forum at your nearest BIOSCI node. LIFESCI aficionados can now send e-MAIL to RPRLSCI@TECHNION with SUBMIT BIOBIT as the first line of the text.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
<< The Dave Kristofferson Article >>

The BIONET National Computer Resource for Molecular Biology

David Kristofferson, Ph.D.
BIONET Resource Manager
(kristoff@net.bio.net)

BIONET is the national computer network for molecular biology. It is run as a non-profit service for the scientific community by IntelliGenetics, Inc. with funding provided by the Division of Research Resources at the National Institutes of Health (Grant Number RR01685-06) and also by the individual resource users. BIONET is used by over 800 laboratories in the U.S., Canada, Europe, and Japan, comprising a total of over 2800 researchers. These investigators have access to some of the most powerful software available for nucleic acid and protein sequence analysis.
BIONET also provides scientists with recent releases of the major molecular biology databanks including those for nucleic acid and protein sequences. Finally, the BIONET community has access to an extensive electronic mail and bulletin board system which puts them in contact with other users of the Resource as well as scientists on other electronic networks around the world.

The cost of the service to academic or non-profit researchers in the United States is $400 per year per laboratory. This sum covers ALL costs including telecommunications with the BIONET DEC 2065 computer via the Telenet or Compuserve public data networks and unlimited processing time on the computer. (Users outside the U.S. are not assessed the $400 fee but must pay their own telecommunication costs.) Because most U.S. colleges and universities have access to a local Telenet or Compuserve number, no additional phone charges are incurred when contacting the BIONET computer. BIONET also distributes free copies of the public-domain Kermit communications software. Thus, beyond paying the subscription fee, all a scientist needs is a personal computer, workstation, or terminal, and a modem to access an impressive amount of software, information, and communications facilities. NIH regulations require that each principal investigator have his/her own account, but subaccounts can be set up at no extra charge for subinvestigators in the same lab.

Recently, BIONET has received a generous donation of equipment from Sun Microsystems. This will enable the resource to retire the aging DEC 2065 and provide increased compute power to its user community. The new BIONET central resource will consist of a Sun 3/280 file server together with a network of diskless Sun 3/60's. This configuration will allow for future expansion as BIONET usage grows. In addition, scientists with Sun or VAX computers can establish "BIONET satellite" computers.
These scientists will do most of their computing locally but will still have network access to BIONET for the purpose of exchanging electronic mail and bulletin board notices.

The majority of the software on BIONET consists of the commercial IntelliGenetics sequence analysis programs. The programs are used for analyzing nucleic acid and protein sequences. There are programs for managing large-scale DNA sequencing projects and several programs that search the major sequence databases to find sequence similarities and draw the resulting alignments. Two new programs have recently been added to BIONET: GENALIGN for aligning several protein or nucleic acid sequences simultaneously and DDMATRIX for graphical comparison of sequence similarities. The software can also be used to analyze restriction enzyme data. The MAP program determines the order of fragments in restriction maps. The CLONER program uses map data graphically to plan cloning experiments. Assistance in using the software is available by calling BIONET's technical support number (415)324-GENE.

Besides the IntelliGenetics programs on BIONET, there is additional software donated by the academic community. Among the programs are Roufa's sequence analysis package for the PC, Zuker's RNA-folding program, Pearson's FASTP, FASTN, and FASTA database searching programs, and Kanehisa's "Ideas" package. A library of free PC software is available to users. Besides molecular biology software, the lending library includes communications software, utilities, and even a free editor program (MicroEMACS).

BIONET has prepared a version of Pearson's FASTA program to run as a mail server on Sun fileservers. This allows users to conduct sequence database searches by submitting a query sequence in an electronic mail message.

BIONET provides access to the latest releases of the GenBank and EMBL nucleic acid sequence databases in addition to the PIR and SWISS-PROT protein sequence databanks. Dr.
Richard Roberts of the Cold Spring Harbor Laboratory provides BIONET with his comprehensive restriction enzyme database. IntelliGenetics maintains databanks of cloning vectors and consensus sequence patterns for use with its programs. Finally, several smaller databases are available on the system.

One of BIONET's main purposes is to foster communication among scientists. Researchers can exchange information, manuscripts, data, etc., with anyone else on an electronic network by using the electronic mail facility. Electronic mail can be sent to non-BIONET users via the ARPANET and BITNET computer networks. Besides individual exchanges through the mail, investigators can contact a large number of scientists by sending messages to one of the numerous electronic bulletin boards. Bulletin boards are available for requesting experimental reagents, for discussing scientific topics such as oncogenes, gene expression, plant molecular biology, and molecular evolution, and also for obtaining information about computer hardware and software.

Recently BIONET joined together with other computer centers to form an international bulletin board network called BIOSCI. The network of bulletin boards has been expanded to include contributions submitted via the University of Uppsala in Sweden, the SERC Daresbury Laboratory in the U.K., and the University College in Dublin, Ireland. Bulletins can be posted to either BIONET or any of the other three sites. Each site automatically redistributes its postings to the others as well as to many readers in their respective regions. This makes it even easier for scientists on the EARN and JANET networks to post and receive bulletin board messages. The bulletins are also available from BIONET on the USENET network for computer systems running the UNIX operating system.

The usefulness of the BIONET resource has been reflected in its steady growth. Between 20 and 40 new labs join per month.
Sixteen Telenet ports, six Compuserve ports and six direct dial ports are available to access the system. Unfortunately, the demand for time on the system has reached the point where it is causing a slowdown in response time during peak hours. Sun's equipment donation is helping to alleviate this problem. The new Sun configuration will allow the Resource to add new computing hardware as demand increases. This will allow the resource to continue to expand and enhance its role as a facilitator of important discoveries in molecular biology.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Well, that was the article, and since I am a person who only gets mail dropping into my mailbox, and who has had no hands-on experience of running anything on the BIONET, I decided to ask a few more questions. I extracted some passages from the above article and asked Dave Kristofferson to comment on them. Here is the first "electronic hot pistol" interview.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

>> BIONET also provides scientists with recent releases of the
>> major molecular biology databanks including those for nucleic
>> acid and protein sequences.

> 1) What are the databases and how are they used?
>    Who is the contact person?

The GenBank and EMBL nucleic acid sequence databases are available on BIONET as well as the SWISS-PROT and PIR (NBRF) protein sequence databases. A database of restriction enzymes is supplied by Richard Roberts of Cold Spring Harbor. IntelliGenetics (IG) supplies its Vectorbank (tm) database of cloning vector sequences and restriction maps as well as KeyBank (tm), a database of consensus and other sequence information for use with the IG QUEST database pattern searching program. In addition to the above, the "Genetic Variations of Drosophila" database from John Lindsley, the LiMB database of molecular biology databases (Los Alamos), an SV40 mutant database (Jim Pipas), and a list of laboratory protocols (Dr. Bajwa, a BIONET user) are on-line.
Access to the databases is limited to direct logins to the system, and requests for accounts can be directed to the electronic address bionet@bionet-20.bio.net.

> > Finally, the BIONET community has access to an
> > extensive electronic mail and bulletin board system which puts
> > them in contact with other users of the Resource as well as
> > scientists on other electronic networks around the world.
>
> 2) What does the BBoard look like from the inside... example
> perhaps. (BITNET users only get the messages dropping into their
> mailboxes every now and then)

BIONET users can read bboards by calling the appropriate bboard file while in the MM mail program. Alternatively, they can "subscribe" to a bulletin board, i.e., arrange to see any new messages at login. In this latter format, message headers are presented as follows at login:

Reading file: METHODS-AND-REAGENTS

Tue, 11 Oct 88 09:23 +1200  CHCHMEDS%otago.ac.nz@RELAY.CS.NET
Lambda Cloning vector (419 chars; more?)

If the user decides to read the message, he/she pushes Y or the spacebar and the text of the message scrolls across the screen a page at a time. Pressing N skips to the next message without displaying the text.

> > The majority of the software on BIONET consists of the commercial
> > IntelliGenetics sequence analysis programs.
>
> 3) Short summary of the programmes that can be used would be useful
> and who to contact for support if you can not get the thing
> to work.

A SUMMARY OF THE INTELLIGENETICS PROGRAMS
(last update 8-23-88 - D.K.)

PROGRAM NAME   FUNCTION
------------   --------
GENED          Entry (from keyboard, files, or digitizer) and editing
               of nucleic acid and protein sequences.

FINDSEQ        Searches for and retrieves sequences based on accession
               numbers, author names, species names, organism names,
               locus names, and keywords.

MAP            Generates and displays all possible restriction maps
               consistent with restriction enzyme cleavage data.

SIZER          Calculates restriction fragment lengths using data from
               digest gels.

GEL            Merges overlapping regions from sequencing gels into a
               consensus sequence. Maintains a history of a sequencing
               project and produces sequence data files for use in
               other programs.

SEQ            Analyzes nucleic acid sequences. Finds restriction
               sites, internal repeats, palindromes, open reading
               frames, translates DNA, analyzes base composition and
               codon usage, finds homology between and aligns two
               similar sequences.

PEP            Analyzes polypeptide sequences. Reverse translates
               proteins and searches for possible restriction sites,
               does Chou & Fasman secondary structure prediction,
               determines hydropathicity, amino acid composition, and
               molecular weight, searches for homology between and
               aligns two similar sequences.

CLONER         Models complex recombinant DNA experiments and provides
               detailed recordkeeping of plasmid constructions.
               Predicts the results of restriction enzyme digestions.
               Tests the results of insertion direction and
               insertional inactivation. Adds biological sites and
               regions to recombinant DNA maps. Creates and edits
               restriction maps of vectors. If used with a Tektronix
               4014 terminal emulator (see DDMATRIX below) CLONER will
               produce graphical output of plasmid restriction maps,
               etc.

IFIND          Rapidly searches databases for sequences which are
               homologous to a given sequence. Saves the results of a
               search, displays histograms of similarity scores, and
               aligns two chosen sequences.

QUEST          Rapidly searches databases for specified patterns of
               characters, keywords, sequences, nucleotide patterns,
               etc. Handles Boolean (AND OR) search patterns. Has a
               "language" for handling ambiguous patterns, i.e.,
               allows multiple possibilities at specified positions in
               the search pattern.

GENALIGN       A powerful multiple sequence alignment program which
               can align up to 49 protein or nucleic acid sequences
               and generate a consensus sequence.

DDMATRIX       A dot-matrix homology program for aligning two sequences.
               Can work on an alphanumeric terminal, but a Tektronix
               4014 emulator such as found in the communications
               software packages Smarterm 240 or PC-Plot IV (IBM PC)
               or Versaterm (Mac) will allow better graphical output.

Technical support is available Monday through Friday either by phone or e-mail to bionet@bionet-20.bio.net.

> > BIONET has prepared a version of Pearson's FASTA program to run as
> > a mail server on Sun fileservers. This allows users to conduct
> > sequence database searches by submitting a query sequence in an
> > electronic mail message.
>
> 4) Short sample session of some BIONET programme

(Most sessions are longer than one would want to read in a mail message. The following just starts up a program and shows the HELP facility.)

@seq

 > O <
O|   |O   IntelliGenetics
 > O <

SEQ - Sequence Analysis System

For information enter one of these commands after "SEQ:":

  Introduction - Gives an overview of this program
  ?            - Lists commands
  Help         - Lists commands with short explanations
  Help topics  - Shows a list of informative topics
  Help help    - Explains how to use on-line assistance

SEQ: help

Here is a menu of all valid "SEQ: " commands:

  ALIgn               - The rigorous SEQ similarity search
  Base-comp           - Counts the numbers of each base in a sequence
  CODon-tables        - Counts codon frequencies in each reading frame
  COMment             - Displays comments from a specified sequence file
  DEFault-parameters  - Sets all SEQ parameters to default values
  DELete              - Delete sequences from SEQ
  DIRectory           - List files in your or any other directory
  DINucleotides       - Counts dinucleotides in both frames combined
  DOuble-strand-print - Displays DNA sequences and their complementary strands
  GEned               - Enters the GENED program to edit or enter a sequence
  GRipe               - Send suggestions or complaints to IntelliGenetics staff
  HElp                - Print information about a specified command or topic
  Introduction        - An overview of SEQ and how to use it
  LExicon             - Finds unique oligonucleotides
  LIst                - Lists the sequence names with short descriptions
  LOad                - Enters sequences from files
  Match               - Sets bases equivalent to each other in SEARCH and ALIGN
  Quit                - Leaves the SEQ program
  RECord              - Copies the output of SEQ commands to a file
  REStore-parameters  - Replaces current parameter settings with those from a file
  RETrieve            - Loads a single sequence from a specified file
  RIch                - Find regions containing high levels of sets of two bases
  SAve-parameters     - Writes current parameter settings in a file
  SEarch              - Finds regions of DNA that are similar or complementary
  SHow-parameters     - Lists all the current SEQ parameter settings
  SINgle-strand-print - Displays the specified sequences
  SITe                - Searches for restriction sites, shows maps and fragment lists
  SPlice              - Combines or fragments specified sequences
  TRAnslate           - Translates DNA sequences into peptide sequences
  TRInucleotides      - Counts trinucleotides in all frames combined
  Write               - Writes sequences or portions of sequences in a file

  Help TOPIcs         - Typing "Help TOPIcs" will list some help topics
                        which are not commands.

SEQ: quit
No changes made since last save, file(s) not written.
It's 10 October 1988 17:55:18
Do svedanya!
@

> > One of BIONET's main purposes is to foster communication among the
> > scientists. Researchers can exchange information, manuscripts,
> > data, etc., with anyone else on an electronic network by using the
> > electronic mail facility.
>
> 5) Again I suspect that the MM system is used for PRIVATE
> messaging which never hits the PUBLIC sector (BIOSCI for
> example). It would be nice to know that network activity in
> PRIVATE is as active as PUBLIC activity... and are there
> any European users?

MM is used a lot for private messages, far more than for public postings. I can't give you accurate statistics other than to say that the MM program is used about 4000 times per month on the system.

I have mixed emotions about encouraging European users to get on to our system.
Our computer facilities were getting saturated just trying to handle our load in the Western Hemisphere. Our usual experience with European accounts is that people request accounts and then don't use them because of the high telecommunications costs. The response on the system varies dramatically from country to country in Europe. Some people have told me that it is just like using a computer in the next room while others found it intolerable. It would make far more sense to set up counterparts to BIONET in Europe, Japan, etc., and I would be happy to let people examine our system for that purpose. Economically it doesn't make sense for people on your side of the pond to connect all the way over here to compute.

>> I have mixed emotions about encouraging European users to get on to our system.

> 6) So what would it take for a European system to be set up?

The EMBL is supposed to be setting up some kind of BIONET-like facility and the SERC lab at Daresbury is doing likewise, although I really don't know what stage either of these facilities is in.

As for the costs to U.S. users, the $400 is charged only to U.S. users and goes to cover their telecommunications costs. They have unlimited access to the system for this one annual fee (no additional phone bills or anything else), and we found that an average figure of $400 per year per lab covered the entire U.S. telecommunications bill, since we were able to negotiate volume rates with the Telenet and Compuserve public data networks. In this respect BIONET is the best offer available in the U.S., which is why we have always been overburdened and overworked here. I cannot say what the European situation would be. Users outside of the U.S. don't pay us anything for their accounts, but have to cover their own telecommunications costs.

The operating expenses of BIONET are paid for by the NIH (BIONET is non-profit) and our grant was for about $3.7 million US over the last five years.
Anyone who does this in Europe will have to come up with some source of funding. I can forewarn you, though, that anyone opening their machine to the outside has to do it with care or else their current users will have all their CPU cycles robbed from them. In addition, you will be inundated with requests to sign up for the system and constant questions and demands from users when you are up and going.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
END OF BIOBIT No 2