Sequence formats
Commonly used formats for sequences, by N. Joly.

This document illustrates some common formats used to represent sequences.

EMBL
 ID MMVASPHOS standard; RNA; EST; 140 BP.
 AC X97897;
 DE M.musculus mRNA for protein homologous to
 DE vasodilator-stimulated phosphoprotein
 SQ Sequence 140 BP; 25 A; 58 C; 39 G; 17 T; 1 other;
    ttctcccaga agctgactct atggngaccc cgagagagac tgagcagaac 60
    ccccgcaccc ctgcacttcc aatcaggggc gccccgggag cactccccgt 120
    ccgccctccg cgcagccatg                                  140
 //
                  
FASTA
 >MMVASPHOS
 ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
 ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
 ccgccctccgcgcagccatg
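
Reading this format programmatically is straightforward. The following is a minimal Python sketch, not the portal's own parser: the function name `parse_fasta` is hypothetical, and the code assumes that each record starts with a `>` line and that every following line up to the next `>` belongs to the same sequence.

```python
def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text."""
    records = []
    name = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name = line[1:].strip()
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


example = """>MMVASPHOS
ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
ccgccctccgcgcagccatg
"""
records = parse_fasta(example)
for name, sequence in records:
    print(name, len(sequence))
```

Because a new `>` line starts a new record, the same function also handles files with several sequences.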
                  
GCG
!!NA_SEQUENCE 1.0
 (No documentation)
 dna1.txt  Length: 88  Nov 22, 2001 14:38  Type: N  Check: 3818  ..

        1  TAGTCGTAGT CGGAGCGATG CTGACGATGA CGATGACGAT CGTAGCTGAT

       51  CGATCGAGCT GATGCTGATC GAGCTAGCTG ATCGATCG

                  
GDE
 #sample1
 TTCAAGAGAAACAGCGGCCAAGGAAAAGACTCGGCATGATTGTCCATAGCTTACAAAGCG
 #sample2
 TTCAAGAGAAACAGCGGCTGGGGGAAAGACTCGTCCTGATTGCCTGTAGATGGTAAAGCG

                  
GENBANK
LOCUS       HUMHBV1       130 bp    DNA         PRI     17-JUN-1993
DEFINITION  Human DNA/endogenous Hepatitis B virus (HBV) DNA, left
            host viral junction.
ACCESSION   M15770
BASE COUNT       32 a     43 c     29 g     26 t
ORIGIN
      1 agcgggcagt gcagctgctt ggacagcagg ggtgtttctt caacccaggc
     61 ctcctgtcac aacaggccca ttcaattctg aacctgcaag ccaactccaa
    121 cctcttttcc cagggggaac caaaaaccct
//

                  
IG
; comment
U03518
AACCTGCGGAAGGATCATTACCGAGTGCGGGTCCTTTGGGCCCAACCTCCCATCCGTGTC
TATTGTACCCTGTTGCTTCGGCGGGCCCGCCGCTTGTCGGCCGCCGGGGGGGCGCCTCTG
TGAGTTGATTGAATGCAATCAGTTAAAACTTTCAACAATGGATCTCTTGGTTCCGGC1
                  
NBRF
>P1;CCHU
cytochrome c [validated] - human
MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIW
GEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKATNE*
                  
NBRF (pir)
>P1;CCHU
cytochrome c [validated] - human
MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIW
GEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKATNE*
                  
CODATA
ENTRY           CCHU  #type complete
TITLE           cytochrome c [validated] - human
ACCESSIONS      A31764; A05676; I55192; A00001
SUMMARY         #length 105  #molecular-weight 11749  #checksum 3247
SEQUENCE
                 5        10        15        20        25        30
       1 M G D V E K G K K I F I M K C S Q C H T V E K G G K H K T G
      31 P N L H G L F G R K T G Q A P G Y S Y T A A N K N K G I I W
      61 G E D T L M E Y L E N P K K Y I P G T K M I F V G I K K K E
      91 E R A D L I A Y L K K A T N E
///
                  
RAW
ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
ccgccctccgcgcagccatg
                  

Warning: This format cannot handle more than one sequence per file.

SWISSPROT
ID   100K_RAT STANDARD; PRT; 149 AA.
AC   Q62671;
DE   100 kDa protein (EC 6.3.2.-).
SQ   SEQUENCE 149 AA; 17004 MW; D06484B8BC29112E CRC64;
     MMSARGDFLN YALSLMRSHN DEHSDVLPVL DVCSLKHVAY VFQALIYWIK
     PQLERKRTRE LLELGIDNED SEHENDDDTS QSATLNDKDD ESLPAETGQN
     SITIRPPDDQ HLPTANTCIS RLYVPLYSSK QILKQKLLLA IKTKNFGFV
//
                  
Phylip/Phyml format incompatibilities
Problem with the Phylip and Phyml alignment formats, by B. Néron.

The Phylip format is defined as follows:

"The name should be ten characters in length, filled out to the full ten characters by blanks if shorter. Any printable ASCII/ISO character is allowed in the name, except for parentheses ("(" and ")"), square brackets ("[" and "]"), colon (":"), semicolon (";") and comma (","). If you forget to extend the names to ten characters in length by blanks, the program will get out of synchronization with the contents of the data file, and an error message will result." (for complete description of phylip format see: http://bioweb2.pasteur.fr/docs/phylip/doc/main.html)

Phyml states that it uses the Phylip format, but it actually uses a variant:

"One slight difference with PHYLIP format deals with sequence name lengths. While PHYLIP format limits this length to ten characters, PhyML can read up to hundred character long sequence names. Blanks and the symbols “(),:” are not allowed within sequence names"

These two formats are in some cases incompatible. The program we use to detect sequence formats and convert them cannot write the Phyml format, so it converts sequences to Phylip. When sequence names are longer than 10 characters, the resulting file cannot be used with Phyml (the sequence names are truncated to 10 characters and there is no space between name and sequence). There is a workaround for this problem:

  1. from an alignment in fasta format, use fastaRename to produce 2 files:
    • one with your sequences, where each name is replaced by a short id
    • one with the mapping between the real names and the ids
  2. use the file with short ids in your analyses, such as Phyml, MorePhyml, ...
  3. finally, to get a tree with the original names, run nw_rename on the tree generated at the previous step and the mapping file from the first step. It will produce a tree with the right names.
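
The idea behind this workaround can be sketched in a few lines of Python. This only illustrates the principle and is not the actual fastaRename or nw_rename code: the function names, the `S000001`-style ids and the in-memory mapping are all assumptions made for the example, and original names that would need quoting in a Newick tree are not handled.

```python
import re


def shorten_names(records, prefix="S"):
    """Replace each sequence name by a short id; return (renamed, mapping)."""
    renamed = []
    mapping = {}
    for index, (name, sequence) in enumerate(records, start=1):
        short_id = f"{prefix}{index:06d}"  # 7 characters, fits the Phylip limit
        mapping[short_id] = name
        renamed.append((short_id, sequence))
    return renamed, mapping


def restore_names(newick, mapping):
    """Put the original names back into a Newick tree string."""
    if not mapping:
        return newick
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda match: mapping[match.group(1)], newick)


records = [
    ("Escherichia_coli_MalK", "MASVQLQNVT"),
    ("Salmonella_typhimurium_MalK", "MASVQLRNVT"),
]
renamed, mapping = shorten_names(records)
tree_with_ids = "(S000001:0.1,S000002:0.2);"
print(restore_names(tree_with_ids, mapping))
```

Keeping the mapping separate from the sequences is what makes the last step possible: the analysis only ever sees the short ids, and the real names are restored once the tree is built.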
Alignment formats
Commonly used formats for aligned sequences, by N. Joly.

This document illustrates some common formats used to represent aligned sequences.

CLUSTAL
CLUSTAL W (1.82) multiple sequence alignment
MALK_ECOLI      MASVQLQNVTKAWGEVVVSKDINLDIHEGEFVVFVGPSGCGKSTL
MALK_SALTY      MASVQLRNVTKAWGDVVVSKDINLDIHDGEFVVFVGPSGCGKSTL
MALK_ENTAE      MASVQLRNVTKAWGDVVVSKDINLEIQDGEFVVFVGPSGCGKSTL
MALK_PHOLU      MSSVTLRNVSKAYGETIISKNINLEIQEGEF--------------
                *:** *:**:**:*:.::**:***:*::***
MALK_ECOLI      LRMIAGLETITSGDLACRRLHKEPGV
MALK_SALTY      LRMIAGLETITSGDLACRRLHQEPGV
MALK_ENTAE      LRMIAGLETVTSGDL-----------
MALK_PHOLU      LRM-----------------------
                ***

Warning: Names must not contain spaces or exceed 30 characters.

FASTA
>MALK_ECOLI
MASVQLQNVTKAWGEVVVSKDINLDIHEGEFVVFVGPSGCGKSTLLRMIA
GLETITSGDLACRRLHKEPGV
>MALK_SALTY
MASVQLRNVTKAWGDVVVSKDINLDIHDGEFVVFVGPSGCGKSTLLRMIA
GLETITSGDLACRRLHQEPGV
>MALK_ENTAE
MASVQLRNVTKAWGDVVVSKDINLEIQDGEFVVFVGPSGCGKSTLLRMIA
GLETVTSGDL-----------
>MALK_PHOLU
MSSVTLRNVSKAYGETIISKNINLEIQEGEF--------------LRM--
---------------------
MEGA
#mega
!Title Multiple Sequence Alignment;
#MALK_ECOLI
MASVQLQNVTKAWGEVVVSKDINLDIHEGEFVVFVGPSGCGKSTLLRMIA
GLETITSGDLACRRLHKEPGV
#MALK_SALTY
MASVQLRNVTKAWGDVVVSKDINLDIHDGEFVVFVGPSGCGKSTLLRMIA
GLETITSGDLACRRLHQEPGV
#MALK_ENTAE
MASVQLRNVTKAWGDVVVSKDINLEIQDGEFVVFVGPSGCGKSTLLRMIA
GLETVTSGDL-----------
#MALK_PHOLU
MSSVTLRNVSKAYGETIISKNINLEIQEGEF-------------------
---------------------
MSF
!!AA_MULTIPLE_ALIGNMENT 1.0
PileUp of: @pep.list
 
 msf.seq MSF: 55 Type: P Nov 22, 2001 11:02 Check: 2529 ..
 
 Name: m73237 Len: 655 Check: 7493 Weight: 1.00
 Name: l28824 Len: 655 Check: 5456 Weight: 1.00
 Name: u04379 Len: 655 Check: 9580 Weight: 1.00
//
        1                                                   50
m73237  ~~~~~MADSA NHLPFFFGQI TREEAEDYLV QGGMSDGLYL LRQSRNYLGG
l28824  MASSGMADSA NHLPFFFGNI TREEAEDYLV QGGMSDGLYL LRQSRNYLGG
u04379  ~~~~~MPDPA AHLPFFYGSI SRAEAEEHLK LAGMADGLFL LRQCLRSLGG
 
        51
m73237  ~~~~~
l28824  ~~~~~
u04379  AACG*        

Warning: This format cannot handle more than 500 sequences in a single alignment.

NEXUS
#NEXUS
begin data;
  dimensions ntax=2 nchar=89;
  format datatype=Protein interleave gap=- missing='.';
  matrix
[           1                                               50]
btdDm       -------AQQQQHHLHMQQAQHH-----------LHLSH------QQAQQ
TSp1        --------------------AEH-----------PSLR--------GTPL
[           51                          80]
btdDm       YACPICSKKFSRSDHLSKHKKTHF------
TSp1        FACPICNKRFMRSDHLAKHVKTHN------
    ;
endblock;

Warning: Text enclosed in square brackets is treated as a comment, and thus ignored.
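
To see what a program actually reads from such a file, the bracketed comments can be removed first. The sketch below is a simplified illustration, with a hypothetical function name: it tracks nesting depth so that nested comments are dropped as a whole, but it assumes that no square bracket appears inside a quoted token.

```python
def strip_nexus_comments(text):
    """Remove [bracketed] comments, including nested ones, from NEXUS text."""
    kept = []
    depth = 0
    for char in text:
        if char == "[":
            depth += 1  # entering a comment (possibly nested)
        elif char == "]" and depth > 0:
            depth -= 1  # leaving the innermost open comment
        elif depth == 0:
            kept.append(char)  # ordinary character outside any comment
    return "".join(kept)


block = "[           1        50]\nbtdDm       -------AQQQQHHLHMQQAQHH\n"
print(strip_nexus_comments(block))
```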

PHYLIP

Sequential (PHYLIPS):

     2   109
ATISA1    GSPNLYQ-GGRKPWHSINFICAHDGFTLADLVTYNNKNNLANGEENNDG
          ENHNYSWNCGEEGDFASISVKRLRKRQMRNFFVSLMVSQGVPMIYMGDE
          YGHTKGGN---
POTISA1   GSPNLYQKGGRKPWNSINFVCAHDGFTLADLVTYNNKHNLANGEDNKDG
          ENHNNSWNCGEEGEFASIFVKKLRKRQMRNFFLCLMVSQGVPMIYMGDE
          YGHTKGGN---
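
The fixed-width name field is what a reader of this layout relies on. As a minimal sketch (the function name and the `name_width` parameter are assumptions, and only the strict sequential layout shown above is covered), the header gives the number of sequences and the alignment length, the first 10 characters of a record are its name, and sequence data is read until the announced length is reached:

```python
def parse_phylip_sequential(text, name_width=10):
    """Parse strict sequential PHYLIP text into (ntax, nchar, records)."""
    lines = [line for line in text.splitlines() if line.strip()]
    ntax, nchar = (int(field) for field in lines[0].split())
    records = []
    pos = 1
    for _ in range(ntax):
        # The name occupies a fixed-width field; the sequence starts right after it.
        name = lines[pos][:name_width].strip()
        sequence = "".join(lines[pos][name_width:].split())
        pos += 1
        # Continuation lines carry sequence data only.
        while len(sequence) < nchar:
            sequence += "".join(lines[pos].split())
            pos += 1
        if len(sequence) != nchar:
            raise ValueError(f"{name}: expected {nchar} characters, got {len(sequence)}")
        records.append((name, sequence))
    return ntax, nchar, records


# Names are padded with blanks to exactly 10 characters.
example = "\n".join([
    " 2 12",
    "ATISA1".ljust(10) + "GSPNLY",
    " " * 10 + "Q-GGRK",
    "POTISA1".ljust(10) + "GSPNLY",
    " " * 10 + "QKGGRK",
])
ntax, nchar, records = parse_phylip_sequential(example)
print(ntax, nchar, records)
```

If a name is not padded to the full width, the first residues of the sequence are read as part of the name and the length check fails.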

Interleaved (PHYLIPI):

     2   109
ATISA1    GSPNLYQ-GGRKPWHSINFICAHDGFTLADLVTYNNKNNLANGEENND
POTISA1   GSPNLYQKGGRKPWNSINFVCAHDGFTLADLVTYNNKHNLANGEDNKD
          GENHNYSWNCGEEGDFASISVKRLRKRQMRNFFVSLMVSQGVPMIYMG
          GENHNNSWNCGEEGEFASIFVKKLRKRQMRNFFLCLMVSQGVPMIYMG
          DEYGHTKGGN---
          DEYGHTKGGN---

Warning: Species names must not contain the characters `( ) : ; , [ ]' or exceed 10 characters.
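
These naming rules are easy to check before submitting a file. The sketch below encodes the PHYLIP rule stated in this warning, together with the PhyML rule quoted in the Phylip/Phyml section (up to 100 characters, no blanks, none of `( ) , :`). The function names are hypothetical and the checks cover nothing beyond these two rules.

```python
PHYLIP_FORBIDDEN = set("()[]:;,")
PHYML_FORBIDDEN = set("(),:")


def is_valid_phylip_name(name):
    """At most 10 characters, none of the characters ( ) [ ] : ; ,"""
    return 0 < len(name) <= 10 and not (set(name) & PHYLIP_FORBIDDEN)


def is_valid_phyml_name(name):
    """At most 100 characters, no blanks, none of the characters ( ) , :"""
    if not 0 < len(name) <= 100:
        return False
    return not any(char.isspace() or char in PHYML_FORBIDDEN for char in name)


for name in ["ATISA1", "MALK_ECOLI", "sp|P02914|MALK_ECOLI", "my seq"]:
    print(name, is_valid_phylip_name(name), is_valid_phyml_name(name))
```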

STOCKHOLM
# STOCKHOLM 1.0
MALK_ECOLI MASVQLQNVTKAWGEVVVSKDINLDIHEGEFVVFVGPSGCGKSTLLRMIA
MALK_SALTY MASVQLRNVTKAWGDVVVSKDINLDIHDGEFVVFVGPSGCGKSTLLRMIA
MALK_ENTAE MASVQLRNVTKAWGDVVVSKDINLEIQDGEFVVFVGPSGCGKSTLLRMIA
MALK_PHOLU MSSVTLRNVSKAYGETIISKNINLEIQEGEF-------------------
MALK_ECOLI RCHLFREDGTACRRLHKEPGV
MALK_SALTY RCHLFREDGSACRRLHQEPGV
MALK_ENTAE ---------------------
MALK_PHOLU ---------------------
//
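
In this example each sequence is split over two blocks, so a reader has to join the pieces that share a name. A minimal sketch, assuming a hypothetical `parse_stockholm` function that ignores every line starting with `#` and stops at the `//` terminator:

```python
def parse_stockholm(text):
    """Return {name: aligned sequence}, joining blocks that share a name."""
    sequences = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # header and markup lines are skipped in this sketch
        if line == "//":
            break  # end of the alignment
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"unexpected line: {line!r}")
        name, chunk = parts
        sequences[name] = sequences.get(name, "") + chunk
    return sequences


example = """# STOCKHOLM 1.0
MALK_ECOLI MASVQLQNVT
MALK_SALTY MASVQLRNVT

MALK_ECOLI KAWGEVVVSK
MALK_SALTY KAWGDVVVSK
//
"""
alignment = parse_stockholm(example)
for name, sequence in alignment.items():
    print(name, sequence)
```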
Benefits of registering, by H. Ménager.

What are the benefits of a registered account?

You can use the portal without registering at all. This means that you are welcome as a guest. The results of your jobs will be available for a limited time, even if you exit the portal for a short while. However, you will not be able to use them if you connect from another place.

Registration enables you to store your bookmarked data and results on the server, without any time limit (as long as we can provide enough disk space of course!). You can also connect using the same e-mail and password from any place.

How to register?

Click on the sign-in link located on the top right of the page, then click on register. Enter your e-mail, and pick a password (this password is specific to the portal). Once submitted, you will receive an e-mail requesting your confirmation for this registration. You can then save your data, and access them from anywhere.

Step by step tutorial
How to use Mobyle step by step, by H. Ménager.

Welcome to Mobyle, a portal to run bioinformatics analyses. This tutorial is interactive: it helps you to open forms, navigate through the portal and fill some fields. At any time, you can come back here by clicking on the Tutorials tab above.

Running an analysis

Let's start by running a first analysis: predicting the membrane topology of a protein with the toppred program. Clicking here opens the toppred form. The first field is for the protein sequence. You can paste the following data:

>sp|P02914|MALK_ECOLI MALTOSE/MALTODEXTRIN TRANSPORT ATP-BINDING PROTEIN MALK.
            MASVQLQNVTKAWGEVVVSKDINLDIHEGEFVVFVGPSGCGKSTLLRMIAGLETITSGDL
            FIGEKRMNDTPPAERGVGMVFQSYALYPHLSVAENMSFGLKPAGAKKEVINQRVNQVAEV
            LQLAHLLDRKPKALSGGQRQRVAIGRTLVAEPSVFLLDEPLSNLDAALRVQMRIEISRLH
            KRLGRTMIYVTHDQVEAMTLADKIVVLDAGRVAQVGKPLELYHYPADRFVAGFIGSPKMN
            FLPVKVTATAIDQVQVELPMPNRQQVWLPVESRDVQVGANMSLGIRPEHLLPSDIADVIL
            EGEVQVVEQLGNETQIHIQIPSIRQNLVYRQNDVVLVEEGATFAIGLPPERCHLFREDGT
            ACRRLHKEPGV
         
Then launch the program by clicking on the Run button.

When the job has finished, the results are available under the Jobs tab. The output starts with some details on the job, followed by the output files. You can return to the form at any time by clicking on its tab, at the top of the page.

Using the databox

You can input data either by copy/paste, file loading, database fetch, or selection of the output from a previous analysis. Let us try this with some examples.

- file loading

If you don't have any sequence file, you can save the following sequences into a file on your disk.
Then open the clustalw-multialign form and load your sequences file. Then run the program.

- database

Go back to our toppred form, hit the "DB" checkbox and enter MALK_ECOLI into the input field. This fetches the corresponding Swissprot entry for you.

- previous analysis

Open the protpars form and hit the "Result" checkbox. A menu then gives access to the sequences previously aligned by Clustalw. Note that you could also have input your alignment directly from the clustalw result: go to the alignment, select "protpars-infile" from the "pipe it to" menu below the box, and hit "go".

Retrieving results later

Results will be available as long as the server keeps the corresponding files. To access the results of a previously run job, use the "Jobs" drawer on the left.

Searching for programs

The left panel displays 4 "drawers" to help you navigate the portal. The first one provides a list of the available programs, classified by bioinformatics domain. By default, all programs are listed, but you can filter this long list by entering a search term in the "search for" field. For instance, enter the term "matrix"; notice that you don't need to hit return to start the search. Only the programs having the word "matrix" in their description will be displayed. When you mouse over the name of a program, the information bullet displays the matched text.