The MAPGAPS program version 1.0.1: July 27, 2009.  Written by:
   Andrew F. Neuwald
     The Institute for Genome Sciences and 
     the Department of Biochemistry & Molecular Biology,
     University of Maryland School of Medicine, 801 West Baltimore St.
     BioPark II, Room 617, Baltimore, MD 21201

Uncompress mapgaps1_0_1.tar.gz by typing "gunzip mapgaps1_0_1.tar.gz",
then untar the resulting file by typing "tar xvf mapgaps1_0_1.tar".

QUICK START: just run the 'run_example' script in the example directory to see how
    MAPGAPS works.   

DETAILS: The mapgaps1_0_1/bin directory contains the following MAPGAPS procedures, compiled 
    for 64-bit Linux (x86_64):

 fa2cma:  Procedure for converting a fasta-formatted multiple sequence alignment 
          into cma format, the format required to create an array of multiple
          sequence alignments, which is the type of input file required by the run_map
          procedure. Each cma file follows the convention used by PSI-BLAST, where the
          first sequence serves as a master sequence against which the remaining
          sequences are aligned. Thus each amino acid residue in the first sequence
          corresponds to a column in the alignment, whereas each gap character '-' in
          the first sequence corresponds to an insertion in the alignment. For these
          reasons, the first sequence is typically a consensus sequence.
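
          The master-sequence convention described above can be illustrated with a
          short Python sketch. It does not read or write cma files and is not part
          of MAPGAPS; the function names and the toy sequences are hypothetical, and
          it assumes the sequences are already aligned to equal length with '-' as
          the gap character.

```python
def classify_columns(master):
    """Label each column by what the master (first) sequence holds there.

    'match'  : the master has a residue, so this is a column of the alignment.
    'insert' : the master has '-', so residues here are insertions.
    """
    return ["insert" if ch == "-" else "match" for ch in master]


def split_by_master(master, aligned):
    """Split one aligned sequence into (match-column characters, inserted residues)."""
    if len(master) != len(aligned):
        raise ValueError("aligned sequences must have equal length")
    matched, inserted = [], []
    for m, a in zip(master, aligned):
        if m == "-":
            # Master gap: anything the other sequence has here is an insertion.
            if a != "-":
                inserted.append(a)
        else:
            # Master residue: this position belongs to an alignment column.
            matched.append(a)
    return "".join(matched), "".join(inserted)


# Hypothetical toy alignment; the first row plays the role of the consensus.
master = "MK-LV"
other = "MRALV"
print(classify_columns(master))
print(split_by_master(master, other))
```

          Using a consensus as the master keeps the number of master gaps, and
          therefore the number of positions treated as insertions, small.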

 press:   A simple routine for removing from a fasta file extraneous newline characters,
          which fa2cma input files must not contain.

             press command line syntax:  press < infile > outfile
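
          As a rough illustration of this kind of clean-up, the Python sketch below
          joins each sequence onto a single line and drops blank lines. It is not
          the press source code, and it assumes that "extraneous newline characters"
          means line breaks inside a sequence and empty lines between records; the
          function name press_fasta is hypothetical.

```python
def press_fasta(lines):
    """Yield fasta lines with each record's sequence joined onto one line."""
    seq_parts = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header: flush the sequence collected for the previous record.
            if seq_parts:
                yield "".join(seq_parts)
                seq_parts = []
            yield line
        elif line:
            # Non-empty sequence line: collect it; blank lines are skipped.
            seq_parts.append(line)
    if seq_parts:
        yield "".join(seq_parts)


# Hypothetical input with wrapped sequences and a stray blank line.
raw = [">seq1\n", "MKLV\n", "ACDE\n", "\n", ">seq2\n", "MRLV\n"]
for out_line in press_fasta(raw):
    print(out_line)
```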

 run_map: The MAP procedure for creating a multiple-profile alignment from an array of 
          cma-formatted multiple sequence alignments (<infile>.cma) and a corresponding 
          cma-formatted template alignment (<infile>.tpl). The template alignment consists 
          of multiple consensus sequences: a consensus sequence of the template itself 
          (the first sequence in the alignment) followed by other consensus sequences, 
          one for each of the alignments in the array of alignments. The order of the
          consensus sequences in the template must be the same as in the array, and the 
          first sequence in each of these alignments must be identical to the corresponding 
          consensus sequence in the template.  The MAP procedure outputs a multiple-profile 
          alignment file (<infile>.mpa), which is required by the GAPS procedure.

          The run_map procedure can also generate an array of 'excluded' profiles 
          from an optional input array of excluded profile alignments (<infile>.xpa). This 
          requires the command line option -exclude. The output excluded profiles and
          corresponding query sequences are placed into the files <infile>.xup and
          <infile>.xpq, respectively.
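
          The ordering and identity requirements on the run_map input files can be
          checked before running the program. The Python sketch below works on
          sequences already loaded into lists and does not parse cma files; it is
          not part of MAPGAPS. The names check_template and strip_gaps are
          hypothetical, and comparing sequences after removing '-' characters is an
          assumption about how "identical" should be interpreted.

```python
def strip_gaps(seq):
    """Remove gap characters so that only residues are compared (an assumption)."""
    return seq.replace("-", "")


def check_template(template_seqs, alignments):
    """Return a list of problems; an empty list means the inputs look consistent.

    template_seqs : sequences of the template alignment; the first is the
                    template's own consensus, the rest are one per alignment.
    alignments    : the array of alignments, each a list of sequences whose
                    first entry is that alignment's consensus.
    """
    problems = []
    consensus = template_seqs[1:]  # skip the template's own consensus
    if len(consensus) != len(alignments):
        problems.append(
            f"template lists {len(consensus)} consensus sequences "
            f"but the array holds {len(alignments)} alignments"
        )
    # Compare position by position, so a wrong order is reported as a mismatch.
    for i, (cons, aln) in enumerate(zip(consensus, alignments), start=1):
        if not aln or strip_gaps(aln[0]) != strip_gaps(cons):
            problems.append(
                f"alignment {i}: first sequence differs from template consensus {i}"
            )
    return problems


# Hypothetical toy data: a template with two family consensus sequences.
template = ["MKLVACDE", "MKLV", "ACDE"]
array = [["MKLV", "MRLV"], ["ACDE", "ACDQ"]]
print(check_template(template, array))
```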

 run_gaps: GAPS procedure for searching either a small set or an entire database of 
          protein sequences for matches against an input multiple-profile alignment
          (defined by the <infile>.mpa + <infile>.tpl files), which serves as the query.
          It can also search a database using only a template alignment, in which case
          it will create a *.mpa file that can be used for a subsequent search. 

 cma2fa:  Procedure for converting a cma-formatted multiple sequence alignment into 
          fasta format. This is useful for converting the output alignment created by 
          the run_gaps program into standard fasta format.

 cma2rtf: Routine for converting a cma file into a rich text format file, which can be 
          viewed using MS Word. 

 run_convert: Routine for mapping a family template file onto a superfamily template file.

The script "run_example" (in the example directory) demonstrates the use of all of these 
         procedures (except for the press routine).