Help - Abalign Web Server

Help / Documentation

How to use the Abalign Web Server

Abalign is a comprehensive multiple sequence alignment platform for B-cell receptor immune repertoires. Follow these steps to analyze your antibody sequences:

Step-by-Step Guide

Input Sequences:
- Paste your antibody sequences in FASTA format into the text area, OR
- Upload a FASTA file containing your sequences
- Click "Load Example" to see the required format
- Supports both protein and nucleotide sequences
Configure Parameters:
- Task Name: Optional name to identify your analysis
- Numbering Scheme: IMGT, Kabat, Chothia, or Martin
- Species: Select from more than 20 supported species, including human, mouse, and macaque
- Chain Type: Heavy or Light chain
- Nucleotide Sequence: Check this box if the input is nucleotide (sequences will be translated to protein)
- Sequence Deduplication: Remove duplicate sequences
Clonotype Parameters (Optional):
- CDR3 Identity Threshold: The minimum CDR3 sequence identity required to group sequences into the same clonotype (Range: 0-1, Default: 1.0). A value of 1.0 means CDR3 sequences must be identical, while lower values (e.g., 0.6) allow sequences with at least 60% CDR3 identity to be grouped together
- Shannon Entropy Threshold: The Shannon entropy cutoff applied at each CDR3 position (Range: 0-10, Default: 1.0). Positions with entropy above this threshold are represented with ? in the consensus sequence, so a lower threshold masks more positions and a higher threshold masks fewer
Notification (Optional): Provide an email address to be notified when the analysis completes
Submit Analysis: Click "Run Abalign" to start processing
Results: You'll be redirected to a processing page. Bookmark the result URL for later access
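
To make the two clonotype parameters more concrete, here is a minimal Python sketch of how a CDR3 identity threshold and a Shannon entropy threshold could behave. It is an illustration only, not Abalign's implementation: the function names are hypothetical, identity is assumed to be position-wise over equal-length CDR3s, grouping is assumed to be greedy against the first member of each group, and entropy is assumed to be measured in bits (log base 2).

```python
import math
from collections import Counter


def cdr3_identity(a: str, b: str) -> float:
    """Fraction of matching positions; 0.0 when lengths differ (assumption)."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def group_by_cdr3(cdr3s: list[str], threshold: float = 1.0) -> list[list[str]]:
    """Greedy grouping: a sequence joins the first group whose first member
    it matches with identity >= threshold, otherwise it starts a new group."""
    groups: list[list[str]] = []
    for seq in cdr3s:
        for group in groups:
            if cdr3_identity(seq, group[0]) >= threshold:
                group.append(seq)
                break
        else:
            groups.append([seq])
    return groups


def shannon_entropy(column: str) -> float:
    """Shannon entropy in bits of the residues in one alignment column."""
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in Counter(column).values())


def masked_consensus(cdr3s: list[str], entropy_threshold: float = 1.0) -> str:
    """Most common residue per position, or '?' where entropy exceeds the threshold."""
    consensus = []
    for column in zip(*cdr3s):
        col = "".join(column)
        if shannon_entropy(col) > entropy_threshold:
            consensus.append("?")
        else:
            consensus.append(Counter(col).most_common(1)[0][0])
    return "".join(consensus)


if __name__ == "__main__":
    cdr3s = ["AKDRLG", "AKDRLG", "AKDRLA", "ARGPYY"]
    print(len(group_by_cdr3(cdr3s, threshold=1.0)))  # identical CDR3s only
    print(len(group_by_cdr3(cdr3s, threshold=0.8)))  # tolerates one mismatch in six
    print(masked_consensus(["AKDA", "AKDC", "AKDE", "AKDG"]))
```

In this sketch, lowering the identity threshold merges near-identical CDR3s into one group, and a position where four different residues are equally frequent (2 bits of entropy) is masked at the default entropy threshold of 1.0.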

Input Format

FASTA Format Example

>Seq1_Heavy
EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMSWVRQAPGKGLEWVSAISGSGGSTYYADSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCAKDRLGRGNFDYWGQGTLVTVSS
>Seq2_Heavy
EVQLVETGGGLVQPGGSLRLSCAASGFTFSDYYMYWVRQAPGKGLEWVSAINSGGRSTYYPDSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCARGPYYAMDYWGQGTLVTVSS

Requirements

Standard FASTA format with sequence headers starting with ">"
Both protein and nucleotide sequences are supported
Maximum file size: 16 MB
No limit on the number of sequences, but processing time increases with input size
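
Before uploading, it can help to check a file against these requirements locally. The following is a minimal Python sketch, not part of Abalign: `parse_fasta` and `check_upload` are hypothetical helper names, and the 16 MB limit is assumed to mean 16 × 1024 × 1024 bytes.

```python
MAX_BYTES = 16 * 1024 * 1024  # 16 MB limit (binary megabytes assumed)


def parse_fasta(text: str) -> list[tuple[str, str]]:
    """Return (header, sequence) pairs; raise ValueError on malformed input."""
    records: list[tuple[str, str]] = []
    header = None
    chunks: list[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines are ignored
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    empty = [h for h, s in records if not s]
    if empty:
        raise ValueError(f"records without sequence: {empty}")
    return records


def check_upload(text: str) -> list[tuple[str, str]]:
    """Apply the size limit, then parse the FASTA content."""
    if len(text.encode("utf-8")) > MAX_BYTES:
        raise ValueError("input exceeds the 16 MB upload limit")
    return parse_fasta(text)


if __name__ == "__main__":
    example = ">Seq1_Heavy\nEVQLVESGGG\nLVQPGGSLRL\n>Seq2_Heavy\nEVQLVETGGG\n"
    for name, seq in check_upload(example):
        print(name, len(seq))
```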

Output Files Description

Upon completion, you will receive a ZIP file containing multiple analysis results:

Universal Output Files (All Analyses)

File Name	Description
`alignment_output.fas`	Multiple sequence alignment of antibody variable regions
`alignment_output.fas.overview.csv`	Main results file - Comprehensive sequence information and statistics
`alignment_output.fas.temp.txt`	Antibody alignment separated by framework regions (FRs) and complementarity determining regions (CDRs) with * delimiters
`alignment_output.fas.number.txt`	MSA unified column numbering file based on antibody numbering scheme, providing standardized position numbers for each column in the alignment
`v_gene_info.txt`	V and J gene assignment information
`v_gene_info.txt.vabundance.txt`	V and J gene abundance statistics
`v_gene_info.txt.tempv.txt`	V and J gene temporary analysis file
`v_gene_info.txt.clonotype.csv`	Clonotype analysis results
`v_gene_info.txt.clonotype_index.csv`	Clonotype and corresponding sequence indices
`v_gene_info.txt.clonotype_seqs.csv`	Comprehensive clonotype and sequence information
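
As an illustration of working with the `*`-delimited alignment in `alignment_output.fas.temp.txt`, the Python sketch below splits one aligned sequence into named regions. The exact file layout is not specified here, so this assumes each aligned sequence contains exactly seven segments in the conventional order FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4; `split_regions` is a hypothetical helper and may need adjusting to the actual file.

```python
# Conventional order of variable-domain regions (assumed to match the file).
REGION_NAMES = ["FR1", "CDR1", "FR2", "CDR2", "FR3", "CDR3", "FR4"]


def split_regions(aligned: str) -> dict[str, str]:
    """Split one '*'-delimited aligned sequence into named regions."""
    parts = aligned.strip().split("*")
    if len(parts) != len(REGION_NAMES):
        raise ValueError(
            f"expected {len(REGION_NAMES)} segments, found {len(parts)}"
        )
    return dict(zip(REGION_NAMES, parts))


if __name__ == "__main__":
    # Shortened, made-up segments for demonstration only.
    regions = split_regions("EVQLV*GFTFS*MSWVR*ISGSG*YYADS*AKDRL*WGQGT")
    print(regions["CDR3"])
```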

Additional Files for Nucleotide Input

File Name	Description
`translated_protein.fasta`	Protein sequences translated from nucleotide input using six reading frames
`alignment_output.fas.pro`	Multiple sequence alignment of translated protein variable regions
`alignment_output.fas.temp.txt.pro`	Protein alignment separated by FRs and CDRs with * delimiters
`alignment_output.fas.number.txt.pro`	MSA unified column numbering file for protein sequences based on antibody numbering scheme, providing standardized position numbers for each column in the protein alignment
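
For readers unfamiliar with six-frame translation, the Python sketch below shows the general technique: three reading frames on the forward strand and three on the reverse complement, translated with the standard genetic code. It is a generic illustration, not Abalign's code; the function names are hypothetical, and how Abalign chooses among the six frames is not described here.

```python
# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict[str, str]:
    """Return translations keyed by frame: +1, +2, +3, -1, -2, -3."""
    forward = dna.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]  # reverse complement
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(forward[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translations("GAGGTGCAGCTGGTGGAG").items():
        print(frame, protein)
```

Stop codons appear as `*` in the output, so a frame that reads through a variable region without any `*` is usually the plausible one.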

Recommendation: Start with alignment_output.fas.overview.csv for comprehensive sequence analysis results.
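
A downloaded results archive can also be inspected programmatically. The Python sketch below reads the overview CSV directly from the ZIP using only the standard library. `read_overview` and the `results.zip` path are hypothetical, the column names are not documented here so the code just reports whatever header the file contains, and UTF-8 encoding is an assumption.

```python
import csv
import io
import zipfile

OVERVIEW_NAME = "alignment_output.fas.overview.csv"


def read_overview(zip_source) -> list[dict[str, str]]:
    """Return the rows of the overview CSV found inside the results ZIP."""
    with zipfile.ZipFile(zip_source) as archive:
        # Match on the base name in case the archive nests files in a folder.
        member = next(
            (n for n in archive.namelist() if n.rsplit("/", 1)[-1] == OVERVIEW_NAME),
            None,
        )
        if member is None:
            raise FileNotFoundError(f"{OVERVIEW_NAME} not found in archive")
        with archive.open(member) as handle:
            text = io.TextIOWrapper(handle, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


if __name__ == "__main__":
    import os

    path = "results.zip"  # hypothetical path to a downloaded archive
    if os.path.exists(path):
        rows = read_overview(path)
        print(f"{len(rows)} rows")
        if rows:
            print("columns:", list(rows[0].keys()))
```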

Processing Information

Processing Time: Varies with the number and length of sequences (typically 1-10 minutes)
Concurrent Jobs: A maximum of 4 jobs can run simultaneously
Results Storage: Results are stored temporarily (for 14 days, see the data retention policy below) and can be downloaded once the job completes
Status Monitoring: The processing page automatically updates when your job is complete

Note: Make sure to bookmark or save the result URL provided on the processing page, as it contains your unique job identifier.

Important Data Retention Policy

Your uploaded data is not permanently stored: input files are deleted immediately after processing
Results are available for 14 days only: all result files are automatically deleted after 14 days
Download your results promptly: we recommend downloading your results immediately after completion
No data recovery: once deleted, results cannot be recovered