Help / Documentation
How to use the Abalign Web Server
Abalign is a comprehensive multiple sequence alignment platform for B-cell receptor immune repertoires. Follow these steps to analyze your antibody sequences:
Step-by-Step Guide
- Input Sequences:
  - Paste your antibody sequences in FASTA format into the text area, OR
  - Upload a FASTA file containing your sequences
  - Click "Load Example" to see the required format
  - Both protein and nucleotide sequences are supported
- Configure Parameters:
  - Task Name: Optional name to identify your analysis
  - Numbering Scheme: IMGT, Kabat, Chothia, or Martin
  - Species: Select from 20+ supported species, including human, mouse, and macaque
  - Chain Type: Heavy or Light chain
  - Nucleotide Sequence: Check this box if the input is nucleotide (sequences will be translated to protein)
  - Sequence Deduplication: Remove duplicate sequences
- Clonotype Parameters (Optional):
  - CDR3 Identity Threshold: Sets the CDR3 sequence identity required to group sequences into the same clonotype (Range: 0-1, Default: 1.0). A value of 1.0 means CDR3 sequences must be identical, while a lower value (e.g., 0.6) allows sequences with at least 60% CDR3 identity to be grouped together
  - Shannon Entropy Threshold: Sets the threshold for Shannon entropy at each CDR3 position (Range: 0-10, Default: 1.0). Positions with entropy below this threshold will be represented with ? in the consensus sequence. Lower values provide more stringent filtering
- Notification (Optional): Provide an email address to be notified when the job completes
- Submit Analysis: Click "Run Abalign" to start processing
- Results: You'll be redirected to a processing page. Bookmark the result URL for later access
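To make the two clonotype thresholds more concrete, the following Python sketch computes a pairwise CDR3 identity and a per-position Shannon entropy. It is an illustration under stated assumptions, not Abalign's own implementation: identity is taken as the fraction of matching positions between equal-length CDR3s, entropy is computed in bits (log base 2), and the function names are hypothetical.

```python
import math
from collections import Counter


def cdr3_identity(a: str, b: str) -> float:
    """Fraction of matching positions between two CDR3 sequences.

    Assumption: CDR3s of different lengths (or empty ones) score 0.0.
    """
    if len(a) != len(b) or not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def column_entropy(column: str) -> float:
    """Shannon entropy, in bits, of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def position_entropies(cdr3s: list[str]) -> list[float]:
    """Entropy at each position across a set of equal-length CDR3s."""
    return [column_entropy("".join(col)) for col in zip(*cdr3s)]
```

Under these assumptions, `cdr3_identity("ARDYW", "ARGYW")` is 0.8, so the pair would fall into the same clonotype at a threshold of 0.6 but not at the default of 1.0. A fully conserved position has an entropy of 0, and a position split evenly between two residues has an entropy of 1 bit.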
Input Format
FASTA Format Example
>Seq1_Heavy
EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMSWVRQAPGKGLEWVSAISGSGGSTYYADSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCAKDRLGRGNFDYWGQGTLVTVSS
>Seq2_Heavy
EVQLVETGGGLVQPGGSLRLSCAASGFTFSDYYMYWVRQAPGKGLEWVSAINSGGRSTYYPDSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCARGPYYAMDYWGQGTLVTVSS
Requirements
- Standard FASTA format with sequence headers starting with ">"
- Both protein and nucleotide sequences are supported
- Maximum file size: 16 MB
- No limit on number of sequences, but processing time increases with size
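Before uploading, it can be useful to check a file against these requirements locally. The sketch below is one possible pre-flight check, not part of Abalign: the function names are hypothetical, and it assumes the 16 MB limit means 16 × 1024 × 1024 bytes.

```python
import os

# Assumption: "16 MB" is interpreted as 16 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 16 * 1024 * 1024


def parse_fasta(text: str) -> list[tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_upload(path: str) -> list[tuple[str, str]]:
    """Check file size and FASTA structure before uploading."""
    if os.path.getsize(path) > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 16 MB upload limit")
    with open(path, encoding="utf-8") as handle:
        records = parse_fasta(handle.read())
    empty = [name for name, seq in records if not seq]
    if empty:
        raise ValueError(f"records without sequence: {empty}")
    return records
```

Calling `check_upload("my_sequences.fasta")` (a placeholder path) returns the parsed records, or raises `ValueError` if the file is too large or malformed.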
Output Files Description
Upon completion, you will receive a ZIP file containing multiple analysis results:
Universal Output Files (All Analyses)
File Name | Description |
---|---|
alignment_output.fas | Multiple sequence alignment of antibody variable regions |
alignment_output.fas.overview.csv | Main results file - Comprehensive sequence information and statistics |
alignment_output.fas.temp.txt | Antibody alignment separated by framework regions (FRs) and complementarity determining regions (CDRs) with * delimiters |
alignment_output.fas.number.txt | MSA unified column numbering file based on antibody numbering scheme, providing standardized position numbers for each column in the alignment |
v_gene_info.txt | V and J gene assignment information |
v_gene_info.txt.vabundance.txt | V and J gene abundance statistics |
v_gene_info.txt.tempv.txt | V and J gene temporary analysis file |
v_gene_info.txt.clonotype.csv | Clonotype analysis results |
v_gene_info.txt.clonotype_index.csv | Clonotype and corresponding sequence indices |
v_gene_info.txt.clonotype_seqs.csv | Comprehensive clonotype and sequence information |
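As an illustration of how the * delimiters in alignment_output.fas.temp.txt might be used, the Python sketch below splits one delimited sequence into named regions. The exact file layout is not specified here, so this is an assumption-laden example: it assumes each aligned sequence contains exactly seven segments in the order FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4, and the function name is hypothetical. Check a real output file before relying on it.

```python
# Assumption: regions appear in this order, separated by "*".
REGION_NAMES = ["FR1", "CDR1", "FR2", "CDR2", "FR3", "CDR3", "FR4"]


def split_regions(aligned: str) -> dict[str, str]:
    """Split one '*'-delimited aligned sequence into named regions."""
    parts = aligned.strip().split("*")
    if len(parts) != len(REGION_NAMES):
        raise ValueError(
            f"expected {len(REGION_NAMES)} regions, got {len(parts)}"
        )
    return dict(zip(REGION_NAMES, parts))
```

With this helper, `split_regions(line)["CDR3"]` would return the CDR3 segment of a line, gap characters included.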
Additional Files for Nucleotide Input
File Name | Description |
---|---|
translated_protein.fasta | Protein sequences translated from nucleotide input using six reading frames |
alignment_output.fas.pro | Multiple sequence alignment of translated protein variable regions |
alignment_output.fas.temp.txt.pro | Protein alignment separated by FRs and CDRs with * delimiters |
alignment_output.fas.number.txt.pro | MSA unified column numbering file for protein sequences based on antibody numbering scheme, providing standardized position numbers for each column in the protein alignment |
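For readers unfamiliar with six-frame translation, the Python sketch below shows the general idea: a nucleotide sequence is translated at three offsets on the forward strand and three on the reverse complement. It uses the standard genetic code and is only a generic illustration. How Abalign selects the correct frame is not described here, and the function names and frame labels (`+1` to `-3`) are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate from position 0; unknown codons become 'X',
    stop codons become '*', and trailing bases are ignored."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict[str, str]:
    """Translate a nucleotide sequence in all six reading frames."""
    dna = dna.upper().replace("U", "T")
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

For example, `six_frame_translations("ATGGCC")` gives `"MA"` for frame `+1` and `"GH"` for frame `-1`.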
Recommendation: Start with alignment_output.fas.overview.csv for comprehensive sequence analysis results.
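The overview file can be read directly from the downloaded ZIP with Python's standard library. The sketch below is a minimal example under stated assumptions: the function name is hypothetical, the file is assumed to be UTF-8 encoded with a header row, and it is matched by name suffix in case the archive places files inside a subfolder. The column names are not documented here, so the code returns whatever columns the file contains.

```python
import csv
import io
import zipfile

OVERVIEW_NAME = "alignment_output.fas.overview.csv"


def read_overview(zip_source) -> list[dict[str, str]]:
    """Return the rows of the overview CSV from a results ZIP.

    zip_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        # Assumption: match by suffix so a leading folder name is tolerated.
        matches = [n for n in archive.namelist() if n.endswith(OVERVIEW_NAME)]
        if not matches:
            raise FileNotFoundError(f"{OVERVIEW_NAME} not found in archive")
        with archive.open(matches[0]) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))
```

Calling `read_overview("results.zip")` (a placeholder file name) returns one dictionary per sequence row, keyed by the CSV's own column headers.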
Processing Information
- Processing Time: Varies based on sequence number and length (typically 1-10 minutes)
- Concurrent Jobs: Maximum 4 jobs can run simultaneously
- Results Storage: Results are stored temporarily (for 14 days, see the data retention policy) and can be downloaded once the job is complete
- Status Monitoring: The processing page automatically updates when your job is complete
Note: Make sure to bookmark or save the result URL provided on the processing page, as it contains your unique job identifier.
Important Data Retention Policy
- Your uploaded data is not permanently stored - Input files are deleted immediately after processing
- Results are available for 14 days only - All result files will be automatically deleted after 14 days
- Download your results promptly - We recommend downloading your results immediately after completion
- No data recovery - Once deleted, results cannot be recovered
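If you track several jobs, a small helper can tell you when each result set will expire. The Python sketch below simply adds the 14-day retention period to a completion time. It assumes the 14 days are counted from job completion, which is not stated explicitly above, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Retention period stated in the policy above.
RETENTION = timedelta(days=14)


def results_expire_at(completed_at: datetime) -> datetime:
    """Assumption: the 14 days are counted from job completion."""
    return completed_at + RETENTION


def is_expired(completed_at: datetime, now: datetime) -> bool:
    """True once the retention period has fully elapsed."""
    return now >= results_expire_at(completed_at)
```

For a job completed at `datetime(2024, 1, 1, tzinfo=timezone.utc)`, `results_expire_at` returns 15 January 2024 at the same time of day.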