# INTEGRATE-Neo **Repository Path**: scnet-lib/INTEGRATE-Neo ## Basic Information - **Project Name**: INTEGRATE-Neo - **Description**: No description available - **Primary Language**: Unknown - **License**: Not specified - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2023-10-24 - **Last Updated**: 2023-10-24 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # INTEGRATE-Neo INTEGRATE-Neo is a gene fusion neoantigen discovering tool using next-generation sequencing data. It is written in C++ and Python. - Python - Perl - awk - GCC If not, please install these languages or tools. You may also need to install some prerequisite tools: - [BWA](https://sourceforge.net/projects/bio-bwa) - [HLAminer v1.3](http://www.bcgsc.ca/platform/bioinfo/software/hlaminer) - [NetMHC v4.0](http://www.cbs.dtu.dk/services/NetMHC/output.php) HLAminer and NetMHC are also included in the vendor directory here. To compile the C++ part of this pipeline, you may need to install [CMAKE](https://cmake.org/) ### Installation Download INTEGRATE-Neo at https://github.com/ChrisMaherLab/INTEGRATE-Neo. Run the installation script: ```sh $ cd INTEGRATE-Neo-V-1.2.0 $ chmod +x install.sh $ ./install.sh -o /opt/bin/ ``` Note that you can choose wherever you like to install the software. It can be different from "/opt/bin/". Now you have installed: - integrate-neo together with the modules of integrate-neo that can be used as standalone tools: - fusionBedpeAnnotator - fusionBedpeSubsetter - runHLAminer - HLAminerToTsv - runAddNetMHC4Result - runNetMHC4WithSMCRNABedpe A setup.ini and a rule.txt file are also at your destination directory now. If you don't like them to be there, copy them to the place you like. But remember to use the --setup-file and --rule-file options to run integrate-neo if you moved them. ### setup Remember to edit the setup.ini file before your first running the pipeline. The one in the installation packages are using example paths like "/SOME/PATH/...". For the HLAminer reference HLA alleles, i.e. HLA_ABC_CDS.fasta, remember to index it with bwa before the first run. ### input If you type the following (or python ./integrate-neo.py --help): ```sh $ ./integrate-neo.py ``` you can see the 14 parameters and explanations. The following are the required options: -1/--fastq1 -2/--fastq2 -f/--fusion-bedpe -r/--reference -g/--gene-model The --fastq[1/2] and --reference options are clear enough, the FASTQ and FASTA formats for sequencing reads and human reference genome. The --fusion-bedpe option requires a BEDPE format for gene fusions. This BEDPE format follows the standardized format provided by The ICGC-TCGA DREAM Somatic Mutation Calling - RNA Challenge ([SMC-RNA](http://dreamchallenges.org/)). The --gene-model option requires a gene annotation genePhred file. Download the gtf file from Ensembl: GRCh37, e.g., v75: ftp://ftp.ensembl.org/pub/release-75/gtf/homo_sapiens/Homo_sapiens.GRCh37.75.gtf.gz GRCh38, e.g., v86: ftp://ftp.ensembl.org/pub/release-86/gtf/homo_sapiens/Homo_sapiens.GRCh38.86.gtf.gz and run the following command for v75: ```sh $ gunzip Homo_sapiens.GRCh37.75.gtf.gz $ ./gtfToGenePred -genePredExt -geneNameAsName2 Homo_sapiens.GRCh37.75.gtf Homo_sapiens.GRCh37.75.genePred ``` for v86: ```sh $ gunzip Homo_sapiens.GRCh38.86.gtf.gz. $ ./gtfToGenePred -genePredExt -geneNameAsName2 Homo_sapiens.GRCh38.86.gtf Homo_sapiens.GRCh38.86.genePred ``` FASTA files can also be downloaded at Ensembl: v75: ftp://ftp.ensembl.org/pub/release-75/fasta/homo_sapiens/dna/Homo_sapiens.GRCh37.75.dna_sm.primary_assembly.fa.gz v86: ftp://ftp.ensembl.org/pub/release-86/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz ### output The output is in BEDPE format, the first 11 columns follows the SMC-RNA format. columns 12-19 are: - Epitope sequence - Epitope Affinity (nanoMolar) - HLA allele - HLA category - HLA score - HLA e-value - HLA confidence ### Important The chromosome names in the reference genome, the gene models, and the fusions should be consistent. ### Examples Examples are provided for you to test the code. ### Enjoy! ### Release notes: 12-23-2016: INTEGRATE-Neo v 1.2.0 updated BedpeAnnotator to v 0.2.0, which includes a new column for transcript Ids, a new column for lengths of nucleotides in the coding regions at 5p transcripts, a new column for whether the peptides are in-frame, and a new column for whether the fusion transcript follows canonical dinucleotides. 01-17-2017: INTEGRATE-Neo v 1.2.1 updated BedpeAnnotator to v 0.2.1, which includes a bug fixing.