# iLLMAC **Repository Path**: gausshuang/iLLMAC ## Basic Information - **Project Name**: iLLMAC - **Description**: No description available - **Primary Language**: Unknown - **License**: Not specified - **Default Branch**: main - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2024-10-12 - **Last Updated**: 2024-10-12 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # iLLMAC: instruction-tuned LLM for Assessment of Cancer ## Introduction Inspired by the success of large language model in natural language understanding, we herein present a LLM-based model – instruction-tuned LLM for Assessment of Cancer (iLLMAC) – that can detect cancer using cfDNA end-motif profiles. We developed this model with cfDNA sequencing data curated from 2451 individuals. The sequencing modalities include whole genome sequencing, bisulfite sequencing and 5-hydroxymethylcytosine sequencing. We evaluated the performance of the model in the diagnosis of cancer and detection of HCC with internal- and external-testing sets. We demonstrated that iLLMAC is able to achieve high detection accuracy on different modalities of cfDNA data. Besides the development of iLLMAC, our study presents a new paradigm for cfDNA-based cancer diagnosis. ## System requirements - Operating systems: CentOS 7. - [Python](https://docs.conda.io/en/latest/miniconda.html) (version == 3.7). - [PyTorch](https://pytorch.org) (version == 1.13.1+cu116). - [transformers](https://huggingface.co/docs/transformers/index) (version == 4.28.1). This example was tested with the following environment. However, it should work on the other platforms. ## Installation guide - Following instruction from [miniconda](https://docs.conda.io/en/latest/miniconda.html) to install Python. - Use the following command to install required packages. ```bash # Install with GPU support. Check https://pytorch.org for more information. #+The following cmd install PyTorch compiled with cuda 118. pip install torch --index-url https://download.pytorch.org/whl/cu118 # If GPU not available, install the PyTorch compiled for CPU. pip install torch --index-url https://download.pytorch.org/whl/cpu # Install transformers, tokenizers and prettytable pip install transformers==4.28.1 tokenizers==0.13.3 prettytable ``` - The installation process will take about an hour. This heavily depends on your network bandwidth. ## Demo - Clone `iLLMAC` locally from Github ```bash git clone https://github.com/deeplearningplus/iLLMAC.git gzip -d ./data/* ``` - Instructions to train iLLMAC: ```bash bash iLLMAC_train.sh ``` The trained model will be saved in `out` when the above command finishes running. We uploaded a pretrained model in [BaiduDisk](https://pan.baidu.com/s/1ZjZTFRkdpbOUsfFjCtH3Mg?pwd=1234)(Password:1234) for this tutorial. Download it, and put it in `llama/7b-32` folder. - Instructions to evaluate iLLMAC: ```bash bash iLLMAC_predict.sh ``` ## How to run on your own data prepare the training data in the same format as `data/train_data_points-v2.1-64.json` and run `iLLMAC_train.sh`.