# TableLoRA: Low-rank Adaptation on Table Structure Understanding for Large Language Models

Tabular data are crucial in many fields, and their understanding by large language models (LLMs) under a high parameter-efficiency paradigm is important. However, directly applying parameter-efficient fine-tuning (PEFT) techniques to tabular tasks presents significant challenges, particularly in terms of better table serialization and the representation of two-dimensional structured information within a one-dimensional sequence. To address this, we propose TableLoRA, a module designed to improve LLMs' understanding of table structure during PEFT. It incorporates special tokens for serializing tables with a special token encoder and uses 2D LoRA to encode low-rank information on cell positions. Experiments on four table-related datasets demonstrate that TableLoRA consistently outperforms vanilla LoRA and surpasses various table encoding methods tested in control experiments. These findings reveal that TableLoRA, as a table-specific LoRA, enhances the ability of LLMs to process tabular data effectively, especially in low-parameter settings, demonstrating its potential as a robust solution for handling table-related tasks.

## Quick Start

Our implementation builds upon the [Llama Factory](https://github.com/hiyouga/LLaMA-Factory/tree/main) framework with the following key modifications:

1. **Data Preprocessing**:
   - Added special tokens to the input data before training in Llama Factory (a serialization sketch is shown after this list).
   - Use the code from the [`table_preprocess`](table_preprocess/) directory:

   ```shell
   python table_preprocess.py --dataset_name --prompt_tuning True
   ```

2. **Table LoRA Data Processing**:
   - Modified the table_lora processing to recognize special tokens and include `position_id` as a model input (see the position-id sketch after this list).
   - Replace the corresponding functions in `src/llamafactory/data/` with our implementations in the [`data`](data/) folder.

3. **Table LoRA Integration**:
   - Extended PEFT to support table LoRA.
   - Add the code from our [`table_lora`](table_lora/) directory to `src/llamafactory/`.
   - Initialize with this code at entry points (see the entry-point example after this list):

   ```python
   from table_lora.table_lora import load_table_lora
   load_table_lora()
   ```

After implementing these changes, you can train table LoRA using the modified Llama Factory framework.
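As a rough illustration of step 1, the sketch below flattens a table into a one-dimensional string with structure-marking special tokens. The token strings (`<TAB>`, `<SEP_ROW>`, `<SEP_CELL>`) and the `serialize_table` helper are assumptions made for this example, not the exact format produced by `table_preprocess.py`; consult the scripts in `table_preprocess/` for the actual serialization.

```python
# Illustrative sketch only: the special-token names and layout below are
# assumptions, not necessarily the format produced by table_preprocess.py.

def serialize_table(header: list[str], rows: list[list[str]]) -> str:
    """Flatten a 2D table into a 1D string, marking structure with special tokens."""
    parts = ["<TAB>", "<SEP_ROW>"]         # hypothetical table-start and row tokens
    for cell in header:
        parts.append(f"<SEP_CELL>{cell}")  # hypothetical cell-boundary token
    for row in rows:
        parts.append("<SEP_ROW>")
        for cell in row:
            parts.append(f"<SEP_CELL>{cell}")
    parts.append("</TAB>")
    return " ".join(parts)


if __name__ == "__main__":
    print(serialize_table(["Name", "Year"], [["Alice", "2021"], ["Bob", "2022"]]))
```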
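For step 2, the intent is that each token carries a (row, column) coordinate so the model can recover the 2D structure from the 1D sequence. The sketch below derives such coordinates by scanning the serialized string for the same hypothetical special tokens; the real implementation in the [`data`](data/) folder operates on token ids inside Llama Factory's preprocessing, so treat this purely as an illustration of the mapping.

```python
# Illustrative sketch: assign a (row, col) pair to each whitespace-separated
# piece of the serialized table. The special-token names are assumptions.

def table_position_ids(serialized: str) -> list[tuple[int, int]]:
    """Return one (row, col) coordinate per piece; (0, 0) marks non-cell tokens."""
    row, col = 0, 0
    positions = []
    for piece in serialized.split():
        if piece == "<SEP_ROW>":
            row, col = row + 1, 0          # new row, reset column counter
        elif piece.startswith("<SEP_CELL>"):
            col += 1                       # next cell in the current row
        elif piece in ("<TAB>", "</TAB>"):
            row, col = 0, 0                # outside any cell
        positions.append((row, col))       # plain text keeps its cell's coordinate
    return positions
```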
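For step 3, `load_table_lora()` should run before the model is built so that the extended PEFT logic is registered. Where exactly to call it depends on how you launch Llama Factory; the example below assumes the standard `src/train.py` launcher, and the `run_exp` import path reflects recent Llama Factory releases and may differ in your version.

```python
# src/train.py (sketch): register table LoRA before launching training.
# The run_exp import path is an assumption about your Llama Factory version.
from llamafactory.train.tuner import run_exp

from table_lora.table_lora import load_table_lora


def main():
    load_table_lora()  # patch PEFT with table LoRA before the model is created
    run_exp()


if __name__ == "__main__":
    main()
```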
## Citation

If you find this repository useful, please consider giving ⭐ or citing:

```
@inproceedings{he2025tablelora,
    title = "{T}able{L}o{RA}: Low-rank Adaptation on Table Structure Understanding for Large Language Models",
    author = "He, Xinyi and Liu, Yihao and Zhou, Mengyu and He, Yeye and Dong, Haoyu and Han, Shi and Yuan, Zejian and Zhang, Dongmei",
    editor = "Che, Wanxiang and Nabende, Joyce and Shutova, Ekaterina and Pilehvar, Mohammad Taher",
    booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.acl-long.1090/",
    pages = "22376--22391",
    ISBN = "979-8-89176-251-0",
}
```

## Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.

## Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow [Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general). Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.