# texera

**Repository Path**: mirrors_apache/texera

## Basic Information

- **Project Name**: texera
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-08-07
- **Last Updated**: 2025-08-10

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

Texera - Collaborative Data Science and AI/ML Using Workflows

Texera supports scalable data computation and enables advanced AI/ML techniques.
"Collaboration" is a key focus, and we enable an experience similar to Google Docs, but for data science.

Official Site | Publications | Video | Blog | Getting Started


# Goals

* Provide data science as cloud services;
* Provide a browser-based GUI to form a workflow without writing code;
* Allow non-IT people to access data science;
* Support collaborative data science;
* Allow users to interact with the execution of a job;
* Support huge volumes of data efficiently.

# Workflow GUI

The Texera interface supports real-time collaboration on data science projects, allowing seamless sharing of data and workflows with easy access to AI/ML techniques and efficient management of public and private resources. The workflow in the use case shown below includes data cleaning, ML model training, and validation.

![texera-screenshot](https://github.com/user-attachments/assets/4384b8f5-3a9a-4bbc-a804-1dadd156ebb3)

# Publications (Computer Science)

* (5/2025) **Responsive Retrieval of Consistent States in Pipelined Executions of Dataflows**
  Shengquan Ni and Chen Li
  _To appear in HILDA Workshop at SIGMOD 2025_
* (11/2024) **IcedTea: Efficient and Responsive Time-Travel Debugging in Dataflow Systems**
  Shengquan Ni, Yicong Huang, Zuozhi Wang, and Chen Li
  _To appear in VLDB 2025_
* (8/2024) **Pasta: A Cost-Based Optimizer for Generating Pipelining Schedules for Dataflow DAGs**
  Xiaozhen Liu, Yicong Huang, Xinyuan Lin, Avinash Kumar, Sadeem Alsudais, and Chen Li
  _To appear in SIGMOD 2025_
* (7/2024) **Texera: A System for Collaborative and Interactive Data Analytics Using Workflows**
  Zuozhi Wang, Yicong Huang, Shengquan Ni, Avinash Kumar, Sadeem Alsudais, Xiaozhen Liu, Xinyuan Lin, Yunyan Ding, and Chen Li
  _In VLDB 2024, Scalable Data Science track_ | [PDF](https://www.vldb.org/pvldb/vol17/p3580-wang.pdf) | [Slides](https://chenli.ics.uci.edu/files/vldb2024-texera-presentation.pdf)
* (3/2024) **Demonstration of Udon: Line-by-line Debugging of User-Defined Functions in Data Workflows**
  Yicong Huang, Zuozhi Wang, and Chen Li
  _In SIGMOD 2024 **Best Demo Runner-Up Award🏆**_ | [PDF](https://dl.acm.org/doi/10.1145/3626246.3654756)
* (2/2024) **Data Science Tasks Implemented with Scripts versus GUI-Based Workflows: The Good, the Bad, and the Ugly**
  Alexander K Taylor, Yicong Huang, Junheng Hao, Xinyuan Lin, Xiusi Chen, Wei Wang, and Chen Li
  _In DataPlat Workshop at ICDE 2024_ | [PDF](https://ieeexplore.ieee.org/abstract/document/10555112) | [Slides](https://chenli.ics.uci.edu/files/icde2024-dataplat-workshop.pdf)
* (8/2023) **Building a Collaborative Data Analytics System: Opportunities and Challenges**
  Zuozhi Wang and Chen Li
  _In Tutorial at VLDB 2023_ | [PDF](https://www.vldb.org/pvldb/vol16/p3898-wang.pdf) | [Slides](https://chenli.ics.uci.edu/files/vldb2023-texera-tutorial.pdf)
* (8/2023) **Udon: Efficient Debugging of User-Defined Functions in Big Data Systems with Line-by-Line Control**
  Yicong Huang, Zuozhi Wang, and Chen Li
  _In SIGMOD 2024_ | [PDF](https://dl.acm.org/doi/10.1145/3626712) | [Slides](https://chenli.ics.uci.edu/files/sigmod2024-udon-presentation.pdf)
* (8/2023) **Improving Iterative Analytics in GUI-Based Data-Processing Systems with Visualization, Version Control, and Result Reuse**
  Sadeem Alsudais
  Ph.D. Thesis | [PDF](https://sadeemsaleh.github.io/Sadeem_phd_thesis.pdf)
* (7/2023) **Using Texera to Characterize Climate Change Discussions on Twitter During Wildfires**
  Shengquan Ni, Yicong Huang, Jessie W. Y. Ko, Alexander Taylor, Xiusi Chen, Avinash Kumar, Sadeem Alsudais, Zuozhi Wang, Xiaozhen Liu, Wei Wang, Suellen Hopfer, and Chen Li
  _In Data Science Day at KDD 2023_
* (7/2023) **Raven: Accelerating Execution of Iterative Data Analytics by Reusing Results of Previous Equivalent Versions**
  Sadeem Alsudais, Avinash Kumar, and Chen Li
  _In HILDA Workshop at SIGMOD 2023_ | [PDF](https://dl.acm.org/doi/10.1145/3597465.3605219)
* (6/2023) **Texera: A System for Collaborative and Interactive Data Analytics Using Workflows**
  Zuozhi Wang
  Ph.D. Thesis | [PDF](https://zuozhiw.github.io/Zuozhi_Wang_UCI_PhD_Thesis.pdf)
* (12/2022) **Towards Interactive, Adaptive and Result-aware Big Data Analytics**
  Avinash Kumar
  Ph.D. Thesis | [PDF](https://arxiv.org/abs/2212.07096)
* (9/2022) **Fries: Fast and Consistent Runtime Reconfiguration in Dataflow Systems with Transactional Guarantees**
  Zuozhi Wang, Shengquan Ni, Avinash Kumar, and Chen Li
  _In VLDB 2023_ | [PDF](https://www.vldb.org/pvldb/vol16/p256-wang.pdf) | [Slides](https://chenli.ics.uci.edu/files/vldb2023-fries.pdf)
* (7/2022) **Drove: Tracking Execution Results of Workflows on Large Datasets**
  Sadeem Alsudais
  _In the Ph.D. Workshop at VLDB 2022_ | [PDF](http://ceur-ws.org/Vol-3186/paper_10.pdf)
* (6/2022) **Demonstration of Accelerating Machine Learning Inference Queries with Correlative Proxy Models**
  Zhihui Yang, Yicong Huang, Zuozhi Wang, Feng Gao, Yao Lu, Chen Li, and X. Sean Wang
  _In VLDB 2022_ | [PDF](https://www.vldb.org/pvldb/vol15/p3734-yang.pdf)
* (6/2022) **Demonstration of Collaborative and Interactive Workflow-Based Data Analytics in Texera**
  Xiaozhen Liu, Zuozhi Wang, Shengquan Ni, Sadeem Alsudais, Yicong Huang, Avinash Kumar, and Chen Li
  _In VLDB 2022_ | [PDF](https://www.vldb.org/pvldb/vol15/p3738-liu.pdf) | [Demo Video](https://youtu.be/2gfPUZNsoBs)
* (4/2022) **Optimizing Machine Learning Inference Queries with Correlative Proxy Models**
  Zhihui Yang, Zuozhi Wang, Yicong Huang, Yao Lu, Chen Li, and X. Sean Wang
  _In VLDB 2022_ | [PDF](https://www.vldb.org/pvldb/vol15/p2032-yang.pdf)
* (7/2020) **Demonstration of Interactive Runtime Debugging of Distributed Dataflows in Texera**
  Zuozhi Wang, Avinash Kumar, Shengquan Ni, and Chen Li
  _In VLDB 2020_ | [PDF](http://www.vldb.org/pvldb/vol13/p2953-wang.pdf) | [Video](https://www.youtube.com/watch?v=SP-XiDADbw0) | [Slides](https://docs.google.com/presentation/d/14U6RPZfeb8Ho0aO2HsCSc8lRs6ul6AxEIm5gpjeVUYA/edit?usp=sharing)
* (1/2020) **Amber: A Debuggable Dataflow system based on the Actor Model**
  Avinash Kumar, Zuozhi Wang, Shengquan Ni, and Chen Li
  _In VLDB 2020_ | [PDF](http://www.vldb.org/pvldb/vol13/p740-kumar.pdf) | [Video](https://www.youtube.com/watch?v=T5ShFRfHmgI) | [Slides](https://docs.google.com/presentation/d/1v8G9lDmfv4Ff2YWyrGfo_9iMQVF4N8a-4gO4H-K6rCk/edit?usp=sharing)
* (4/2017) **A Demonstration of TextDB: Declarative and Scalable Text Analytics on Large Data Sets**
  Zuozhi Wang, Flavio Bayer, Seungjin Lee, Kishore Narendran, Xuxi Pan, Qing Tang, Jimmy Wang, and Chen Li
  _In ICDE 2017_ **Best Demo award** | [PDF](https://chenli.ics.uci.edu/files/icde2017-textdb-demo.pdf) | [Video](https://github.com/Texera/texera/wiki/Video)

# Publications (Interdisciplinary)

* (2/2025) **DS4ALL: Teaching High-School Students Data Science and AI/ML Using the Texera Workflow Platform as a Service**
  Jiadong Bai, Xiaozhen Liu, Anthony Cuturrufo, Alexander Kundu Taylor, Jeehyun Hwang, Mingyu Derek Ma, Xinyuan Lin, Yanqiao Zhu, Yicong Huang, Yunyan Ding, Wei Wang, and Chen Li
  _To appear in [Data Science Education K-12: Research to Practice Annual Conference 2025](https://web.cvent.com/event/d641bd9f-6c99-4cbc-951b-33b1ca05d4ed/summary)_
* (7/2024) **Brain Image Data Processing Using Collaborative Data Workflows on Texera**
  Yunyan Ding, Yicong Huang, Pan Gao, Andy Thai, Atchuth Naveen Chilaparasetti, M. Gopi, Xiangmin Xu, and Chen Li
  _In Frontiers in Neural Circuits_ | [PDF](https://doi.org/10.3389/fncir.2024.1398884)
* (1/2024) **Wording Matters: The Effect of Linguistic Characteristics and Political Ideology on Resharing of COVID-19 Vaccine Tweets**
  Judith Borghouts, Yicong Huang, Suellen Hopfer, Chen Li, and Gloria Mark
  _In TOCHI 2024_ | [PDF](https://dl.acm.org/doi/pdf/10.1145/3637876)
* (1/2024) **How the Experience of California Wildfires Shape Twitter Climate Change Framings**
  Jessie W. Y. Ko, Shengquan Ni, Alexander Taylor, Xiusi Chen, Yicong Huang, Avinash Kumar, Sadeem Alsudais, Zuozhi Wang, Xiaozhen Liu, Wei Wang, Chen Li, and Suellen Hopfer
  _In Climatic Change 2024_ | [PDF](https://link.springer.com/content/pdf/10.1007/s10584-023-03668-0.pdf)
* (11/2023) **The Marketing and Perceptions of Non-Tobacco Blunt Wraps on Twitter**
  Joshua U. Rhee, Yicong Huang, Aurash J. Soroosh, Sadeem Alsudais, Shengquan Ni, Avinash Kumar, Jacob Paredes, Chen Li, and David S. Timberlake
  _In Substance Use & Misuse 2023_ | [PDF](https://www.tandfonline.com/doi/epdf/10.1080/10826084.2023.2280572?needAccess=true)
* (3/2023) **Understanding Underlying Moral Values and Language Use of COVID-19 Vaccine Attitudes on Twitter**
  Judith Borghouts, Yicong Huang, Sydney Gibbs, Suellen Hopfer, Chen Li, and Gloria Mark
  _In PNAS Nexus 2023_ | [PDF](https://academic.oup.com/pnasnexus/article-pdf/2/3/pgad013/49435858/pgad013.pdf)
* (10/2022) **Public Opinions Toward COVID-19 Vaccine Mandates: A Machine Learning-Based Analysis of U.S. Tweets**
  Yawen Guo, Jun Zhu, Yicong Huang, Lu He, Changyang He, Chen Li, and Kai Zheng
  _In AMIA 2022_ | [PDF](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10148373/pdf/1066.pdf)
* (9/2021) **The Social Amplification and Attenuation of COVID-19 Risk Perception Shaping Mask-Wearing Behavior: A Longitudinal Twitter Analysis**
  Suellen Hopfer, Emilia J. Fields, Yuwen Lu, Ganesh Ramakrishnan, Ted Grover, Qiushi Bai, Yicong Huang, Chen Li, and Gloria Mark
  _In PLOS ONE 2021_ | [PDF](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0257428)
* (4/2021) **Why Do People Oppose Mask Wearing? A Comprehensive Analysis of U.S. Tweets During the COVID-19 Pandemic**
  Lu He, Changyang He, Tera Leigh Reynolds, Qiushi Bai, Yicong Huang, Chen Li, Kai Zheng, and Yunan Chen
  _In JAMIA 2021_ | [PDF](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7989302/pdf/ocab047.pdf)

# Getting Started

* For users, visit [Guide to Use Texera](https://github.com/Texera/texera/wiki/Getting-Started).
* For developers, visit [Guide to Develop Texera](https://github.com/Texera/texera/wiki/Guide-for-Developers).

Texera was known as "TextDB" before August 28, 2017.

# Acknowledgements

This project is supported by the National Science Foundation under awards [IIS-1745673](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1745673) and [IIS-2107150](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2107150), and by AWS Research Credits and Google Cloud Platform Education Programs.

* NIH NIDDK: This project is supported by an NIH NIDDK award.
* YourKit: [YourKit](https://www.yourkit.com/) has given an open source license to use their profiler in this project.

# Citation

Please cite Texera as

```
@article{DBLP:journals/pvldb/WangHNKALLDL24,
  author       = {Zuozhi Wang and
                  Yicong Huang and
                  Shengquan Ni and
                  Avinash Kumar and
                  Sadeem Alsudais and
                  Xiaozhen Liu and
                  Xinyuan Lin and
                  Yunyan Ding and
                  Chen Li},
  title        = {Texera: {A} System for Collaborative and Interactive Data Analytics
                  Using Workflows},
  journal      = {Proc. {VLDB} Endow.},
  volume       = {17},
  number       = {11},
  pages        = {3580--3588},
  year         = {2024},
  url          = {https://www.vldb.org/pvldb/vol17/p3580-wang.pdf},
  timestamp    = {Thu, 19 Sep 2024 13:09:37 +0200},
  biburl       = {https://dblp.org/rec/journals/pvldb/WangHNKALLDL24.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}
```