# c4_t5_10m

**Repository Path**: hf-datasets/c4_t5_10m

## Basic Information

- **Project Name**: c4_t5_10m
- **Description**: Mirror of https://huggingface.co/datasets/hlillemark/c4_t5_10m
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-06-15
- **Last Updated**: 2025-06-15

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

---
dataset_info:
  features:
  - name: input_ids
    sequence: int32
  - name: labels
    sequence: int64
  splits:
  - name: train
    num_bytes: 54681600000
    num_examples: 10240000
  - name: validation
    num_bytes: 53400000
    num_examples: 10000
  download_size: 22999280634
  dataset_size: 54735000000
---

# Dataset Card for "c4_t5_10m"

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
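
The card gives no usage notes, so the following is a minimal sketch rather than an official recipe. It copies the figures from the `dataset_info` block above and checks that they are self-consistent: both splits work out to the same number of bytes per example, and the two split sizes add up to `dataset_size`. The helper names (`bytes_per_example`, `total_bytes`, `stream_first_example`) are hypothetical. The streaming helper assumes the Hugging Face `datasets` library is installed and that the Hub repository id is `hlillemark/c4_t5_10m`, as implied by the mirror URL; neither is confirmed by the card itself.

```python
# Figures copied verbatim from the dataset_info block of the card.
SPLITS = {
    "train": {"num_bytes": 54_681_600_000, "num_examples": 10_240_000},
    "validation": {"num_bytes": 53_400_000, "num_examples": 10_000},
}
DOWNLOAD_SIZE = 22_999_280_634
DATASET_SIZE = 54_735_000_000


def bytes_per_example(split: str) -> float:
    """Average on-disk size of one example in the given split."""
    info = SPLITS[split]
    return info["num_bytes"] / info["num_examples"]


def total_bytes() -> int:
    """Sum of the per-split sizes; should match dataset_size."""
    return sum(info["num_bytes"] for info in SPLITS.values())


def stream_first_example(split: str = "validation") -> dict:
    """Fetch one example without downloading the whole dataset.

    Assumptions: the `datasets` library is installed and the Hub repo id
    is `hlillemark/c4_t5_10m` (taken from the mirror URL, not verified).
    """
    from datasets import load_dataset  # imported lazily; optional dependency

    ds = load_dataset("hlillemark/c4_t5_10m", split=split, streaming=True)
    return next(iter(ds))


if __name__ == "__main__":
    for name in SPLITS:
        print(name, bytes_per_example(name))
    print("sizes consistent:", total_bytes() == DATASET_SIZE)
    print("download / dataset ratio:", round(DOWNLOAD_SIZE / DATASET_SIZE, 3))
```

Both splits come to 5,340 bytes per example, which suggests every example has the same fixed length. The compressed download is roughly 42% of the uncompressed size, so streaming a single split is a reasonable way to inspect the `input_ids` and `labels` fields before committing to the full download of about 23 GB.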