# sagemaker-spark-container

**Repository Path**: mirrors_aws/sagemaker-spark-container

## Basic Information

- **Project Name**: sagemaker-spark-container
- **Description**: The SageMaker Spark Container is a Docker image used to run data processing workloads with the Spark framework on Amazon SageMaker.
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-09-18
- **Last Updated**: 2026-03-21

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# SageMaker Spark Container

## Spark Overview

Apache Spark™ is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.

## SageMaker Spark Container

The SageMaker Spark Container is a Docker image used to run batch data processing workloads on Amazon SageMaker using the Apache Spark framework. The container images in this repository are used to build the pre-built container images that are used when running Spark jobs on Amazon SageMaker using the SageMaker Python SDK. The pre-built images are available in the Amazon Elastic Container Registry (Amazon ECR), and this repository serves as a reference for those wishing to build their own customized Spark containers for use in Amazon SageMaker.

For the list of available Spark images, see [Available SageMaker Spark Images](available_images.md).

## License

This project is licensed under the Apache-2.0 License.

## Usage in the SageMaker Python SDK

The simplest way to get started with the SageMaker Spark Container is to use the pre-built images via the SageMaker Python SDK.
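
As a rough illustration of that workflow, the sketch below submits a PySpark script through the SDK's `PySparkProcessor` class (from `sagemaker.spark.processing`). It is a minimal sketch under several assumptions, not a reference implementation: the helper names `build_processor_kwargs` and `run_pyspark_job`, the job name, the instance settings, and the default framework version `"3.1"` are all illustrative choices that do not come from this repository. Check [Available SageMaker Spark Images](available_images.md) for the versions that actually exist. Passing `image_uri` is shown as one way to point the SDK at a customized container instead of a pre-built one.

```python
from typing import Any, Dict, List, Optional


def build_processor_kwargs(
    role_arn: str,
    framework_version: str = "3.1",  # assumed version; see available_images.md
    instance_type: str = "ml.m5.xlarge",  # illustrative instance type
    instance_count: int = 2,
    image_uri: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect constructor arguments for PySparkProcessor (hypothetical helper)."""
    kwargs: Dict[str, Any] = {
        "base_job_name": "spark-example",  # illustrative job name prefix
        "role": role_arn,
        "instance_type": instance_type,
        "instance_count": instance_count,
    }
    if image_uri is not None:
        # Use a custom image, for example one built from this repository.
        kwargs["image_uri"] = image_uri
    else:
        # Let the SDK resolve a pre-built image for this Spark version.
        kwargs["framework_version"] = framework_version
    return kwargs


def run_pyspark_job(
    role_arn: str,
    script_path: str,
    arguments: Optional[List[str]] = None,
    **overrides: Any,
) -> None:
    """Submit a PySpark script as a SageMaker Processing job (hypothetical helper)."""
    # Imported lazily so the helper above works without the SDK installed.
    from sagemaker.spark.processing import PySparkProcessor

    processor = PySparkProcessor(**build_processor_kwargs(role_arn, **overrides))
    # submit_app is the local or S3 path of the PySpark application to run.
    processor.run(submit_app=script_path, arguments=arguments)
```

Calling `run_pyspark_job("<your-role-arn>", "preprocess.py", arguments=["--input", "s3://..."])` would start a job, assuming the `sagemaker` package is installed, AWS credentials are configured, and the role has SageMaker permissions. The role ARN, script name, and arguments here are placeholders.
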
[Amazon SageMaker Processing (sagemaker 2.5.3 documentation)](https://sagemaker.readthedocs.io/en/stable/amazon_sagemaker_processing.html#amazon-sagemaker-processing)

## Getting Started With Development

To get started building and testing the SageMaker Spark container, you will have to set up a local development environment. See the instructions in [DEVELOPMENT.md](./DEVELOPMENT.md).

## Contributing

To contribute to this project, please read through [CONTRIBUTING.md](./CONTRIBUTING.md).