# cleanvision
CleanVision automatically detects potential issues in image datasets, such as images that are blurry, under- or over-exposed, or (near) duplicates, among other issues.
This data-centric AI package is a quick first step for any computer vision project: it finds problems in the dataset that you will want to address before applying machine learning.
CleanVision is simple to use: run the same few lines of Python code to audit any image dataset!
## Installation
```shell
pip install cleanvision
```
## Quickstart
Download an example dataset (optional), or use any collection of image files you have.
```shell
wget -nc 'https://cleanlab-public.s3.amazonaws.com/CleanVision/image_files.zip'
```
1. Run CleanVision to audit the images.
```python
from cleanvision import Imagelab
# Specify path to folder containing the image files in your dataset
imagelab = Imagelab(data_path="FOLDER_WITH_IMAGES/")
# Automatically check for a predefined list of issues within your dataset
imagelab.find_issues()
# Produce a neat report of the issues found in your dataset
imagelab.report()
```
2. CleanVision diagnoses many types of issues, but you can also check for only specific issues.
```python
issue_types = {"dark": {}, "blurry": {}}
imagelab.find_issues(issue_types=issue_types)
# Produce a report with only the specified issue_types
imagelab.report(issue_types=issue_types)
```
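
The two steps above can be wrapped in a small helper that fails early when a folder holds no images. This is only a sketch under stated assumptions: `IMAGE_EXTENSIONS`, `list_image_files`, `build_issue_types`, and `audit_folder` are hypothetical names that are not part of CleanVision, and the extension list is an illustrative guess rather than the set of formats the library accepts. Only `Imagelab`, `find_issues`, and `report` are taken from the quickstart, and the empty dicts are assumed to mean "use the default settings" for each issue.

```python
from pathlib import Path

# Assumption: an illustrative subset of extensions; CleanVision may accept more formats.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def list_image_files(folder):
    """Return the sorted paths of files under `folder` (recursively) that look like images."""
    root = Path(folder)
    if not root.is_dir():
        raise NotADirectoryError(f"{str(folder)!r} is not a directory")
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def build_issue_types(names, overrides=None):
    """Map each issue name to a settings dict, e.g. {"dark": {}, "blurry": {}}."""
    overrides = overrides or {}
    unknown = set(overrides) - set(names)
    if unknown:
        raise KeyError(f"overrides given for unrequested issues: {sorted(unknown)}")
    # Copy each settings dict so later edits do not leak back into `overrides`.
    return {name: dict(overrides.get(name, {})) for name in names}


def audit_folder(folder, issue_names=None):
    """Run the quickstart audit on `folder`; check only `issue_names` if given."""
    if not list_image_files(folder):
        raise ValueError(f"no image files found under {str(folder)!r}")
    # Imported lazily so the helpers above work without cleanvision installed.
    from cleanvision import Imagelab

    imagelab = Imagelab(data_path=str(folder))
    if issue_names is None:
        imagelab.find_issues()
        imagelab.report()
    else:
        issue_types = build_issue_types(issue_names)
        imagelab.find_issues(issue_types=issue_types)
        imagelab.report(issue_types=issue_types)
    return imagelab
```

With CleanVision installed, `audit_folder("FOLDER_WITH_IMAGES/", issue_names=["dark", "blurry"])` would run the same calls as step 2 above. Which keys a non-empty settings dict may contain is not covered here; see the tutorial linked below for the supported options.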
## More resources on how to use CleanVision
- [Tutorial](https://cleanvision.readthedocs.io/en/latest/tutorials/tutorial.html)
- [Run CleanVision on a HuggingFace dataset](https://cleanvision.readthedocs.io/en/latest/tutorials/huggingface_dataset.html)
- [Run CleanVision on a Torchvision dataset](https://cleanvision.readthedocs.io/en/latest/tutorials/torchvision_dataset.html)
- [Example script](https://github.com/cleanlab/cleanvision/blob/main/docs/source/tutorials/run.py) that can be run with `python examples/run.py --path` followed by the path to your image folder.

CleanVision supports Linux, macOS, and Windows and runs on Python 3.7+.
## Join our community
* The best place to learn is [our Slack community](https://cleanlab.ai/slack). Join the discussion there to see how
folks are using this library, discuss upcoming features, or ask for private support.
* Need professional help with CleanVision? Join our [\#help Slack channel](https://cleanlab.ai/slack) and message us there, or reach out via email: team@cleanlab.ai
* Interested in contributing? See the [contributing guide](CONTRIBUTING.md). An easy starting point is to
consider [issues](https://github.com/cleanlab/cleanvision/labels/good%20first%20issue) marked `good first issue` or
simply reach out in [Slack](https://cleanlab.ai/slack). We welcome your help building a standard open-source library
for data-centric computer vision!
* Ready to start adding your own code? See the [development guide](DEVELOPMENT.md).
* Have an issue? [Search existing issues](https://github.com/cleanlab/cleanvision/issues?q=is%3Aissue)
or [submit a new issue](https://github.com/cleanlab/cleanvision/issues/new/choose).
* Have ideas for the future of data-centric computer vision? Check
out [our active/planned Projects and what we could use your help with](https://github.com/cleanlab/cleanvision/projects).
## License
Copyright (c) 2022 Cleanlab Inc.
cleanvision is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public
License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later
version.
cleanvision is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See [GNU Affero General Public LICENSE](https://github.com/cleanlab/cleanvision/blob/main/LICENSE) for details.
Commercial licensing is available for enterprise teams that want to use CleanVision in production workflows, but are unable to open-source their code [as is required by the current license](https://github.com/cleanlab/cleanvision/blob/main/LICENSE). Please email us: team@cleanlab.ai
[issue]: https://github.com/cleanlab/cleanvision/issues/new