diff --git a/README.md b/README.md
index 8ef570866dcdf45c13e0b8fcc00b289f9f7f97ee..dd73aa5c8a8069be56bf2ddeb08ba60461f752b3 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,13 @@
 # datafu
 
 #### 介绍
-Apache DataFu is a collection of libraries for working with large-scale data in Hadoop.
+Apache DataFu是一个库的集合,用于在Hadoop中处理大规模数据。
 
 #### 软件架构
 软件架构说明
-
+分为两个部分组成:
+- Apache DataFu Pig: a collection of user-defined functions for Apache Pig
+- Apache DataFu Hourglass: an incremental processing framework for Apache Hadoop in MapReduce
 
 #### 安装教程
 
diff --git a/datafu.spec b/datafu.spec
index 0107a79bff1b701b9ab745ffd8d81bd4a9a54899..90c5dd31be67456263691eed80695bce81bfe995 100644
--- a/datafu.spec
+++ b/datafu.spec
@@ -17,7 +17,8 @@ BuildRequires: gradle
 Requires: java-1.8.0-openjdk
 
 %description
-Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
+Apache DataFu is a collection of libraries for working with large-scale data in Hadoop.
+The project was inspired by the need for stable, well-tested libraries for data mining and statistics.
 
 %prep
 