# Multi-Source Database Query System - simba

**Repository Path**: opensci/simba

## Basic Information

- **Project Name**: Multi-Source Database Query System - simba
- **Description**: Multi-Source Database Query System - simba
- **Primary Language**: Unknown
- **License**: BSD-2-Clause
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 8
- **Forks**: 1
- **Created**: 2017-10-24
- **Last Updated**: 2025-08-27

## Categories & Tags

**Categories**: big-data

**Tags**: None

## README

# simba
Insertion, extraction, and analysis framework for LDM

# Notice 1:
The Scala version must be compatible with both the host system and Spark. The expected component versions are:
1) spark 1.3.1
2) scala 2.10.4
3) hadoop 1.2.1
4) titan 1.0.0
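
The compatibility requirement can be checked at runtime. The following is a minimal sketch and not part of simba: it assumes that matching the Scala binary version (the 2.10 line implied by scala 2.10.4 above) is the relevant check, and the object name VersionCheck and its helpers are hypothetical.

```scala
object VersionCheck {
  // Expected binary version, derived from the scala 2.10.4 requirement above.
  val ExpectedBinary = "2.10"

  /** Reduce a full version such as "2.10.4" to its binary version "2.10". */
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  /** True when the given full version belongs to the expected binary line. */
  def isCompatible(full: String): Boolean =
    binaryVersion(full) == ExpectedBinary

  def main(args: Array[String]): Unit = {
    // Version of the Scala library that is actually on the classpath.
    val running = scala.util.Properties.versionNumberString
    if (isCompatible(running))
      println("Scala " + running + " is in the expected " + ExpectedBinary + ".x line")
    else
      println("Scala " + running + " is NOT in the expected " + ExpectedBinary + ".x line")
  }
}
```

Running it with the same Scala installation that builds the project reports whether that installation falls in the 2.10.x line.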

# Notice 2:
The lib directory under the simba home is assumed to contain the following libraries:

- hadoop-client-1.2.1.jar
- hadoop-gremlin-3.0.1-incubating.jar
- hbase-common-0.98.2-hadoop1.jar
- htrace-core-2.04.jar
- hadoop-core-1.2.1.jar
- hbase-client-0.98.2-hadoop1.jar
- hbase-protocol-0.98.2-hadoop1.jar

If they are not there, include these libraries by modifying build.sbt.
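
By default sbt puts any jar found in a project's lib directory on the classpath as an unmanaged dependency, which is why placing the jars there is enough. The alternative is to declare them as managed dependencies. The fragment below is a sketch only: the artifact names and versions are taken from the jar file names above, but the group IDs are assumptions that should be verified against the repository you resolve from, and the project's actual build.sbt may look different.

```scala
// build.sbt fragment (sketch; group IDs are assumptions, see the note above)
scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.hadoop"    % "hadoop-client"  % "1.2.1",
  "org.apache.hadoop"    % "hadoop-core"    % "1.2.1",
  "org.apache.hbase"     % "hbase-common"   % "0.98.2-hadoop1",
  "org.apache.hbase"     % "hbase-client"   % "0.98.2-hadoop1",
  "org.apache.hbase"     % "hbase-protocol" % "0.98.2-hadoop1",
  "org.apache.tinkerpop" % "hadoop-gremlin" % "3.0.1-incubating",
  "org.cloudera.htrace"  % "htrace-core"    % "2.04"
)
```

With managed dependencies declared this way, the corresponding jars should be removed from lib to avoid having two copies on the classpath.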

# Notice 3: (for titan)
1) conf contains the configuration file "conf/titan-hbase-es-simba.properties" for TitanDB (HBase + Elasticsearch by default)
2) test_input contains the docs and links data, which can be loaded as
	val docRDD = sc.objectFile[Document]("test_input/godsDocs")
	val linkRDD = sc.objectFile[DocumentLink]("test_input/godsLinks")
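
Before connecting, it can help to confirm which backends the configuration file from item 1 selects. The sketch below is not simba code: it reads the file with plain java.util.Properties and prints two keys. The key names storage.backend and index.search.backend follow the usual Titan naming (with "search" as the index name), and it is an assumption that this particular file uses them. TitanConfInspect is a hypothetical helper name.

```scala
import java.io.{FileReader, Reader}
import java.util.Properties

object TitanConfInspect {
  /** Load a Java properties file from any Reader. */
  def load(reader: Reader): Properties = {
    val props = new Properties()
    props.load(reader)
    props
  }

  /** Return (storage backend, index backend); "<unset>" marks a missing key. */
  def backends(props: Properties): (String, String) =
    (props.getProperty("storage.backend", "<unset>"),
     props.getProperty("index.search.backend", "<unset>"))

  def main(args: Array[String]): Unit = {
    // Path as given in the README; run from the simba home directory.
    val reader = new FileReader("conf/titan-hbase-es-simba.properties")
    try {
      val (storage, index) = backends(load(reader))
      println("storage.backend = " + storage)
      println("index.search.backend = " + index)
    } finally reader.close()
  }
}
```

If either line prints "<unset>", the file either uses different key names or relies on defaults, and the assumption above does not hold for it.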

#### compile ####
sbt clean compile
#### run ####
sbt run
#### test ####
sbt test

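# Simple Example:

	// Open a Titan-backed simba database from the Spark context and the Titan configuration
	val gDB = TitanSimbaDB(sc, titanConf)
	// Load the sample documents shipped in test_input
	val docRDD = sc.objectFile[Document]("test_input/godsDocs")
	// Insert the documents, then print every document stored in the database
	gDB.insert(docRDD)
	gDB.docs().foreach(s => s.simbaPrint())
	// Close the database when done
	gDB.close()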
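
In the example, gDB.close() is skipped if insert or docs throws. One way to guarantee the close is a small try/finally helper. The sketch below is an illustration and not part of simba: withCleanup is a hypothetical helper, and FakeDB is a stand-in used only so the snippet runs without Spark or Titan.

```scala
object Cleanup {
  /** Run body on the resource and always run cleanup afterwards, even on failure. */
  def withCleanup[R, A](resource: R)(cleanup: R => Unit)(body: R => A): A =
    try body(resource) finally cleanup(resource)
}

// Hypothetical stand-in for a database handle; it is NOT the simba API.
class FakeDB {
  var closed = false
  def insert(items: Seq[String]): Int = items.size
  def close(): Unit = { closed = true }
}

object CleanupDemo {
  def main(args: Array[String]): Unit = {
    val db = new FakeDB
    // The cleanup function is passed explicitly, so the resource type
    // does not need to implement any particular interface.
    val inserted = Cleanup.withCleanup(db)(_.close()) { d =>
      d.insert(Seq("doc1", "doc2"))
    }
    println("inserted = " + inserted + ", closed = " + db.closed)
  }
}
```

Applied to the example, the same shape would be withCleanup(TitanSimbaDB(sc, titanConf))(_.close()) { gDB => ... }, assuming close() takes no arguments as shown there.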