# VideoSpider

**Repository Path**: man0sions/VideoSpider

## Basic Information

- **Project Name**: VideoSpider
- **Description**: A PHP command-line crawler that scrapes video categories and video lists from sohu, iqiyi, and youku, and supports custom crawl configurations
- **Primary Language**: PHP
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2016-10-10
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# PHP command-line crawler for sohu, iqiyi, and youku video categories and lists, with support for custom crawl configurations

## Install

```
git clone https://git.oschina.net/man0sions/VideoSpider.git
cd VideoSpider
composer install
```

## Usage

### 1: Crawl sohu

The MySQL schema is in `public/demo.sql`. Run the following on the command line:

```
php public/index.php sohu
```

### 2: Implement a custom crawler

Extend the abstract class `src\base\api\Command\Spider` and implement the `doSpider` method. For example:

```
/**
 * Class Sohu
 * Crawler for sohu; extends Spider.
 * Fetches playlists and saves them.
 * Fetches videos and saves them.
 *
 * @package src\commands
 */
class Sohu extends Spider
{
    private $rs_video = 4;
    private $site = 1;
    private $spider_data;

    function __destruct()
    {
        // var_dump(__CLASS__."==>__destruct\n");
    }

    function __construct()
    {
        $this->spider_data = new SohuSpiderData(new SohuFormatStrategy());
    }

    /**
     * @param $url
     * @param $firstcategoryid
     * @return bool
     * @throws AppException
     */
    function doSpider($url, $firstcategoryid)
    {
        $request = new HttpRequest($url, new RandUseragentFetch());
        $res = json_decode($request->fetch());

        // json_decode() returns null when the body is not valid JSON
        if (!is_object($res) || !isset($res->status) || $res->status != 200) {
            // $res is an object, so print it with json_encode() rather than interpolating it
            echo "==>fetch error:" . json_encode($res) . "\n";
            return false;
        }
        if (!isset($res->data->videos) || count($res->data->videos) == 0) {
            echo "==>data empty\n";
            return false;
        }
        foreach ($res->data->videos as $key => $val) {
            $this->savePlaylist($val, $firstcategoryid);
        }
        return true;
    }

    /**
     * Wraps the video list returned by the sohu API in a ThirdPlaylist object,
     * then saves the ThirdPlaylist (insert / update).
     *
     * @param $item
     * @param $firstcategoryid
     * @throws AppException
     */
    private function savePlaylist($item, $firstcategoryid)
    {
    }

    /**
     * Wraps the episode information returned by the sohu API in a ThirdVideo object,
     * then saves the ThirdVideo (insert / update).
     *
     * @param $playlist
     * @return int
     */
    private function saveVideos($playlist)
    {
    }
}
```
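
The `doSpider` flow above (fetch, decode, validate, then hand each item to a save step) can be tried without the rest of the project. The following is a minimal sketch, not the project's implementation: `SpiderSketch` and `DemoSpider` are hypothetical stand-ins for `Spider` and a concrete crawler, the injected callable replaces `HttpRequest`, and the JSON payload shape (`status`, `data.videos`) is only assumed from the example above.

```php
<?php
// Hypothetical stand-in for src\base\api\Command\Spider;
// the real abstract class is not shown in this README.
abstract class SpiderSketch
{
    abstract public function doSpider($url, $firstcategoryid);
}

// Hypothetical crawler for illustration only.
class DemoSpider extends SpiderSketch
{
    private $fetcher;      // callable that returns the raw response body for a URL
    public $saved = [];    // collects items instead of writing to MySQL

    public function __construct(callable $fetcher)
    {
        $this->fetcher = $fetcher;
    }

    public function doSpider($url, $firstcategoryid)
    {
        $res = json_decode(call_user_func($this->fetcher, $url));

        // json_decode() returns null for invalid JSON, so check the type first
        if (!is_object($res) || !isset($res->status) || $res->status != 200) {
            return false;
        }
        // isset() on a nested property is safe even when "data" is missing
        if (!isset($res->data->videos) || !is_array($res->data->videos)
            || count($res->data->videos) == 0) {
            return false;
        }
        foreach ($res->data->videos as $val) {
            // stands in for savePlaylist($val, $firstcategoryid)
            $this->saved[] = ['category' => $firstcategoryid, 'item' => $val];
        }
        return true;
    }
}

// Canned response body; the field names are assumptions, not the real sohu API.
$body = '{"status":200,"data":{"videos":[{"name":"a"},{"name":"b"}]}}';
$spider = new DemoSpider(function ($url) use ($body) {
    return $body;
});
$ok = $spider->doSpider('http://example.invalid/list', 7);
echo $ok ? count($spider->saved) . " items saved\n" : "crawl failed\n";
```

Passing the fetcher in as a callable keeps the validation logic testable offline; a real crawler would extend `src\base\api\Command\Spider` and use `HttpRequest` as in the example above.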