\ticky\collection

Article collection (scraping) class

Summary

Public methods: get_content(), get_sub_content(), get_all_url(), get_filter_html(), myexp()
Public properties: $url
Constants: none

Protected methods: replace_item(), url_check()
Protected properties: none

Private methods: none
Private properties: none

File: tickyphp/library/collection.php
Package: Default
Class hierarchy: \ticky\collection

Properties

$url

$url : (no type documented)

Methods

get_content()

get_content($url) : string

Fetches the HTML source of the target URL.

Parameters

	$url	The target URL

Returns

string
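
The page does not show how get_content() retrieves the page, so the following is only a minimal sketch of one common approach, using file_get_contents() with a stream context. The function name fetch_html_sketch, the $timeout parameter, the user agent string, and the choice to return an empty string on failure are all assumptions made for illustration.

```php
<?php
// Hypothetical stand-in for get_content(); not the library's implementation.
function fetch_html_sketch(string $url, int $timeout = 10): string
{
    // HTTP options are only used for http(s) URLs; other wrappers ignore them.
    $context = stream_context_create([
        'http' => [
            'timeout'         => $timeout,
            'user_agent'      => 'Mozilla/5.0 (compatible; collection-sketch)',
            'follow_location' => 1,
        ],
    ]);

    // Suppress the warning on failure and report it through the return value.
    $html = @file_get_contents($url, false, $context);

    return $html === false ? '' : $html;
}
```

A caller would typically check for the empty string before passing the result on to the extraction steps.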

get_sub_content()

get_sub_content($html, $start, $end) : string

Extracts the HTML source between a start marker and an end marker.

Parameters

	$html	HTML source of the target URL
	$start	HTML marker where the section begins
	$end	HTML marker where the section ends

Returns

string
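
As a rough illustration of marker-based extraction, the sketch below returns the text between the first occurrence of $start and the next occurrence of $end. Whether the real method includes the markers in its result, or how it behaves when a marker is missing, is not stated here; the sketch assumes the markers are excluded and an empty string is returned on a miss. The name sub_content_sketch is hypothetical.

```php
<?php
// Hypothetical stand-in for get_sub_content(); not the library's implementation.
function sub_content_sketch(string $html, string $start, string $end): string
{
    // Locate the start marker.
    $from = strpos($html, $start);
    if ($from === false) {
        return '';
    }
    $from += strlen($start); // skip past the marker itself

    // Locate the end marker, searching only after the start marker.
    $to = strpos($html, $end, $from);
    if ($to === false) {
        return '';
    }

    return substr($html, $from, $to - $from);
}
```

For example, with the start marker `<div id="list">` and the end marker `</div>`, only the inner markup of that block is returned.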

get_all_url()

get_all_url($html, $url_contain = '', $url_except = '') : array

Extracts article URLs and titles from the HTML source of a section.

Parameters

	$html	HTML source of the section
	$url_contain	String that a URL must contain
	$url_except	String that a URL must not contain

Returns

array
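
The shape of the returned array is not documented on this page. The sketch below assumes a list of ['url' => ..., 'title' => ...] pairs, and assumes that links are found with a regular expression over anchor tags; both are illustrative choices, and all_url_sketch is a hypothetical name.

```php
<?php
// Hypothetical stand-in for get_all_url(); not the library's implementation.
function all_url_sketch(string $html, string $url_contain = '', string $url_except = ''): array
{
    $links = [];

    // Capture the href value and the inner content of every anchor tag.
    $pattern = '/<a\s[^>]*href=["\']([^"\']+)["\'][^>]*>(.*?)<\/a>/is';
    if (!preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        return $links;
    }

    foreach ($matches as $m) {
        $url = $m[1];

        // Keep only URLs containing the required substring, if one is given.
        if ($url_contain !== '' && strpos($url, $url_contain) === false) {
            continue;
        }
        // Drop URLs containing the excluded substring, if one is given.
        if ($url_except !== '' && strpos($url, $url_except) !== false) {
            continue;
        }

        // The title is the anchor text with nested tags removed.
        $links[] = ['url' => $url, 'title' => trim(strip_tags($m[2]))];
    }

    return $links;
}
```

A regular expression is sufficient for simple list pages; an HTML parser such as DOMDocument would be more robust against unusual markup.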

get_filter_html()

get_filter_html($html, $config = array()) : array

Filters the HTML source of an article content page and extracts the relevant information.

Parameters

	$html	HTML source of the article content page
	$config

Returns

array

myexp()

myexp($separator, $string) : array

Splits data according to a configuration item.

Parameters

	$separator	Character(s) to split the string on
	$string	The string to process

Returns

array
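
The signature mirrors PHP's explode(), so a plausible reading is a thin wrapper around it. The sketch below assumes exactly that, plus a guard for an empty separator (which explode() rejects); the name myexp_sketch and the guard behaviour are assumptions.

```php
<?php
// Hypothetical stand-in for myexp(); not the library's implementation.
function myexp_sketch(string $separator, string $string): array
{
    // explode() cannot split on an empty separator, so return the input whole.
    if ($separator === '') {
        return [$string];
    }

    return explode($separator, $string);
}
```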

replace_item()

replace_item(string $html, array $config) : string

Filters HTML code.

Parameters

string	$html	The HTML code
array	$config	The filter configuration

Returns

string
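
The structure of the filter configuration is not described on this page. Purely for illustration, the sketch below assumes $config maps regular-expression patterns to replacement strings and applies them in order; the real format may well differ, and replace_item_sketch is a hypothetical name.

```php
<?php
// Hypothetical stand-in for replace_item(); the $config format is an assumption.
function replace_item_sketch(string $html, array $config): string
{
    foreach ($config as $pattern => $replacement) {
        $result = preg_replace($pattern, $replacement, $html);

        // preg_replace() returns null on error; keep the previous value then.
        if ($result !== null) {
            $html = $result;
        }
    }

    return $html;
}
```

Under this assumption, a configuration that strips script blocks and HTML comments could look like `['#<script\b[^>]*>.*?</script>#is' => '', '#<!--.*?-->#s' => '']`.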

url_check()

url_check(string  $url, string  $baseurl) : string

Checks a URL.

Parameters

string	$url	The URL to check
string	$baseurl	The base URL

Returns

string
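
Given a URL and a base URL, a check like this usually means turning relative links into absolute ones. That interpretation is an assumption, as is everything in the sketch below, including the name url_check_sketch. It handles absolute, protocol-relative, root-relative, and directory-relative links, and does not resolve `../` segments.

```php
<?php
// Hypothetical stand-in for url_check(); not the library's implementation.
function url_check_sketch(string $url, string $baseurl): string
{
    // Already absolute: return unchanged.
    if (preg_match('#^https?://#i', $url)) {
        return $url;
    }

    $base = parse_url($baseurl);
    if ($base === false || !isset($base['scheme'], $base['host'])) {
        return $url; // base is unusable, nothing to resolve against
    }

    $origin = $base['scheme'] . '://' . $base['host']
        . (isset($base['port']) ? ':' . $base['port'] : '');

    // Protocol-relative link: reuse the base scheme.
    if (strpos($url, '//') === 0) {
        return $base['scheme'] . ':' . $url;
    }

    // Root-relative link: attach to the origin.
    if (strpos($url, '/') === 0) {
        return $origin . $url;
    }

    // Directory-relative link: attach to the directory of the base path.
    $path = $base['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);

    return $origin . $dir . $url;
}
```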