- 11 Sep, 2014 3 commits
-
-
yihua.huang authored
-
Yihua Huang authored
Update FileCacheQueueScheduler.java
-
zhugw authored
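While using it, I found duplicate URLs in the urls.txt file. Tracing the source code, I found that after the file is loaded at initialization, all of its URLs are read into a set, but when a URL to be crawled is added later, nothing checks whether it is already in that set (that is, already in the file), which is why duplicate URLs end up in the file. I have modified the source accordingly; could the author please review.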
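The fix described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the actual FileCacheQueueScheduler code: the class name `DedupUrlFile` and its `push` and `size` methods are hypothetical, and the sketch assumes one URL per line in the file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration; not the actual FileCacheQueueScheduler code.
public class DedupUrlFile {
    private final Path file;
    private final Set<String> seen = new LinkedHashSet<>();

    public DedupUrlFile(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            // Load every URL already persisted so later pushes can be checked against it.
            seen.addAll(Files.readAllLines(file, StandardCharsets.UTF_8));
        }
    }

    /** Appends the URL to the file only if it is new; returns true if it was written. */
    public synchronized boolean push(String url) throws IOException {
        if (!seen.add(url)) {
            return false; // already in the set, hence already in the file
        }
        Files.write(file, Collections.singletonList(url), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return true;
    }

    /** Number of distinct URLs known so far. */
    public int size() {
        return seen.size();
    }
}
```

The key point is that the membership check and the file append happen together, so a URL loaded from the file at startup can never be appended a second time.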
-
- 09 Sep, 2014 1 commit
-
-
yihua.huang authored
-
- 21 Aug, 2014 2 commits
-
-
yihua.huang authored
-
yihua.huang authored
-
- 18 Aug, 2014 2 commits
-
-
yihua.huang authored
-
yihua.huang authored
-
- 14 Aug, 2014 1 commit
-
-
yihua.huang authored
Disable jsoup entity escaping by default. Set Html.DISABLE_HTML_ENTITY_ESCAPE to false to enable it. #149
-
- 13 Aug, 2014 1 commit
-
-
yihua.huang authored
-
- 25 Jun, 2014 3 commits
-
-
yihua.huang authored
-
yihua.huang authored
-
yihua.huang authored
-
- 10 Jun, 2014 1 commit
-
-
yihua.huang authored
-
- 09 Jun, 2014 4 commits
-
-
yihua.huang authored
-
yihua.huang authored
-
yihua.huang authored
-
zwf authored
-
- 04 Jun, 2014 8 commits
-
-
yihua.huang authored
-
yihua.huang authored
-
yihua.huang authored
-
yihua.huang authored
-
yihua.huang authored
-
yihua.huang authored
-
yihua.huang authored
-
yihua.huang authored
-
- 03 Jun, 2014 1 commit
-
-
yihua.huang authored
-
- 28 May, 2014 1 commit
-
-
yihua.huang authored
-
- 27 May, 2014 9 commits
-
-
yihua.huang authored
-
yihua.huang authored
-
Yihua Huang authored
Management of multiple proxies
-
yihua.huang authored
-
yihua.huang authored
-
yihua.huang authored
-
yihua.huang authored
-
yihua.huang authored
1. remove lazy init of Html 2. rename strings to sourceTexts for better meaning 3. make getSourceTexts abstract and DO NOT always store strings 4. instead store parsed elements of document in HtmlNode
-
yihua.huang authored
-
- 26 May, 2014 1 commit
-
-
yihua.huang authored
1. Only read from content once to fix stream closed exception 2. introduce moco for server tests
-
- 19 May, 2014 2 commits