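Update FileCacheQueueScheduler.java · 1db940a0
zhugw authored
While using the scheduler, I found duplicate URLs in the urls.txt file. Tracing the source code showed that, after the file is loaded at initialization, all URLs are read into a set, but when a URL to be crawled is added later there is no check for whether it already exists in that set (that is, in the file). This is what causes duplicate URLs in the file. I have modified the source accordingly and would appreciate the author's review.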
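The fix described in the commit message can be illustrated with a minimal sketch. This is not the actual FileCacheQueueScheduler code: the class name `DedupUrlFile` and its `push` and `size` methods are hypothetical, and the sketch assumes one URL per line in the cache file. It only shows the idea of consulting the in-memory set before appending a URL to the file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only; not the real WebMagic scheduler.
class DedupUrlFile {
    private final Path file;
    // URLs already present in the file, loaded once at construction.
    private final Set<String> urls = new LinkedHashSet<>();

    DedupUrlFile(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                String url = line.trim();
                if (!url.isEmpty()) {
                    urls.add(url);
                }
            }
        }
    }

    // Appends the URL to the file only if it has not been seen before.
    // Returns true when the URL was new and written, false otherwise.
    synchronized boolean push(String url) throws IOException {
        if (!urls.add(url)) {
            return false; // already in the set, so already in the file
        }
        Files.write(file, Collections.singletonList(url), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return true;
    }

    synchronized int size() {
        return urls.size();
    }
}
```

In this sketch `Set.add` serves as both the membership check and the insertion, so the file is written only when the set actually changed.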
Name
assets
en_docs
webmagic-avalon
webmagic-core
webmagic-extension
webmagic-samples
webmagic-saxon
webmagic-scripts
webmagic-selenium
zh_docs
.gitignore
.travis.yml
README.md
pom.xml
release-note.md
user-manual.md
webmagic-avalon.md