沈俊林 / webmagic
Commit 17d2d98c authored Aug 09, 2013 by yihua.huang
remove invalid @date
parent 72d29f08
Showing 41 changed files with 41 additions and 41 deletions
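Every hunk in this commit makes the same one-line change: a Javadoc line starting with `@date` becomes plain text starting with `Date`. `@date` is not one of the standard Javadoc block tags, which is presumably why the commit message calls it invalid. The following is a minimal sketch of that rewrite, not the author's actual tooling; the class name `DateTagRewriter` and the method `rewrite` are hypothetical.

```java
import java.util.regex.Pattern;

public class DateTagRewriter {
    // Matches a comment line whose first token after the leading '*' is "@date".
    // Covers both forms seen in this diff: "@date: 13-7-25" and "@date Dec 14, 2012".
    private static final Pattern DATE_TAG = Pattern.compile("^(\\s*\\*\\s*)@date\\b");

    /** Returns the line with a leading "@date" tag replaced by the plain word "Date". */
    static String rewrite(String line) {
        return DATE_TAG.matcher(line).replaceFirst("$1Date");
    }

    public static void main(String[] args) {
        System.out.println(rewrite(" * @date: 13-7-25 <br>"));
        System.out.println(rewrite(" * @date Dec 14, 2012"));
        // Lines without the tag are returned unchanged.
        System.out.println(rewrite(" * @author code4crafter@gmail.com <br>"));
    }
}
```

Anchoring the pattern at the start of the comment line keeps the rewrite from touching any `@date` text that might appear mid-sentence.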
webmagic-core/src/main/java/us/codecraft/webmagic/ResultItems.java
@@ -6,7 +6,7 @@ import java.util.Map;
 /**
  * Holds the extraction results; produced by the PageProcessor and passed to {@link us.codecraft.webmagic.pipeline.Pipeline} for persistence.<br>
  * @author code4crafter@gmail.com <br>
- * @date: 13-7-25 <br>
+ * Date: 13-7-25 <br>
  * Time: 12:20 PM <br>
  */
 public class ResultItems {
webmagic-core/src/main/java/us/codecraft/webmagic/downloader/Destroyable.java
@@ -3,7 +3,7 @@ package us.codecraft.webmagic.downloader;
 /**
  * Resource-intensive services can implement this interface; the Spider calls destroy() on shutdown to release resources.<br>
  * @author code4crafter@gmail.com <br>
- * @date: 13-7-26 <br>
+ * Date: 13-7-26 <br>
  * Time: 3:10 PM <br>
  */
 public interface Destroyable {
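The Javadoc on `Destroyable` describes a cleanup contract: components that hold expensive resources opt in, and the Spider calls `destroy()` on them when it finishes. The diff shows only the interface's declaration line, so the sketch below assumes a single no-argument `destroy()` method. `FakePool` and `shutdown` are hypothetical names used for illustration and are not WebMagic code.

```java
import java.util.Arrays;
import java.util.List;

public class DestroyableSketch {
    /** Assumed shape of the interface; the diff does not show its body. */
    interface Destroyable {
        void destroy();
    }

    /** Hypothetical resource-heavy component, standing in for something like a driver pool. */
    static class FakePool implements Destroyable {
        boolean destroyed = false;

        @Override
        public void destroy() {
            destroyed = true; // a real implementation would release its resources here
        }
    }

    /** Calls destroy() on every component that opted in by implementing Destroyable. */
    static void shutdown(List<Object> components) {
        for (Object component : components) {
            if (component instanceof Destroyable) {
                ((Destroyable) component).destroy();
            }
        }
    }

    public static void main(String[] args) {
        FakePool pool = new FakePool();
        shutdown(Arrays.<Object>asList(pool, "plain component"));
        System.out.println(pool.destroyed); // prints "true"
    }
}
```

Checking with `instanceof` lets components that hold nothing worth releasing skip the interface entirely.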
webmagic-core/src/main/java/us/codecraft/webmagic/selector/AndSelector.java
@@ -5,7 +5,7 @@ import java.util.List;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-3 <br>
+ * Date: 13-8-3 <br>
  * Time: 5:29 PM <br>
  */
 public class AndSelector implements Selector {
webmagic-core/src/main/java/us/codecraft/webmagic/selector/OrSelector.java
@@ -5,7 +5,7 @@ import java.util.List;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-3 <br>
+ * Date: 13-8-3 <br>
  * Time: 5:29 PM <br>
  */
 public class OrSelector implements Selector {
webmagic-extension/src/main/java/us/codecraft/webmagic/PagedModel.java
@@ -4,7 +4,7 @@ import java.util.Collection;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-4 <br>
+ * Date: 13-8-4 <br>
  * Time: 5:18 PM <br>
  */
 public interface PagedModel {
webmagic-extension/src/main/java/us/codecraft/webmagic/model/AfterExtractor.java
@@ -6,7 +6,7 @@ import us.codecraft.webmagic.Page;
  * Implement this interface to run post-processing after extraction.<br>
  *
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-3 <br>
+ * Date: 13-8-3 <br>
  * Time: 9:42 AM <br>
  */
 public interface AfterExtractor {
webmagic-extension/src/main/java/us/codecraft/webmagic/model/ConsolePageModelPipeline.java
@@ -5,7 +5,7 @@ import us.codecraft.webmagic.Task;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-3 <br>
+ * Date: 13-8-3 <br>
  * Time: 3:41 PM <br>
  */
 public class ConsolePageModelPipeline implements PageModelPipeline {
webmagic-extension/src/main/java/us/codecraft/webmagic/model/Extractor.java
@@ -4,7 +4,7 @@ import us.codecraft.webmagic.selector.Selector;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-1 <br>
+ * Date: 13-8-1 <br>
  * Time: 9:48 PM <br>
  */
 class Extractor {
webmagic-extension/src/main/java/us/codecraft/webmagic/model/FieldExtractor.java
@@ -7,7 +7,7 @@ import java.lang.reflect.Method;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-1 <br>
+ * Date: 13-8-1 <br>
  * Time: 9:48 PM <br>
  */
 class FieldExtractor extends Extractor {
webmagic-extension/src/main/java/us/codecraft/webmagic/model/ModelPageProcessor.java
@@ -16,7 +16,7 @@ import java.util.regex.Pattern;
 /**
  * Extension point built on PageProcessor.<br>
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-1 <br>
+ * Date: 13-8-1 <br>
  * Time: 8:46 PM <br>
  */
 class ModelPageProcessor implements PageProcessor {
webmagic-extension/src/main/java/us/codecraft/webmagic/model/ModelPipeline.java
@@ -14,7 +14,7 @@ import java.util.concurrent.ConcurrentHashMap;
  * Extension point built on Pipeline, used to implement annotation-style Pipelines.<br>
  * Has a one-to-many relationship with PageModelPipeline (forgive the author for not finding a better name).<br>
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-2 <br>
+ * Date: 13-8-2 <br>
  * Time: 10:47 AM <br>
  */
 class ModelPipeline implements Pipeline {
webmagic-extension/src/main/java/us/codecraft/webmagic/model/OOSpider.java
@@ -6,7 +6,7 @@ import us.codecraft.webmagic.Spider;
 /**
  * Model-based Spider; the wrapped entry-point class.<br>
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-3 <br>
+ * Date: 13-8-3 <br>
  * Time: 9:51 AM <br>
  */
 public class OOSpider extends Spider {
webmagic-extension/src/main/java/us/codecraft/webmagic/model/PageModelExtractor.java
@@ -17,7 +17,7 @@ import java.util.regex.Pattern;
  * Main logic class for the Model. Converts an annotated POJO into a PageModelExtractor.<br>
  *
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-1 <br>
+ * Date: 13-8-1 <br>
  * Time: 9:33 PM <br>
  */
 class PageModelExtractor {
webmagic-extension/src/main/java/us/codecraft/webmagic/model/PageModelPipeline.java
@@ -4,7 +4,7 @@ import us.codecraft.webmagic.Task;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-3 <br>
+ * Date: 13-8-3 <br>
  * Time: 9:34 AM <br>
  */
 public interface PageModelPipeline<T> {
webmagic-extension/src/main/java/us/codecraft/webmagic/model/annotation/ExtractBy.java
@@ -8,7 +8,7 @@ import java.lang.annotation.Target;
  * Defines the extraction rule for a class or field.<br>
  *
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-1 <br>
+ * Date: 13-8-1 <br>
  * Time: 8:40 PM <br>
  */
 @Retention(java.lang.annotation.RetentionPolicy.RUNTIME)
webmagic-extension/src/main/java/us/codecraft/webmagic/model/annotation/ExtractBy2.java
@@ -8,7 +8,7 @@ import java.lang.annotation.Target;
  * Defines the extraction rule for a class or field; may only be used after Extract or ExtractByRaw.<br>
  *
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-1 <br>
+ * Date: 13-8-1 <br>
  * Time: 8:40 PM <br>
  */
 @Retention(java.lang.annotation.RetentionPolicy.RUNTIME)
webmagic-extension/src/main/java/us/codecraft/webmagic/model/annotation/ExtractBy3.java
@@ -7,7 +7,7 @@ import java.lang.annotation.Target;
 /**
  * Defines the extraction rule for a class or field; may only be used after Extract or ExtractByRaw.<br>
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-1 <br>
+ * Date: 13-8-1 <br>
  * Time: 8:40 PM <br>
  */
 @Retention(java.lang.annotation.RetentionPolicy.RUNTIME)
webmagic-extension/src/main/java/us/codecraft/webmagic/model/annotation/ExtractByRaw.java
@@ -8,7 +8,7 @@ import java.lang.annotation.Target;
  * For classes that already use ExtractBy at the class level, use this on a field to extract from the full content.<br>
  *
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-1 <br>
+ * Date: 13-8-1 <br>
  * Time: 8:40 PM <br>
  */
 @Retention(java.lang.annotation.RetentionPolicy.RUNTIME)
webmagic-extension/src/main/java/us/codecraft/webmagic/model/annotation/ExtractByUrl.java
@@ -7,7 +7,7 @@ import java.lang.annotation.Target;
 /**
  * Defines the extraction rule for a class or field (extracts from the URL; only regular expressions are supported).<br>
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-1 <br>
+ * Date: 13-8-1 <br>
  * Time: 8:40 PM <br>
  */
 @Retention(java.lang.annotation.RetentionPolicy.RUNTIME)
webmagic-extension/src/main/java/us/codecraft/webmagic/model/annotation/HelpUrl.java
@@ -7,7 +7,7 @@ import java.lang.annotation.Target;
 /**
  * Defines auxiliary URLs to crawl.<br>
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-1 <br>
+ * Date: 13-8-1 <br>
  * Time: 8:40 PM <br>
  */
 @Retention(java.lang.annotation.RetentionPolicy.RUNTIME)
webmagic-extension/src/main/java/us/codecraft/webmagic/model/annotation/TargetUrl.java
@@ -8,7 +8,7 @@ import java.lang.annotation.Target;
  * Defines the scope and source of extraction for a class; sourceRegion can use XPath syntax to restrict the extraction region.<br>
  *
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-1 <br>
+ * Date: 13-8-1 <br>
  * Time: 8:40 PM <br>
  */
 @Retention(java.lang.annotation.RetentionPolicy.RUNTIME)
webmagic-extension/src/main/java/us/codecraft/webmagic/pipeline/PagedPipeline.java
@@ -13,7 +13,7 @@ import java.util.concurrent.ConcurrentHashMap;
  * Do not use this feature when running a distributed crawler with Redis.<br>
  *
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-4 <br>
+ * Date: 13-8-4 <br>
  * Time: 5:15 PM <br>
  */
 public class PagedPipeline implements Pipeline {
webmagic-extension/src/main/java/us/codecraft/webmagic/scheduler/RedisScheduler.java
@@ -13,7 +13,7 @@ import us.codecraft.webmagic.schedular.Scheduler;
  * Uses Redis to manage URLs, for building a distributed crawler.<br>
  *
  * @author code4crafter@gmail.com <br>
- * @date: 13-7-25 <br>
+ * Date: 13-7-25 <br>
  * Time: 7:07 AM <br>
  */
 public class RedisScheduler implements Scheduler {
webmagic-extension/src/main/java/us/codecraft/webmagic/utils/DoubleKeyMap.java
@@ -4,7 +4,7 @@ import java.util.Map;
 /**
  * @author code4crafter@gmail.com
- * @date Dec 14, 2012
+ * Date Dec 14, 2012
  */
 public class DoubleKeyMap<K1, K2, V> extends MultiKeyMapBase {
     private Map<K1, Map<K2, V>> map;
webmagic-extension/src/main/java/us/codecraft/webmagic/utils/MultiKeyMapBase.java
@@ -2,7 +2,7 @@ package us.codecraft.webmagic.utils;
 /**
  * @author code4crafter@gmail.com
- * @date Dec 14, 2012
+ * Date Dec 14, 2012
  */
 import java.util.HashMap;
webmagic-extension/src/test/java/us/codecraft/webmagic/scheduler/RedisSchedulerTest.java
@@ -9,7 +9,7 @@ import us.codecraft.webmagic.Task;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-7-25 <br>
+ * Date: 13-7-25 <br>
  * Time: 7:51 AM <br>
  */
 public class RedisSchedulerTest {
webmagic-lucene/src/main/java/us/codecraft/webmagic/pipeline/LucenePipeline.java
@@ -26,7 +26,7 @@ import java.util.Map;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-5 <br>
+ * Date: 13-8-5 <br>
  * Time: 2:11 PM <br>
  */
 public class LucenePipeline implements Pipeline {
webmagic-lucene/src/main/test/java/us/codecraft/webmagic/lucene/OschinaBlog.java
@@ -13,7 +13,7 @@ import java.util.List;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-2 <br>
+ * Date: 13-8-2 <br>
  * Time: 7:52 AM <br>
  */
 @TargetUrl("http://my.oschina.net/flashsword/blog/\\d+")
webmagic-samples/src/main/java/us/codecraft/webmagic/main/QuickStarter.java
@@ -14,7 +14,7 @@ import java.util.Scanner;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-7 <br>
+ * Date: 13-8-7 <br>
  * Time: 9:24 PM <br>
  */
 public class QuickStarter {
webmagic-samples/src/main/java/us/codecraft/webmagic/model/samples/Blog.java
@@ -2,7 +2,7 @@ package us.codecraft.webmagic.model.samples;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-2 <br>
+ * Date: 13-8-2 <br>
  * Time: 8:10 AM <br>
  */
 public interface Blog {
webmagic-samples/src/main/java/us/codecraft/webmagic/model/samples/IteyeBlog.java
@@ -7,7 +7,7 @@ import us.codecraft.webmagic.model.annotation.TargetUrl;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-2 <br>
+ * Date: 13-8-2 <br>
  * Time: 7:52 AM <br>
  */
 @TargetUrl("http://*.iteye.com/blog/*")
webmagic-samples/src/main/java/us/codecraft/webmagic/model/samples/News163.java
@@ -16,7 +16,7 @@ import java.util.List;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-4 <br>
+ * Date: 13-8-4 <br>
  * Time: 8:17 PM <br>
  */
 @TargetUrl("http://news.163.com/\\d+/\\d+/\\d+/\\w+*.html")
webmagic-samples/src/main/java/us/codecraft/webmagic/model/samples/OschinaAnswer.java
@@ -9,7 +9,7 @@ import us.codecraft.webmagic.model.annotation.TargetUrl;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-3 <br>
+ * Date: 13-8-3 <br>
  * Time: 8:25 PM <br>
  */
 @TargetUrl("http://www.oschina.net/question/\\d+_\\d+*")
webmagic-samples/src/main/java/us/codecraft/webmagic/model/samples/OschinaBlog.java
@@ -10,7 +10,7 @@ import java.util.List;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-8-2 <br>
+ * Date: 13-8-2 <br>
  * Time: 7:52 AM <br>
  */
 @TargetUrl("http://my.oschina.net/flashsword/blog/\\d+")
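The `@TargetUrl` value on this sample is a pattern over page URLs. The diff does not show how WebMagic interprets the string (other samples in this commit use `*` as a wildcard, which suggests some preprocessing), so the sketch below only assumes the plainest reading: the string is used directly as a `java.util.regex` pattern. The class name, the `isTarget` helper, and the blog ID `12345` are hypothetical.

```java
import java.util.regex.Pattern;

public class TargetUrlPatternSketch {
    // Pattern string copied verbatim from the @TargetUrl annotation in the hunk above.
    // Read as a plain regex, the unescaped dots match any character, not only '.'.
    static final Pattern BLOG_POST =
            Pattern.compile("http://my.oschina.net/flashsword/blog/\\d+");

    /** True when the whole URL matches the blog-post pattern. */
    static boolean isTarget(String url) {
        return BLOG_POST.matcher(url).matches();
    }

    public static void main(String[] args) {
        // A post URL with a numeric ID matches.
        System.out.println(isTarget("http://my.oschina.net/flashsword/blog/12345"));
        // The blog index has no numeric ID, so it does not match.
        System.out.println(isTarget("http://my.oschina.net/flashsword/blog"));
    }
}
```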
webmagic-samples/src/main/java/us/codecraft/webmagic/samples/IteyeBlogProcessor.java
@@ -8,7 +8,7 @@ import us.codecraft.webmagic.processor.PageProcessor;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-7-26 <br>
+ * Date: 13-7-26 <br>
  * Time: 7:31 AM <br>
  */
 public class IteyeBlogProcessor implements PageProcessor {
webmagic-selenium/src/main/java/us/codecraft/webmagic/downloader/selenium/SeleniumDownloader.java
@@ -22,7 +22,7 @@ import java.util.Map;
  * Requires downloading a Selenium driver.<br>
  *
  * @author code4crafter@gmail.com <br>
- * @date: 13-7-26 <br>
+ * Date: 13-7-26 <br>
  * Time: 1:37 PM <br>
  */
 public class SeleniumDownloader implements Downloader, Destroyable {
webmagic-selenium/src/main/java/us/codecraft/webmagic/downloader/selenium/WebDriverPool.java
@@ -12,7 +12,7 @@ import java.util.concurrent.atomic.AtomicInteger;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-7-26 <br>
+ * Date: 13-7-26 <br>
  * Time: 1:41 PM <br>
  */
 class WebDriverPool {
webmagic-selenium/src/test/java/us/codecraft/webmagic/downloader/SeleniumTest.java
@@ -14,7 +14,7 @@ import java.util.Map;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-7-26 <br>
+ * Date: 13-7-26 <br>
  * Time: 12:27 PM <br>
  */
 public class SeleniumTest {
webmagic-selenium/src/test/java/us/codecraft/webmagic/downloader/selenium/SeleniumDownloaderTest.java
@@ -9,7 +9,7 @@ import us.codecraft.webmagic.Task;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-7-26 <br>
+ * Date: 13-7-26 <br>
  * Time: 2:46 PM <br>
  */
 public class SeleniumDownloaderTest {
webmagic-selenium/src/test/java/us/codecraft/webmagic/downloader/selenium/WebDriverPoolTest.java
@@ -6,7 +6,7 @@ import org.openqa.selenium.WebDriver;
 /**
  * @author code4crafter@gmail.com <br>
- * @date: 13-7-26 <br>
+ * Date: 13-7-26 <br>
  * Time: 2:12 PM <br>
  */
 public class WebDriverPoolTest {
webmagic-selenium/src/test/java/us/codecraft/webmagic/samples/HuabanProcessor.java
@@ -11,7 +11,7 @@ import us.codecraft.webmagic.processor.PageProcessor;
  * Extractor for the Huaban site.<br>
  * Uses Selenium for dynamic page rendering.<br>
  * @author code4crafter@gmail.com <br>
- * @date: 13-7-26 <br>
+ * Date: 13-7-26 <br>
  * Time: 4:08 PM <br>
  */
 public class HuabanProcessor implements PageProcessor {