沈俊林 / webmagic
Commit 3b1993ea, authored Aug 10, 2013 by yihua.huang
update docs
parent 19229dd8
Showing 2 changed files with 8 additions and 5 deletions (+8 -5)
webmagic manual.md: +3 -0
...java/us/codecraft/webmagic/model/samples/OschinaBlog.java: +5 -5
webmagic manual.md
...
...
@@ -352,6 +352,7 @@ webmagic-extension includes the annotation module. Why is there an annotation approach?
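The AfterExtractor interface supplements the annotation approach where its extraction ability falls short. When a class implements AfterExtractor, the **afterProcess()** method is called **after the fields have been populated via annotations**. Inside this method you can directly access the extracted fields, fill in additional fields, and even do some simple output or persistence (though this is not really recommended). For reference, see [a snippet that combines webmagic with JFinal to persist to a database](http://www.oschina.net/code/snippet_190591_23456).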
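
The call order described above (populate fields first, then run the hook) can be sketched in plain Java. This is only an illustration and does not use WebMagic: `AfterHook` is a simplified local stand-in for AfterExtractor (the real signature of afterProcess() is not shown in this diff), and `Article`, `slug`, and `populate` are hypothetical names.

```java
import java.util.Locale;

public class AfterHookSketch {

    /** Simplified stand-in for an "after extraction" hook (assumption, not the WebMagic interface). */
    interface AfterHook {
        void afterProcess();
    }

    /** Hypothetical model: title is set by the extraction step, slug is derived in the hook. */
    static class Article implements AfterHook {
        String title;
        String slug;

        @Override
        public void afterProcess() {
            // The extracted field is already available here, so a derived field can be filled in.
            slug = title.trim().toLowerCase(Locale.ROOT).replace(' ', '-');
        }
    }

    /** Runs the hook, if the model has one, once its fields have been populated. */
    static <T> T populate(T model) {
        if (model instanceof AfterHook) {
            ((AfterHook) model).afterProcess();
        }
        return model;
    }

    public static void main(String[] args) {
        Article article = new Article();
        article.title = "Update Docs";   // stands in for annotation-based extraction
        populate(article);
        System.out.println(article.slug); // update-docs
    }
}
```

Keeping the hook limited to deriving fields matches the manual's advice that output and persistence are better handled elsewhere.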
* #### OOSpider
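OOSpider is the entry point for annotation-based crawlers. Here the **create()** method is called to add the OschinaBlog class to the crawler's extraction. Multiple classes can be passed in, for example:
OOSpider.create(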
...
...
@@ -362,7 +363,9 @@ webmagic-extension includes the annotation module. Why is there an annotation approach?
OOSpider selects which Model to use for parsing based on TargetUrl.
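
The idea of choosing a Model by URL can be sketched as a small pattern-to-model lookup. This is a minimal illustration in plain Java, not how OOSpider is implemented: the class `DispatchSketch`, the model names, and the example.com URLs are all hypothetical, and it uses ordinary `java.util.regex` patterns rather than whatever syntax TargetUrl accepts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class DispatchSketch {

    /** One route: a URL pattern and the name of the model that parses matching pages. */
    private static final class Route {
        final Pattern pattern;
        final String modelName;

        Route(Pattern pattern, String modelName) {
            this.pattern = pattern;
            this.modelName = modelName;
        }
    }

    private final List<Route> routes = new ArrayList<>();

    /** Registers a model under a regular expression that must match the whole URL. */
    void register(String urlRegex, String modelName) {
        routes.add(new Route(Pattern.compile(urlRegex), modelName));
    }

    /** Returns the first registered model whose pattern matches the URL, or null if none does. */
    String modelFor(String url) {
        for (Route route : routes) {
            if (route.pattern.matcher(url).matches()) {
                return route.modelName;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        DispatchSketch dispatcher = new DispatchSketch();
        dispatcher.register("http://example\\.com/blog/\\d+", "BlogModel");
        dispatcher.register("http://example\\.com/question/\\d+", "QuestionModel");

        System.out.println(dispatcher.modelFor("http://example.com/blog/42"));
        System.out.println(dispatcher.modelFor("http://example.com/question/7"));
    }
}
```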
* #### PageModelPipeline
You can choose how results are output by defining a PageModelPipeline. Here, new ConsolePageModelPipeline() is one implementation of PageModelPipeline; it prints the results to the console.
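PageModelPipeline has another implementation, **JsonFilePageModelPipeline**, which serializes objects as JSON and persists them to files. By default, JsonFilePageModelPipeline uses the object's MD5 value as the file name; you can implement the HasKey interface in your Model to specify the output file name.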
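
The file-naming rule above can be sketched in plain Java. This is a minimal illustration, not WebMagic's implementation: the `HasKey` interface below is a local stand-in declared inside the sketch, `FileNameSketch`, `KeyedBlog`, and `fileNameFor` are hypothetical names, and hashing the object's `toString()` is an assumption about what "the object's MD5 value" means.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class FileNameSketch {

    /** Local stand-in for a key-providing interface (assumption, declared here for the sketch). */
    interface HasKey {
        String key();
    }

    /** Hypothetical model that supplies its own key, as OschinaBlog does with its title. */
    static class KeyedBlog implements HasKey {
        private final String title;

        KeyedBlog(String title) {
            this.title = title;
        }

        @Override
        public String key() {
            return title;
        }
    }

    /** Hex-encoded MD5 of a string, zero-padded to 32 characters. */
    static String md5Hex(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(s.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Uses key() when the model provides one; otherwise falls back to an MD5 of toString(). */
    static String fileNameFor(Object model) {
        if (model instanceof HasKey) {
            return ((HasKey) model).key() + ".json";
        }
        return md5Hex(String.valueOf(model)) + ".json";
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor(new KeyedBlog("hello-webmagic"))); // hello-webmagic.json
        System.out.println(fileNameFor("a model without a key"));         // 32 hex characters + ".json"
    }
}
```

A title used directly as a file name may contain characters that are invalid on some file systems, so a real key() would likely need sanitizing.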
* #### Pagination
...
...
webmagic-samples/src/main/java/us/codecraft/webmagic/model/samples/OschinaBlog.java
...
...
@@ -31,6 +31,11 @@ public class OschinaBlog implements HasKey{
                , new JsonFilePageModelPipeline(), OschinaBlog.class).run();
    }

    @Override
    public String key() {
        return title;
    }

    public String getTitle() {
        return title;
    }
...
...
@@ -42,9 +47,4 @@ public class OschinaBlog implements HasKey{
    public List<String> getTags() {
        return tags;
    }

    @Override
    public String key() {
        return title;
    }
}