Commit dafd0b58 authored Apr 04, 2014 by yihua.huang
[BugFix]multi model in one pageprocessor will be skipped #85
parent 7aaf837e
Showing 2 changed files with 49 additions and 1 deletion (+49 -1)
.../java/us/codecraft/webmagic/model/ModelPageProcessor.java  +4 -1
...a/us/codecraft/webmagic/model/ModelPageProcessorTest.java  +45 -0
webmagic-extension/src/main/java/us/codecraft/webmagic/model/ModelPageProcessor.java
@@ -55,11 +55,14 @@ class ModelPageProcessor implements PageProcessor {
             extractLinks(page, pageModelExtractor.getTargetUrlRegionSelector(), pageModelExtractor.getTargetUrlPatterns());
             Object process = pageModelExtractor.process(page);
             if (process == null || (process instanceof List && ((List) process).size() == 0)) {
-                page.getResultItems().setSkip(true);
+                continue;
             }
             postProcessPageModel(pageModelExtractor.getClazz(), process);
             page.putField(pageModelExtractor.getClazz().getCanonicalName(), process);
         }
+        if (page.getResultItems().getAll().size() == 0) {
+            page.getResultItems().setSkip(true);
+        }
     }

     private void extractLinks(Page page, Selector urlRegionSelector, List<Pattern> urlPatterns) {
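The skip decision this commit changes can be illustrated with a minimal, self-contained sketch. It does not use the webmagic classes: `Model`, `skipBefore` and `skipAfter` are hypothetical stand-ins invented for illustration, and extraction is reduced to an exact URL comparison. Under those assumptions, it shows why a page matched by only one of several models used to be skipped and no longer is.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SkipDecisionSketch {

    /** Hypothetical stand-in for a page model: a name and the single URL it targets. */
    static final class Model {
        final String name;
        final String targetUrl;

        Model(String name, String targetUrl) {
            this.name = name;
            this.targetUrl = targetUrl;
        }

        /** Returns a dummy result, or null when the URL does not match (simplified). */
        Object extract(String url) {
            return targetUrl.equals(url) ? name + "-result" : null;
        }
    }

    /** Same emptiness check as in the diff: null or an empty List. */
    static boolean isEmpty(Object result) {
        return result == null || (result instanceof List && ((List<?>) result).isEmpty());
    }

    /** Before the fix: a single non-matching model marks the whole page as skipped. */
    static boolean skipBefore(String url, List<Model> models) {
        boolean skip = false;
        for (Model m : models) {
            if (isEmpty(m.extract(url))) {
                skip = true;
            }
        }
        return skip;
    }

    /** After the fix: the page is skipped only when no model produced a result. */
    static boolean skipAfter(String url, List<Model> models) {
        Map<String, Object> fields = new LinkedHashMap<>();
        for (Model m : models) {
            Object result = m.extract(url);
            if (isEmpty(result)) {
                continue; // this model did not match; try the next one
            }
            fields.put(m.name, result);
        }
        return fields.isEmpty();
    }

    public static void main(String[] args) {
        List<Model> models = Arrays.asList(
                new Model("ModelFoo", "http://codecraft.us/foo"),
                new Model("ModelBar", "http://codecraft.us/bar"));
        String url = "http://codecraft.us/foo";
        System.out.println("before: skip=" + skipBefore(url, models)); // before: skip=true
        System.out.println("after:  skip=" + skipAfter(url, models));  // after:  skip=false
    }
}
```

With the two stand-in models and the `/foo` URL, the pre-fix logic reports the page as skipped because `ModelBar` does not match, while the post-fix logic keeps it because `ModelFoo` produced a result.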
webmagic-extension/src/test/java/us/codecraft/webmagic/model/ModelPageProcessorTest.java
0 → 100644
package us.codecraft.webmagic.model;

import org.junit.Test;
import us.codecraft.webmagic.Page;
import us.codecraft.webmagic.Request;
import us.codecraft.webmagic.model.annotation.ExtractBy;
import us.codecraft.webmagic.model.annotation.TargetUrl;
import us.codecraft.webmagic.selector.PlainText;

import static org.assertj.core.api.Assertions.assertThat;

/**
 * @author code4crafter@gmail.com
 * @date 14-4-4
 */
public class ModelPageProcessorTest {

    @TargetUrl("http://codecraft.us/foo")
    public static class ModelFoo {

        @ExtractBy(value = "//div/@foo", notNull = true)
        private String foo;

    }

    @TargetUrl("http://codecraft.us/bar")
    public static class ModelBar {

        @ExtractBy(value = "//div/@bar", notNull = true)
        private String bar;

    }

    @Test
    public void testMultiModel_should_not_skip_when_match() throws Exception {
        Page page = new Page();
        page.setRawText("<div foo='foo'></div>");
        page.setRequest(new Request("http://codecraft.us/foo"));
        page.setUrl(PlainText.create("http://codecraft.us/foo"));
        ModelPageProcessor modelPageProcessor = ModelPageProcessor.create(null, ModelFoo.class, ModelBar.class);
        modelPageProcessor.process(page);
        assertThat(page.getResultItems().isSkip()).isFalse();
    }

}