沈俊林 / webmagic
Commit eb376fca, authored Jun 24, 2017 by yihua.huang
update jsoup to 1.10.3 #608
Parent: faca38d4

Showing 3 changed files with 2 additions and 19 deletions (+2 -19):
pom.xml (+1 -1)
...re/src/main/java/us/codecraft/webmagic/selector/Html.java (+1 -17)
...ic-core/src/test/java/us/codecraft/webmagic/HtmlTest.java (+0 -1)

pom.xml
...
@@ -146,7 +146,7 @@
         <dependency>
             <groupId>org.jsoup</groupId>
             <artifactId>jsoup</artifactId>
-            <version>1.8.3</version>
+            <version>1.10.3</version>
         </dependency>
         <dependency>
             <groupId>org.mockito</groupId>
...

webmagic-core/src/main/java/us/codecraft/webmagic/selector/Html.java
...
@@ -3,7 +3,6 @@ package us.codecraft.webmagic.selector;
 import org.jsoup.Jsoup;
 import org.jsoup.nodes.Document;
 import org.jsoup.nodes.Element;
-import org.jsoup.nodes.Entities;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
...
@@ -20,25 +19,12 @@ public class Html extends HtmlNode {
     private Logger logger = LoggerFactory.getLogger(getClass());
-    private static volatile boolean INITED = false;
     /**
      * Disable jsoup html entity escape. It can be set just before any Html instance is created.
+     * @deprecated
      */
     public static boolean DISABLE_HTML_ENTITY_ESCAPE = false;
-    /**
-     * Disable jsoup html entity escape. It is a hack way only for jsoup 1.7.2.
-     */
-    private void disableJsoupHtmlEntityEscape() {
-        if (DISABLE_HTML_ENTITY_ESCAPE && !INITED) {
-            Entities.EscapeMode.base.getMap().clear();
-            Entities.EscapeMode.extended.getMap().clear();
-            Entities.EscapeMode.xhtml.getMap().clear();
-            INITED = true;
-        }
-    }
     /**
      * Store parsed document for better performance when only one text exist.
      */
...
@@ -46,7 +32,6 @@ public class Html extends HtmlNode {
     public Html(String text, String url) {
         try {
-            disableJsoupHtmlEntityEscape();
             this.document = Jsoup.parse(text, url);
         } catch (Exception e) {
             this.document = null;
...
@@ -56,7 +41,6 @@ public class Html extends HtmlNode {
     public Html(String text) {
         try {
-            disableJsoupHtmlEntityEscape();
             this.document = Jsoup.parse(text);
         } catch (Exception e) {
             this.document = null;
...

webmagic-core/src/test/java/us/codecraft/webmagic/HtmlTest.java
...
@@ -30,7 +30,6 @@ public class HtmlTest {
     @Test
     public void testEnableJsoupHtmlEntityEscape() throws Exception {
-        Html.DISABLE_HTML_ENTITY_ESCAPE = false;
         Html html = new Html("aaaaaaa&b");
         assertThat(html.regex("(aaaaaaa&b)").toString()).isEqualTo("aaaaaaa&b");
     }
...