Commit 6f5b9e44, authored Jul 29, 2017 by yihua.huang
#627 set charset to request
parent 32f1f2cf

Showing 3 changed files with 44 additions and 1 deletion:

Request.java                   +10 -0
HttpClientDownloader.java      +1 -1
HttpClientDownloaderTest.java  +33 -0

webmagic-core/src/main/java/us/codecraft/webmagic/Request.java

@@ -51,6 +51,8 @@ public class Request implements Serializable {
      */
     private boolean binaryContent = false;
 
+    private String charset;
+
     public Request() {
     }
 
@@ -176,6 +178,14 @@ public class Request implements Serializable {
         this.binaryContent = binaryContent;
     }
 
+    public String getCharset() {
+        return charset;
+    }
+
+    public void setCharset(String charset) {
+        this.charset = charset;
+    }
+
     @Override
     public String toString() {
         return "Request{" +

webmagic-core/src/main/java/us/codecraft/webmagic/downloader/HttpClientDownloader.java

@@ -83,7 +83,7 @@ public class HttpClientDownloader extends AbstractDownloader {
         Page page = Page.fail();
         try {
             httpResponse = httpClient.execute(requestContext.getHttpUriRequest(), requestContext.getHttpClientContext());
-            page = handleResponse(request, task.getSite().getCharset(), httpResponse, task);
+            page = handleResponse(request, request.getCharset() != null ? request.getCharset() : task.getSite().getCharset(), httpResponse, task);
             onSuccess(request);
             logger.info("downloading page success {}", request.getUrl());
             return page;
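
Reviewer note: the one-line change to HttpClientDownloader above boils down to a null-checked fallback from the request-level charset to the site-level one. The sketch below restates that rule in isolation. It is a standalone illustration, not WebMagic code: the class `CharsetFallbackSketch`, the helper `decode`, and the UTF-8 fallback used when both values are null are all assumptions made for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Standalone sketch; none of these names are WebMagic classes.
class CharsetFallbackSketch {

    /** Same shape as the ternary in the diff: a non-null request charset wins over the site charset. */
    static String effectiveCharset(String requestCharset, String siteCharset) {
        return requestCharset != null ? requestCharset : siteCharset;
    }

    /**
     * Hypothetical helper: decodes a response body with the chosen charset.
     * Falling back to UTF-8 when both values are null is an assumption of this
     * sketch, not behavior taken from the commit.
     */
    static String decode(byte[] body, String requestCharset, String siteCharset) {
        String name = effectiveCharset(requestCharset, siteCharset);
        Charset charset = name != null ? Charset.forName(name) : StandardCharsets.UTF_8;
        return new String(body, charset);
    }

    public static void main(String[] args) {
        System.out.println(effectiveCharset("utf-8", "gbk")); // prints utf-8
        System.out.println(effectiveCharset(null, "gbk"));    // prints gbk

        // Bytes encoded as ISO-8859-1 decode correctly only when the request-level value is used.
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(decode(latin1, "ISO-8859-1", "UTF-8").equals("caf\u00e9")); // prints true
    }
}
```

One thing worth checking in review: the condition in the diff tests only for null, so an empty string set on the request would still take precedence over the site charset and be passed on to handleResponse.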

webmagic-core/src/test/java/us/codecraft/webmagic/downloader/HttpClientDownloaderTest.java

@@ -289,4 +289,37 @@ public class HttpClientDownloaderTest {
         });
     }
 
+    @Test
+    public void test_download_set_charset() throws Exception {
+        HttpServer server = httpServer(13423);
+        server.response(header("Content-Type", "text/html; charset=utf-8")).response("hello world!");
+        Runner.running(server, new Runnable() {
+            @Override
+            public void run() throws Exception {
+                final HttpClientDownloader httpClientDownloader = new HttpClientDownloader();
+                Request request = new Request();
+                request.setUrl("http://127.0.0.1:13423/");
+                Page page = httpClientDownloader.download(request, Site.me().toTask());
+                assertThat(page.getCharset()).isEqualTo("utf-8");
+            }
+        });
+    }
+
+    @Test
+    public void test_download_set_request_charset() throws Exception {
+        HttpServer server = httpServer(13423);
+        server.response("hello world!");
+        Runner.running(server, new Runnable() {
+            @Override
+            public void run() throws Exception {
+                final HttpClientDownloader httpClientDownloader = new HttpClientDownloader();
+                Request request = new Request();
+                request.setCharset("utf-8");
+                request.setUrl("http://127.0.0.1:13423/");
+                Page page = httpClientDownloader.download(request, Site.me().setCharset("gbk").toTask());
+                assertThat(page.getCharset()).isEqualTo("utf-8");
+            }
+        });
+    }
+
 }
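
Reviewer note: the two tests above cover different inputs. The first sets no charset on the request or the site and relies on the response's Content-Type header. The second sends no Content-Type header and sets "utf-8" on the request against "gbk" on the site. The sketch below models both scenarios without a server. It is a simplified stand-in: `CharsetScenarioSketch`, `charsetFromContentType`, and `resolve` are hypothetical names, and the assumption that the header is consulted only when neither the request nor the site specifies a charset is inferred from the first test's expectation, not confirmed by this diff.

```java
import java.util.Locale;

// Standalone model of the two test scenarios; not WebMagic code.
class CharsetScenarioSketch {

    /** Hypothetical helper: returns the charset parameter of a Content-Type value, or null if absent. */
    static String charsetFromContentType(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String trimmed = part.trim();
            if (trimmed.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = trimmed.substring("charset=".length()).trim();
                // Drop optional surrounding double quotes, e.g. charset="utf-8".
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    /** Assumed order for this sketch: request value, then site value, then the header. */
    static String resolve(String requestCharset, String siteCharset, String contentType) {
        if (requestCharset != null) {
            return requestCharset;
        }
        if (siteCharset != null) {
            return siteCharset;
        }
        return charsetFromContentType(contentType);
    }

    public static void main(String[] args) {
        // Scenario of test_download_set_charset: nothing configured, header declares utf-8.
        System.out.println(resolve(null, null, "text/html; charset=utf-8")); // prints utf-8
        // Scenario of test_download_set_request_charset: request overrides the site, no header.
        System.out.println(resolve("utf-8", "gbk", null)); // prints utf-8
    }
}
```

A case the commit's tests do not exercise is a request charset that disagrees with the header. Under the order assumed here the request value would win, which could be worth a third test if that is the intended behavior.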