沈俊林 / webmagic
Commit 3266ea15, authored Jul 22, 2017 by yihua.huang
#629 correct illegal url in HttpUriRequestConverter
parent 5daf92e8

Showing 3 changed files with 39 additions and 3 deletions:

HttpUriRequestConverter.java (+1 -1)
UrlUtils.java (+7 -2)
HttpUriRequestConverterTest.java (+31 -0)
webmagic-core/src/main/java/us/codecraft/webmagic/downloader/HttpUriRequestConverter.java
@@ -58,7 +58,7 @@ public class HttpUriRequestConverter {
     }
 
     private HttpUriRequest convertHttpUriRequest(Request request, Site site, Proxy proxy) {
-        RequestBuilder requestBuilder = selectRequestMethod(request).setUri(request.getUrl());
+        RequestBuilder requestBuilder = selectRequestMethod(request).setUri(UrlUtils.fixIllegalCharacterInUrl(request.getUrl()));
         if (site.getHeaders() != null) {
             for (Map.Entry<String, String> headerEntry : site.getHeaders().entrySet()) {
                 requestBuilder.addHeader(headerEntry.getKey(), headerEntry.getValue());
webmagic-core/src/main/java/us/codecraft/webmagic/utils/UrlUtils.java
@@ -43,7 +43,7 @@ public class UrlUtils {
             if (url.startsWith("?"))
                 url = base.getPath() + url;
             URL abs = new URL(base, url);
-            return encodeIllegalCharacterInUrl(abs.toExternalForm());
+            return abs.toExternalForm();
         } catch (MalformedURLException e) {
             return "";
         }
@@ -53,12 +53,17 @@ public class UrlUtils {
      *
      * @param url url
      * @return new url
+     * @deprecated
      */
     public static String encodeIllegalCharacterInUrl(String url) {
-        //TODO more charator support
         return url.replace(" ", "%20");
     }
 
+    public static String fixIllegalCharacterInUrl(String url) {
+        //TODO more charator support
+        return url.replace(" ", "%20").replaceAll("#+", "#");
+    }
+
     public static String getHost(String url) {
         String host = url;
         int i = StringUtils.ordinalIndexOf(url, "/", 3);
webmagic-core/src/test/java/us/codecraft/webmagic/downloader/HttpUriRequestConverterTest.java
0 → 100644
+package us.codecraft.webmagic.downloader;
+
+import org.junit.Test;
+import us.codecraft.webmagic.Request;
+import us.codecraft.webmagic.Site;
+import us.codecraft.webmagic.utils.UrlUtils;
+
+import java.net.URI;
+
+import static org.assertj.core.api.Assertions.assertThat;
+
+/**
+ * @author code4crafter@gmail.com
+ *         Date: 2017/7/22
+ *         Time: 5:29 PM
+ */
+public class HttpUriRequestConverterTest {
+
+    @Test(expected = IllegalArgumentException.class)
+    public void test_illegal_uri() throws Exception {
+        HttpUriRequestConverter httpUriRequestConverter = new HttpUriRequestConverter();
+        httpUriRequestConverter.convert(new Request("http://bj.zhongkao.com/beikao/yimo/##"), Site.me(), null);
+    }
+
+    @Test
+    public void test_illegal_uri_correct() throws Exception {
+        HttpUriRequestConverter httpUriRequestConverter = new HttpUriRequestConverter();
+        HttpClientRequestContext requestContext = httpUriRequestConverter.convert(new Request(UrlUtils.fixIllegalCharacterInUrl("http://bj.zhongkao.com/beikao/yimo/##")), Site.me(), null);
+        assertThat(requestContext.getHttpUriRequest().getURI()).isEqualTo(new URI("http://bj.zhongkao.com/beikao/yimo/#"));
+    }
+}
\ No newline at end of file
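
To see the effect of this change without the WebMagic or HttpClient dependencies, the following is a minimal standalone sketch. The class name `FixUrlDemo` and the helper `isParsable` are hypothetical and not part of the commit; the body of `fixIllegalCharacterInUrl` is copied from the `UrlUtils` change in this diff. It assumes that parsing with `java.net.URI` is a fair stand-in for the check that rejected the original URL, which is what the `IllegalArgumentException` expected by the test suggests.

```java
import java.net.URI;

public class FixUrlDemo {

    // Same one-line body as UrlUtils.fixIllegalCharacterInUrl in this commit:
    // encode spaces as %20 and collapse any run of '#' into a single '#'.
    public static String fixIllegalCharacterInUrl(String url) {
        return url.replace(" ", "%20").replaceAll("#+", "#");
    }

    // Hypothetical helper: reports whether java.net.URI accepts the string.
    // URI.create wraps the URISyntaxException in an IllegalArgumentException.
    public static boolean isParsable(String url) {
        try {
            URI.create(url);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String raw = "http://bj.zhongkao.com/beikao/yimo/##";
        String fixed = fixIllegalCharacterInUrl(raw);
        System.out.println(raw + " parsable: " + isParsable(raw));
        System.out.println(fixed + " parsable: " + isParsable(fixed));
    }
}
```

The second `#` is what makes the raw URL invalid: everything after the first `#` is the fragment, and a literal `#` is not allowed inside a fragment. Collapsing the run leaves an empty fragment, which parses. Note that the method only handles spaces and repeated `#`, as the TODO in the commit acknowledges.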