沈俊林 / webmagic
Commit 65fe2c44
Authored Dec 02, 2016 by Yihua Huang
Committed by GitHub, Dec 02, 2016
Merge pull request #407 from jsbd/master
Add a new constructor to PhantomJSDownloader that supports a custom phantomjs command
Parents: fdf39eb9, ebc61363
Showing 1 changed file with 22 additions and 3 deletions
webmagic-extension/src/main/java/us/codecraft/webmagic/downloader/PhantomJSDownloader.java
@@ -20,13 +20,32 @@ import java.io.*;
 public class PhantomJSDownloader extends AbstractDownloader {
 
     private static Logger logger = LoggerFactory.getLogger(PhantomJSDownloader.class);
-    private static String phantomJSPath;
+    private static String crawlJsPath;
+    private static String phantomJsCommand = "phantomjs"; // default
 
     private int retryNum;
     private int threadNum;
 
     public PhantomJSDownloader() {
-        PhantomJSDownloader.phantomJSPath = new File(this.getClass().getResource("/").getPath()).getPath() + System.getProperty("file.separator") + "crawl.js ";
+        this.initPhantomjsCrawlPath();
+    }
+
+    /**
+     * Adds a new constructor that supports a custom phantomjs command.
+     *
+     * example:
+     *    phantomjs.exe  supports Windows environments
+     *    phantomjs --ignore-ssl-errors=yes  ignores some errors when the crawled address is https
+     *    /usr/local/bin/phantomjs  absolute path of the command, avoiding an IOException caused by system environment variables
+     *
+     * @param phantomJsCommand
+     */
+    public PhantomJSDownloader(String phantomJsCommand) {
+        this.initPhantomjsCrawlPath();
+        PhantomJSDownloader.phantomJsCommand = phantomJsCommand;
+    }
+    private void initPhantomjsCrawlPath() {
+        PhantomJSDownloader.crawlJsPath = new File(this.getClass().getResource("/").getPath()).getPath() + System.getProperty("file.separator") + "crawl.js ";
     }
 
     @Override
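One detail of the new constructor may be worth illustrating: it assigns a static field, so the command is shared by every instance in the JVM. The following is a minimal sketch only. `StaticCommandSketch` and its nested `Downloader` are hypothetical stand-ins that model just the `phantomJsCommand` field from the diff, not the real `PhantomJSDownloader`.

```java
public class StaticCommandSketch {

    // Hypothetical stand-in for PhantomJSDownloader; only the command field is modelled.
    static class Downloader {
        // Static, as in the diff: one value shared by all instances.
        private static String phantomJsCommand = "phantomjs"; // default

        Downloader() {
            // Leaves the shared command untouched.
        }

        Downloader(String phantomJsCommand) {
            // Same assignment pattern as the new constructor in the diff.
            Downloader.phantomJsCommand = phantomJsCommand;
        }

        String command() {
            return phantomJsCommand;
        }
    }

    public static void main(String[] args) {
        Downloader first = new Downloader();
        System.out.println(first.command());   // prints "phantomjs"

        // Constructing a second instance with a custom command...
        new Downloader("/usr/local/bin/phantomjs");

        // ...also changes what the first instance reports.
        System.out.println(first.command());   // prints "/usr/local/bin/phantomjs"
    }
}
```

If two downloaders in one process need different commands, an instance field would be one way to keep them apart; that is a possible follow-up, not something this commit does.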
@@ -67,7 +86,7 @@ public class PhantomJSDownloader extends AbstractDownloader {
         try {
             String url = request.getUrl();
             Runtime runtime = Runtime.getRuntime();
-            Process process = runtime.exec("phantomjs " + phantomJSPath + url);
+            Process process = runtime.exec(phantomJsCommand + " " + crawlJsPath + url);
             InputStream is = process.getInputStream();
             BufferedReader br = new BufferedReader(new InputStreamReader(is));
             StringBuffer stringBuffer = new StringBuffer();
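The changed line passes one concatenated string to `Runtime.exec(String)`, which is documented to split the command with `new StringTokenizer(command)`, that is, on whitespace. The sketch below reproduces only that concatenation and splitting, without starting a process, to show why a command with flags such as `phantomjs --ignore-ssl-errors=yes` works and why `crawlJsPath` ends with a space. The class name `ExecTokenSketch`, the path `/opt/app/crawl.js`, and the URL are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class ExecTokenSketch {

    // Mirrors the concatenation in the changed line:
    // phantomJsCommand + " " + crawlJsPath + url.
    // crawlJsPath is expected to end with a space, as in the diff ("crawl.js ").
    static String buildCommand(String phantomJsCommand, String crawlJsPath, String url) {
        return phantomJsCommand + " " + crawlJsPath + url;
    }

    // Splits the string the way Runtime.exec(String) is documented to:
    // with a default StringTokenizer, i.e. on whitespace.
    static List<String> tokenize(String command) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(command);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String command = buildCommand(
                "phantomjs --ignore-ssl-errors=yes", // custom command from the Javadoc examples
                "/opt/app/crawl.js ",                // illustrative path, trailing space included
                "https://example.com/");             // illustrative URL
        System.out.println(tokenize(command));
        // prints [phantomjs, --ignore-ssl-errors=yes, /opt/app/crawl.js, https://example.com/]
    }
}
```

The same splitting means a script path or command path that contains a space would be broken into several arguments; passing an argument array to `ProcessBuilder` is one common way around that, though it is outside the scope of this commit.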