沈俊林 / webmagic
Commit 77e6ca29, authored Aug 17, 2013 by yihua.huang
update comments
parent 50732582
Showing 6 changed files with 17 additions and 19 deletions
PlainText.java  ...c/main/java/us/codecraft/webmagic/selector/PlainText.java  +5 -4
Experimental.java  ...c/main/java/us/codecraft/webmagic/utils/Experimental.java  +1 -1
FilePersistentBase.java  .../java/us/codecraft/webmagic/utils/FilePersistentBase.java  +2 -3
ThreadUtils.java  ...rc/main/java/us/codecraft/webmagic/utils/ThreadUtils.java  +1 -3
UrlUtils.java  ...e/src/main/java/us/codecraft/webmagic/utils/UrlUtils.java  +7 -7
package.html  ...re/src/main/java/us/codecraft/webmagic/utils/package.html  +1 -1
webmagic-core/src/main/java/us/codecraft/webmagic/selector/PlainText.java
...
...
@@ -6,10 +6,11 @@ import java.util.ArrayList;
 import java.util.List;
 /**
- * 可抽取的纯文本,不包括xpath和css selector实现。<br>
+ * Selectable plain text.<br>
+ * Can not be selected by XPath or CSS Selector.
  *
  * @author code4crafter@gmail.com <br>
  * Date: 13-4-21
  * Time: 上午7:54
  * @since 0.1.0
  */
 public class PlainText implements Selectable {
...
...
@@ -59,7 +60,7 @@ public class PlainText implements Selectable {
         List<String> results = new ArrayList<String>();
         for (String string : strings) {
             String result = selector.select(string);
-            if (result != null) {
+            if (result != null) {
                 results.add(result);
             }
         }
...
...
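The loop in the second hunk applies one selector to every input string and keeps only the non-null results. A minimal standalone sketch of that pattern follows. It assumes a plain `java.util.function.Function<String, String>` in place of webmagic's selector type, and the class and method names (`SelectLoopSketch`, `selectAll`) are hypothetical, not part of the project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the "select, then drop nulls" loop shown in the diff.
final class SelectLoopSketch {

    // Applies the selector to each string and collects only non-null results.
    static List<String> selectAll(Function<String, String> selector, List<String> strings) {
        List<String> results = new ArrayList<String>();
        for (String string : strings) {
            String result = selector.apply(string);
            if (result != null) {
                results.add(result);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        final Pattern digits = Pattern.compile("\\d+");
        // Assumed selector: returns the first run of digits, or null when there is none.
        Function<String, String> firstNumber = s -> {
            Matcher m = digits.matcher(s);
            return m.find() ? m.group() : null;
        };
        System.out.println(selectAll(firstNumber, Arrays.asList("page 12", "no number", "id 7")));
    }
}
```

Inputs for which the selector returns null are skipped, so the result list can be shorter than the input list.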
webmagic-core/src/main/java/us/codecraft/webmagic/utils/Experimental.java
package us.codecraft.webmagic.utils;
/**
 * Stands for features unstable.
 * @author code4crafter@gmail.com <br>
 * Stands for features not stable.
 */
public @interface Experimental {
}
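For readers unfamiliar with marker annotations, here is a rough sketch of how a type like `Experimental` could be applied and detected. It is an illustration under assumptions: the annotation in the diff declares no retention policy, so it would not be visible to reflection. The sketch therefore defines its own hypothetical `UnstableFeature` annotation with runtime retention, and `NewDownloader`, `StableDownloader`, and `isUnstable` are likewise made-up names.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical sketch; none of these names exist in webmagic.
final class ExperimentalSketch {

    // Marker annotation with no members. RUNTIME retention is an assumption
    // made here so that reflection can see it.
    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD})
    @interface UnstableFeature {
    }

    // A class flagged as unstable.
    @UnstableFeature
    static class NewDownloader {
    }

    // A class without the marker.
    static class StableDownloader {
    }

    // Returns true when the type carries the marker annotation.
    static boolean isUnstable(Class<?> type) {
        return type.isAnnotationPresent(UnstableFeature.class);
    }

    public static void main(String[] args) {
        System.out.println(isUnstable(NewDownloader.class));
        System.out.println(isUnstable(StableDownloader.class));
    }
}
```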
webmagic-core/src/main/java/us/codecraft/webmagic/utils/FilePersistentBase.java
...
...
@@ -3,11 +3,10 @@ package us.codecraft.webmagic.utils;
 import java.io.File;
 /**
- * 文件持久化的基础类。<br>
+ * Base object of file persistence.
  *
  * @author code4crafter@gmail.com <br>
  * Date: 13-8-11 <br>
  * Time: 下午4:21 <br>
  * @since 0.2.0
  */
 public class FilePersistentBase {
...
...
webmagic-core/src/main/java/us/codecraft/webmagic/utils/ThreadUtils.java
...
...
@@ -6,10 +6,8 @@ import java.util.concurrent.ThreadPoolExecutor;
 import java.util.concurrent.TimeUnit;
 /**
- * 线程工具类。<br>
  * @author code4crafer@gmail.com
  * Date: 13-6-23
  * Time: 下午7:11
  * @since 0.1.0
  */
 public class ThreadUtils {
...
...
webmagic-core/src/main/java/us/codecraft/webmagic/utils/UrlUtils.java
...
...
@@ -6,20 +6,20 @@ import java.util.regex.Matcher;
 import java.util.regex.Pattern;
 /**
- * url及html处理工具类。<br>
+ * url and html utils.
  *
  * @author code4crafter@gmail.com <br>
  * Date: 13-4-21
  * Time: 下午1:52
  * @since 0.1.0
  */
 public class UrlUtils {
     private static Pattern relativePathPattern = Pattern.compile("^([\\.]+)/");
     /**
-     * 将url想对地址转化为绝对地址
-     * @param url url地址
-     * @param refer url地址来自哪个页面
-     * @return url绝对地址
+     * canonicalizeUrl
+     * @param url
+     * @param refer
+     * @return canonicalizeUrl
      */
     public static String canonicalizeUrl(String url, String refer) {
         if (StringUtils.isBlank(url) || StringUtils.isBlank(refer)) {
...
...
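The removed Javadoc described `canonicalizeUrl` as converting a relative URL into an absolute one, with `refer` being the page the URL was found on. The method body is elided in this diff, so the following is only a sketch of that idea using `java.net.URI.resolve`, not the webmagic implementation. The class and method names (`CanonicalizeUrlSketch`, `toAbsolute`) are hypothetical, and returning the input unchanged for blank or unparsable arguments is an assumption.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch of relative-to-absolute URL resolution; not webmagic code.
final class CanonicalizeUrlSketch {

    // Resolves url against refer, the page on which url was found.
    // Blank or unparsable input is returned unchanged (an assumption of this sketch).
    static String toAbsolute(String url, String refer) {
        if (url == null || url.trim().isEmpty() || refer == null || refer.trim().isEmpty()) {
            return url;
        }
        try {
            // URI.resolve handles "../", "./" and already-absolute URLs.
            return new URI(refer).resolve(url).toString();
        } catch (URISyntaxException | IllegalArgumentException e) {
            return url;
        }
    }

    public static void main(String[] args) {
        String refer = "http://example.com/docs/index.html";
        System.out.println(toAbsolute("../about.html", refer));
        System.out.println(toAbsolute("page2.html", refer));
        System.out.println(toAbsolute("http://other.example/x", refer));
    }
}
```

With the referring page `http://example.com/docs/index.html`, `../about.html` resolves to `http://example.com/about.html`, and a URL that is already absolute comes back as-is.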
webmagic-core/src/main/java/us/codecraft/webmagic/utils/package.html
 <html>
 <body>
-提供一些处理链接的静态工具类。
+Static utils of webmagic.
 </body>
 </html>