public class Extractor extends Object
| Modifier and Type | Class and Description |
|---|---|
| static class | Extractor.Entity |
| Modifier and Type | Field and Description |
|---|---|
| protected boolean | extractURLWithoutProtocol |
| Constructor and Description |
|---|
| Extractor(): Create a new extractor. |
| Modifier and Type | Method and Description |
|---|---|
| List<String> | extractCashtags(String text): Extract $cashtag references from Tweet text. |
| List<Extractor.Entity> | extractCashtagsWithIndices(String text): Extract $cashtag references from Tweet text. |
| List<Extractor.Entity> | extractEntitiesWithIndices(String text): Extract URLs, @mentions, lists and #hashtag from a given text/tweet. |
| List<String> | extractHashtags(String text): Extract #hashtag references from Tweet text. |
| List<Extractor.Entity> | extractHashtagsWithIndices(String text): Extract #hashtag references from Tweet text. |
| List<String> | extractMentionedScreennames(String text): Extract @username references from Tweet text. |
| List<Extractor.Entity> | extractMentionedScreennamesWithIndices(String text): Extract @username references from Tweet text. |
| List<Extractor.Entity> | extractMentionsOrListsWithIndices(String text) |
| String | extractReplyScreenname(String text): Extract a @username reference from the beginning of Tweet text. |
| List<String> | extractURLs(String text): Extract URL references from Tweet text. |
| List<Extractor.Entity> | extractURLsWithIndices(String text): Extract URL references from Tweet text. |
| boolean | isExtractURLWithoutProtocol() |
| void | modifyIndicesFromUnicodeToUTF16(String text, List<Extractor.Entity> entities) |
| void | modifyIndicesFromUTF16ToToUnicode(String text, List<Extractor.Entity> entities) |
| void | setExtractURLWithoutProtocol(boolean extractURLWithoutProtocol) |
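
Most extraction methods above come in pairs: one returns plain strings, the other returns entities that also carry positions in the text. The relationship between the two can be sketched with the standard library alone. This is a minimal illustration and not the library's implementation: the class `HashtagSketch`, the type `Match` (a stand-in for `Extractor.Entity`), and the deliberately simplified pattern `#([A-Za-z0-9_]+)` are all assumptions made for the example, and the real extraction rules are likely stricter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, NOT the library's code: shows how a "WithIndices"
// variant and a plain-string variant of the same extraction can relate.
class HashtagSketch {

    // Hypothetical stand-in for Extractor.Entity: a value plus its position.
    static final class Match {
        final int start;    // UTF-16 index of the '#' character
        final int end;      // UTF-16 index just past the last tag character
        final String value; // tag text without the leading '#'

        Match(int start, int end, String value) {
            this.start = start;
            this.end = end;
            this.value = value;
        }
    }

    // Assumption: a simplified pattern, far looser than real Tweet parsing rules.
    private static final Pattern SIMPLE_HASHTAG = Pattern.compile("#([A-Za-z0-9_]+)");

    // Returns every match together with its start and end offsets.
    static List<Match> extractWithIndices(String text) {
        List<Match> matches = new ArrayList<>();
        if (text == null || text.isEmpty()) {
            return matches;
        }
        Matcher m = SIMPLE_HASHTAG.matcher(text);
        while (m.find()) {
            matches.add(new Match(m.start(), m.end(), m.group(1)));
        }
        return matches;
    }

    // Returns only the tag text by discarding the offsets.
    static List<String> extract(String text) {
        List<String> values = new ArrayList<>();
        for (Match m : extractWithIndices(text)) {
            values.add(m.value);
        }
        return values;
    }

    public static void main(String[] args) {
        for (Match m : extractWithIndices("Learning #java and #regex today")) {
            System.out.println(m.value + " [" + m.start + ", " + m.end + ")");
        }
    }
}
```

In this sketch the string-only method is a thin wrapper over the indexed one, which keeps the matching logic in a single place. Whether `Extractor` is organized the same way is not stated on this page.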
public List<Extractor.Entity> extractEntitiesWithIndices(String text)
text - text of the Tweet
public List<String> extractMentionedScreennames(String text)
text - text of the Tweet from which to extract usernames
public List<Extractor.Entity> extractMentionedScreennamesWithIndices(String text)
text - text of the Tweet from which to extract usernames
public List<Extractor.Entity> extractMentionsOrListsWithIndices(String text)
public String extractReplyScreenname(String text)
text - text of the Tweet from which to extract the replied-to username
public List<String> extractURLs(String text)
text - text of the Tweet from which to extract URLs
public List<Extractor.Entity> extractURLsWithIndices(String text)
text - text of the Tweet from which to extract URLs
public List<String> extractHashtags(String text)
text - text of the Tweet from which to extract hashtags
public List<Extractor.Entity> extractHashtagsWithIndices(String text)
text - text of the Tweet from which to extract hashtags
public List<String> extractCashtags(String text)
text - text of the Tweet from which to extract cashtags
public List<Extractor.Entity> extractCashtagsWithIndices(String text)
text - text of the Tweet from which to extract cashtags
public void setExtractURLWithoutProtocol(boolean extractURLWithoutProtocol)
public boolean isExtractURLWithoutProtocol()
public void modifyIndicesFromUnicodeToUTF16(String text, List<Extractor.Entity> entities)
public void modifyIndicesFromUTF16ToToUnicode(String text, List<Extractor.Entity> entities)
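
The two `modifyIndices...` methods carry no description on this page. Judging by their names, they presumably convert entity offsets between Unicode code-point positions and Java's UTF-16 `char` positions, which differ whenever the text contains supplementary characters such as emoji. The sketch below shows that kind of conversion using only `String.offsetByCodePoints` and `String.codePointCount`. The class `IndexConversionSketch` and the type `Span` (a stand-in for `Extractor.Entity`) are hypothetical, and the in-place update simply mirrors the `void` signatures above; it is an assumption about the behavior, not the library's code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, NOT the library's code: converts offsets between
// Unicode code-point positions and UTF-16 char positions.
class IndexConversionSketch {

    // Hypothetical stand-in for Extractor.Entity: a mutable start/end pair.
    static final class Span {
        int start;
        int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Rewrites each span in place from code-point offsets to UTF-16 offsets.
    static void codePointsToUtf16(String text, List<Span> spans) {
        for (Span s : spans) {
            int start = text.offsetByCodePoints(0, s.start);
            int end = text.offsetByCodePoints(0, s.end);
            s.start = start;
            s.end = end;
        }
    }

    // Rewrites each span in place from UTF-16 offsets to code-point offsets.
    static void utf16ToCodePoints(String text, List<Span> spans) {
        for (Span s : spans) {
            int start = text.codePointCount(0, s.start);
            int end = text.codePointCount(0, s.end);
            s.start = start;
            s.end = end;
        }
    }

    public static void main(String[] args) {
        // One emoji (a single code point stored as two chars), a space, then "#java".
        String text = "\uD83D\uDE00 #java";
        List<Span> spans = new ArrayList<>();
        spans.add(new Span(2, 7)); // "#java" measured in code points

        codePointsToUtf16(text, spans);
        Span s = spans.get(0);
        // After conversion the offsets are usable with String.substring.
        System.out.println(text.substring(s.start, s.end));
    }
}
```

Offsets expressed in UTF-16 units can be passed straight to `String.substring`, while code-point offsets match how many other platforms count characters, so a conversion step is needed when positions cross that boundary.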
Copyright © 2015. All rights reserved.