Twitter Scraper
The Twitter Scraper component fetches tweets from Twitter that match a specified hashtag or keyword. It supports both historical tweets and real-time streams, making it useful for:
Social media sentiment analysis
Brand monitoring
Event tracking and trend analysis
Configuration Sections
All configurations are classified into the following sections:
Basic Information
Meta Information
Resource Configuration
Connection Validation
Basic Information Tab
The Basic Information tab defines general execution settings.
Invocation Type (required): Select execution mode: Batch or Real-Time.
Deployment Type (required): Displays the deployment type of the component (pre-selected).
Container Image Version (required): Displays the Docker image version used (pre-selected).
Failover Event (optional): Select a failover event to handle retries or errors.
Batch Size (required): Maximum number of records processed in one execution cycle (minimum: 10).
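The Invocation Type and Batch Size rules can be illustrated with a minimal Python sketch. It is not the component's actual implementation: the `BasicInfo` class, its field names, and the `batches` helper are hypothetical names chosen for illustration. Only the allowed invocation types and the minimum batch size of 10 come from the field descriptions.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional

MIN_BATCH_SIZE = 10  # minimum stated in the Batch Size description
INVOCATION_TYPES = ("Batch", "Real-Time")  # options listed for Invocation Type


@dataclass
class BasicInfo:
    """Hypothetical container for the Basic Information fields."""
    invocation_type: str
    batch_size: int
    failover_event: Optional[str] = None  # optional field, may stay unset

    def validate(self) -> None:
        """Raise ValueError if a field breaks the documented constraints."""
        if self.invocation_type not in INVOCATION_TYPES:
            raise ValueError(f"invocation_type must be one of {INVOCATION_TYPES}")
        if self.batch_size < MIN_BATCH_SIZE:
            raise ValueError(f"batch_size must be at least {MIN_BATCH_SIZE}")


def batches(records: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield lists of at most batch_size records, one list per cycle."""
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch
```

With a batch size of 10, a run of 25 records would be split into cycles of 10, 10, and 5 records under this assumed interpretation of "maximum number of records processed in one execution cycle".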
Meta Information Tab
The Meta Information tab defines authentication and query parameters for fetching tweets.
Consumer API Key (required): API key provided by Twitter Developer Portal.
Consumer API Secret Key (required): Secret key (acts as password) associated with the Consumer API Key.
Filter Text (required): Hashtag or keyword to filter tweets (e.g., #AI, #BigData).
Twitter Data Type (required): Select one of the following options: History (fetch past tweets) or Real-Time (fetch live tweets as they are posted).
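As a rough illustration of the Meta Information fields, the following Python sketch checks that all four required values are present and that the data type is one of the two documented options. The `MetaInfo` class and its attribute names are assumptions made for this example and do not reflect how the component stores its settings.

```python
from dataclasses import dataclass
from typing import List

DATA_TYPES = ("History", "Real-Time")  # options listed for Twitter Data Type


@dataclass
class MetaInfo:
    """Hypothetical container for the Meta Information fields."""
    consumer_api_key: str
    consumer_api_secret_key: str
    filter_text: str
    twitter_data_type: str

    def problems(self) -> List[str]:
        """Return a list of human-readable problems; empty means valid."""
        found: List[str] = []
        required = {
            "Consumer API Key": self.consumer_api_key,
            "Consumer API Secret Key": self.consumer_api_secret_key,
            "Filter Text": self.filter_text,
        }
        for label, value in required.items():
            if not value.strip():
                found.append(f"{label} is required")
        if self.twitter_data_type not in DATA_TYPES:
            found.append(f"Twitter Data Type must be one of {DATA_TYPES}")
        return found
```

Returning a list of problems rather than raising on the first one is a design choice for this sketch: it lets every missing field be reported at once.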
Saving the Configuration
Enter API credentials and filter details in the Meta Information tab.
Click the Save Component icon (storage icon).
A success message confirms that the component properties have been saved.
Activate the pipeline to start fetching tweets.
Example Workflow
Configure the Twitter Scraper with:
Consumer API Key: xxxxx
Consumer API Secret Key: yyyyy
Filter Text: #ClimateChange
Twitter Data Type: Real-Time
Start the pipeline.
Tweets containing #ClimateChange are ingested into the pipeline and passed to downstream components for sentiment analysis and dashboard visualization.
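To make the filtering step of this workflow concrete, here is a small Python sketch that selects records whose text contains the Filter Text. It runs on in-memory sample records only and makes no Twitter API calls. The function names, the `text` key, and the case-insensitive whole-token matching rule are assumptions for illustration; the component's actual matching behavior is not specified here.

```python
import re
from typing import Dict, Iterable, List


def contains_filter_text(text: str, filter_text: str) -> bool:
    """Return True if filter_text appears in text as a whole token.

    Matching is case-insensitive and rejects longer tokens, so
    "#ClimateChange" does not match "#ClimateChangeNow" (assumed rule).
    """
    pattern = r"(?<!\w)" + re.escape(filter_text) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def select_matching(tweets: Iterable[Dict], filter_text: str) -> List[Dict]:
    """Keep only the records whose "text" field contains filter_text."""
    return [t for t in tweets if contains_filter_text(t.get("text", ""), filter_text)]


# Made-up sample records, used only to exercise the helper.
sample = [
    {"id": 1, "text": "Act now on #climatechange!"},
    {"id": 2, "text": "#ClimateChangeNow rally this weekend"},
    {"id": 3, "text": "Nothing relevant here"},
]
matching = select_matching(sample, "#ClimateChange")
```

In this sample only the first record is selected; a downstream sentiment or dashboard component would then receive `matching` rather than the full list.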