Dockerized data pipeline
An implementation of a dockerized data pipeline that posts randomly selected tweets about politics, together with their sentiment scores, to a Slack channel.
The Docker Compose pipeline includes five containers, one per step:
- collects tweets with the Twitter API and tweepy
- stores the tweets in a MongoDB database
- runs an ETL job that
  - extracts the tweets from MongoDB
  - gets the sentiment of each tweet text with VADERSentiment
- loads the tweets and their sentiment scores into a Postgres database
- runs a Slackbot that posts a randomly selected, anonymized tweet from the Postgres database to a Slack channel
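
The transform and posting steps listed above can be sketched in plain Python. This is a minimal illustration, not the repository's code: the `text` field name, the `@user` masking rule, the message format, and every function name here are assumptions, and `toy_score` is only a stand-in so the sketch runs without third-party packages. In the real pipeline the scorer would presumably be VADER's compound score, e.g. `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]` from the `vaderSentiment` package.

```python
import random
import re
from typing import Callable, Iterable, List, Tuple

# Assumption: "anonymized" is taken to mean masking @handles;
# the README does not say how anonymization is actually done.
MENTION = re.compile(r"@\w+")


def anonymize(text: str) -> str:
    """Replace every @handle with a neutral placeholder."""
    return MENTION.sub("@user", text)


def toy_score(text: str) -> float:
    """Placeholder scorer in [-1, 1]; swap in VADER's compound score."""
    positive = {"good", "great", "love"}
    negative = {"bad", "awful", "hate"}
    words = re.findall(r"[a-z']+", text.lower())
    hits = sum(w in positive for w in words) - sum(w in negative for w in words)
    return max(-1.0, min(1.0, hits / 3))


def transform(
    docs: Iterable[dict], score: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Turn MongoDB-style tweet documents into (text, sentiment) rows."""
    rows = []
    for doc in docs:
        text = anonymize(doc["text"])  # "text" is a hypothetical field name
        rows.append((text, score(text)))
    return rows


def pick_message(rows: List[Tuple[str, float]], rng=random) -> str:
    """Choose one row at random and format it as a Slack message body."""
    text, sentiment = rng.choice(rows)
    return f"{text}\nSentiment score: {sentiment:+.2f}"


if __name__ == "__main__":
    docs = [
        {"text": "@alice I love this policy"},
        {"text": "awful debate, @bob"},
    ]
    rows = transform(docs, toy_score)
    print(pick_message(rows))
```

Passing the scorer in as a function keeps the transform step testable without the database containers or the sentiment library; the rows it returns are what the load step would insert into Postgres.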
See more here