What is Crypto Sentiment Analysis? (Opinion Mining)
Crypto sentiment analysis involves evaluating public perceptions of cryptocurrencies through monitoring social media and news sentiment. Also termed 'opinion mining', it helps traders gauge market sentiment and understand how other investors are feeling about a particular coin or category. In the case of crypto, Twitter/X is the optimal platform for gauging social sentiment.
While the process typically involves machine learning and artificial intelligence to mine and process the data, we’ll be demonstrating how to conduct a crypto sentiment analysis using the Twitter/X API and the CoinGecko API.
How to Develop a Sentiment-Based Crypto Trading Strategy
To build a market sentiment-based crypto trading strategy, follow the 5 steps outlined below:
- Generate Twitter/X and CoinGecko API keys.
- Download Python and pip, then open an IDE/code editor (like a Jupyter notebook).
- Install Python packages that process market sentiment data.
- Extract sentiment, synthesize the data set and generate polarity scores (sentiment scores).
- Develop crypto trading signals based on sentiment scores.
Essentially, by synthesizing crypto market sentiment data, you can use the resulting insights to inform potential entry and exit positions. Let's dive in!
Generate Twitter and CoinGecko API Keys
Twitter/X API
The crypto community has nestled into Twitter/X, also known as 'crypto Twitter', where news often breaks faster than on traditional media outlets. Given the volatile nature of crypto and its ties to shifts in perception, monitoring market sentiment on Twitter/X can offer traders a valuable edge. On this basis, we will be developing a crypto trading strategy based on Twitter/X sentiment analysis, using the Twitter/X API.
To gain access to the Twitter/X API, you will first need to create a developer account. You will then receive an API key, an API secret key, a Bearer token, an access token, and an access token secret.
As of February 9, 2023, Twitter/X launched their Basic and Pro plans. In this guide, we’ll be leveraging the 'Search Tweets' endpoint found under their Basic plan. The Basic plan also allows you to fetch data for tweet counts and retweets, which can come in handy when analyzing sentiment.
CoinGecko API
Every CoinGecko user can generate 1 API key by signing up for the Demo Plan, and will enjoy a stable rate limit of 30 calls per minute and a monthly cap of 10,000 calls. We’ll be using the Open High Low Close (OHLC) /coins/{id}/ohlc endpoint for this guide, retrieving coins' OHLC data where data granularity is automatic for Demo API users:
- 30-minute candles when requesting the last 1-2 days
- 4-hour candles when requesting the last 3-30 days
- 4-day candles when requesting the last 31 days or more
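The granularity tiers above can be restated as a small lookup. This is only a sketch: ohlc_granularity is a hypothetical helper name, and it assumes the last tier starts at day 31.

```python
def ohlc_granularity(days: int) -> str:
    """Return the candle size described above for a given number of days."""
    if days < 1:
        raise ValueError("days must be at least 1")
    if days <= 2:
        return "30 minutes"
    if days <= 30:
        return "4 hours"
    # Assumption: everything from day 31 onwards falls in the last tier
    return "4 days"


for days in (1, 7, 30, 90):
    print(days, "->", ohlc_granularity(days))
```

A 30-day request, as used later in this guide, would therefore be expected to come back as 4-hour candles.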
Download Python and Pip
Before getting started, ensure that you have Python and pip installed. Python can be downloaded from python.org, and you may follow these steps to install pip:
- On macOS, press command + spacebar and type in 'Terminal'
- Check your Python version by typing
python3 --version
- Download pip by typing in:
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
- Then type:
python3 get-pip.py
Install Python Packages to Process Sentiment Data
We recommend using the following 3 Python packages to effectively process the market sentiment data:
- Regex Python Package: To clean up the tweets we search for, we will need Python's regular expression (regex) module, re. It ships with Python's standard library, so no separate installation is needed. This will help strip the tweets of the stray characters attached to each call.
- Tweepy Package: The open-source Tweepy package is the most convenient way to access the Twitter/X API through Python. It includes an assortment of methods that mirror Twitter/X’s API endpoints, along with a variety of implementation aids and details.
- NLTK Vader Lexicon Package: Text analysis is an integral part of a number of industries, and one of its most important subsections is sentiment analysis. The Natural Language Toolkit (NLTK) is a widely used Python library for natural language processing (NLP). Advances in NLP have made it possible to perform sentiment analysis on large volumes of data, which supports our construction of a sentiment-based strategy. There are several ways to perform sentiment analysis, such as lexicon-based analysis, machine learning, or pre-trained transformer-based deep learning. For this strategy we will be using a lexicon-based analysis.
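The NLTK Vader Sentiment Analyzer uses a predefined lexicon and a set of rules to determine the sentiment of a text. For each text it returns negative, neutral and positive scores, plus a compound score between -1 (most negative) and +1 (most positive). Hence, we can specify the degree of positive or negative polarity required in our compound score before placing a trade.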
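To make the idea of lexicon-based scoring concrete, here is a toy sketch. It is not VADER: the word list and its weights are invented for illustration, and the squashing step is only similar in spirit to how a compound score is kept within -1 and +1.

```python
import math

# Hypothetical mini-lexicon for illustration only; VADER ships its own, far larger one
TOY_LEXICON = {
    "bullish": 2.0,
    "moon": 1.5,
    "gain": 1.0,
    "dump": -2.0,
    "scam": -2.5,
    "crash": -2.0,
}


def toy_compound(text: str, alpha: float = 15.0) -> float:
    """Sum the word weights, then squash the total into the open range (-1, 1)."""
    total = sum(TOY_LEXICON.get(word, 0.0) for word in text.lower().split())
    return total / math.sqrt(total * total + alpha)


for text in ("eth looks bullish", "total scam and crash", "hello world"):
    print(text, "->", round(toy_compound(text), 3))
```

Words missing from the lexicon contribute nothing, so a text with no known words scores exactly 0. A real analyzer also handles negation, punctuation and capitalization, which this sketch ignores.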
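Use pip to install the third-party packages, still in the terminal (pandas and pycoingecko are used later in this guide):
pip install tweepy
pip install nltk
pip install pandas pycoingecko
Then download the Vader lexicon. This is a Python call rather than a shell command, so run it through the Python interpreter:
python3 -c "import nltk; nltk.download('vader_lexicon')"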
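Before moving on, it can help to confirm that the packages are importable from the interpreter you plan to use. The helper below is a hypothetical convenience, not part of any of these libraries, and the package list is simply the one this guide relies on.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["tweepy", "nltk", "pandas", "pycoingecko"]
print("Missing:", missing_packages(required) or "none")
```

If anything is reported as missing, re-run the matching pip install command with the same Python interpreter.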
Next, open Jupyter Notebook and import the required packages:
import pandas as pd
import tweepy
import re
from nltk.sentiment.vader import SentimentIntensityAnalyzer
Conduct a Crypto Sentiment Analysis using Python
Now that we have installed Python and pip and imported the required packages, it’s time to collect Twitter/X sentiment data.
Add your newly acquired Twitter/X API credentials in the Jupyter notebook:
API_key = '********************'
API_secret = '***********************'
Access_token = '*********************'
Access_secret = '***********************'
Simply replace the stars with your personal developer details. We will then use Tweepy to authenticate, using our credentials, and access the API:
# From Tweepy docs
auth = tweepy.OAuthHandler(API_key, API_secret)
auth.set_access_token(Access_token, Access_secret)
api = tweepy.API(auth)
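As an alternative to pasting credentials into the notebook, they can be read from environment variables so they never end up in a saved file. This is a minimal sketch; the variable names and the load_credentials helper are assumptions, not something Tweepy or Twitter/X prescribes.

```python
import os

# Hypothetical variable names; use whatever naming your environment follows
CREDENTIAL_VARS = (
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_SECRET",
)


def load_credentials(env=None):
    """Read the four credentials from env (default: os.environ), failing loudly if any is unset."""
    env = os.environ if env is None else env
    missing = [name for name in CREDENTIAL_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in CREDENTIAL_VARS)
```

The result can be unpacked straight into the variables used above: API_key, API_secret, Access_token, Access_secret = load_credentials().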
We can then define our parameters and create a function using regex to clean up our pulled tweets. We will pull tweets containing $ETHUSD as an example:
# Define variables
count = 10
coinid = '$ETHUSD'
# Clean up collected tweets with regex: strip @mentions, URLs and
# any character that is not a letter, digit, space or tab
def clean_tweet(tweet):
    return ' '.join(re.sub(r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)", " ", tweet).split())
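To see what the regex cleaner does, here is a self-contained run on a made-up tweet. The function is named clean_tweet here to match the call made inside get_tweets, and the sample text is invented for illustration.

```python
import re

# Same pattern as above: @mentions, non-alphanumeric characters, and URLs
TWEET_NOISE = re.compile(r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)")


def clean_tweet(tweet: str) -> str:
    """Replace noise with spaces, then collapse runs of whitespace."""
    return " ".join(TWEET_NOISE.sub(" ", tweet).split())


sample = "@trader $ETHUSD is pumping! https://t.co/abc123 #bullish"
print(clean_tweet(sample))  # ETHUSD is pumping bullish
```

Note that the pattern also removes the $ and # symbols, so cashtags and hashtags survive only as plain words.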
To apply the function to actual tweets, let's next create a 'get_tweets' function. Starting with an empty set, we clean each of the 10 tweets returned and add it only if it is not already included, leaving a lean collection of unique recent tweets involving '$ETHUSD':
def get_tweets(coinid, count):
    tweets = set()
    # Tweepy v4 renamed API.search to API.search_tweets
    collected_tweets = api.search_tweets(q=coinid, count=count)
    for tweet in collected_tweets:
        cleaned_tweet = clean_tweet(tweet.text)
        if cleaned_tweet not in tweets:
            tweets.add(cleaned_tweet)
    return tweets
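If your access tier only exposes the v2 recent search endpoint, the same collection step can be sketched against a v2-style client. This is an assumption-laden sketch: get_recent_tweets is a hypothetical name, and the client argument is assumed to behave like tweepy.Client, whose search_recent_tweets method returns a response with a data attribute (a list of tweets, or None when nothing matches).

```python
def get_recent_tweets(client, query, count=10, clean=lambda text: text):
    """Collect unique tweet texts from a v2-style client.

    client: assumed to expose search_recent_tweets(query=..., max_results=...)
    clean:  optional text cleaner applied to each tweet before de-duplication
    """
    response = client.search_recent_tweets(query=query, max_results=count)
    texts = set()
    # response.data is None when the search returns no tweets
    for tweet in response.data or []:
        texts.add(clean(tweet.text))
    return texts
```

With Tweepy installed this could be called as get_recent_tweets(tweepy.Client(bearer_token=...), coinid, count, clean=clean_tweet). The v2 recent search accepts max_results between 10 and 100, and the query operators available may depend on your plan.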
After we have fetched data on the $ETHUSD tweets, let’s run them through a sentiment analyzer, to determine their polarity (i.e. how positive/negative/neutral they are).
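def polarity(tweets):
    # Create the analyzer once rather than once per tweet
    analyzer = SentimentIntensityAnalyzer()
    scores = []
    for tweet in tweets:
        score = analyzer.polarity_scores(tweet)
        # Keep the tweet text alongside its neg/neu/pos/compound scores
        score['tweet'] = tweet
        scores.append(score)
    return scores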
With that, we have pulled tweets from the Twitter/X API, cleaned the data, and determined their sentiment score through the NLTK Vader Sentiment Intensity Analyzer.
Develop Crypto Trading Signals Based on Sentiment Scores
Let's now pull in crypto price data using the CoinGecko API and combine it with our previous sentiment analysis to create trades. Use the following code to import Open High Low Close (OHLC) data for Ethereum (ETH) over the last 30 days:
from pycoingecko import CoinGeckoAPI
cg = CoinGeckoAPI()
ohlc = cg.get_coin_ohlc_by_id(
id="ethereum", vs_currency="usd", days="30"
)
df = pd.DataFrame(ohlc)
df.columns = ["date", "open", "high", "low", "close"]
df["date"] = pd.to_datetime(df["date"], unit="ms")
df.set_index('date', inplace=True)
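For reference, each row of the OHLC response is laid out as a millisecond timestamp followed by open, high, low and close, which is what the column names above rely on. The standard-library sketch below parses rows in that layout; parse_ohlc is a hypothetical helper and the sample values are made up.

```python
from datetime import datetime, timezone

PRICE_COLUMNS = ("open", "high", "low", "close")


def parse_ohlc(rows):
    """Turn [timestamp_ms, open, high, low, close] rows into dicts with a UTC datetime."""
    parsed = []
    for timestamp_ms, *prices in rows:
        candle = dict(zip(PRICE_COLUMNS, prices))
        # Timestamps are in milliseconds, hence the division by 1000
        candle["date"] = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
        parsed.append(candle)
    return parsed


# Made-up values in the same layout as the OHLC response
sample = [[1700000000000, 2050.1, 2061.4, 2044.0, 2058.3]]
print(parse_ohlc(sample)[0])
```

This mirrors what the pandas code does with unit="ms", just without the DataFrame.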
With this data and our polarity scores we can construct a simple signal to enter long and short positions with ETH prices. The compound score produced from our polarity function combines negative, positive, and neutral scores, giving us a single consolidated value. Rules can be varied, and for a simple trading signal we can follow this logic:
- If the compound score is greater than 0.06, buy 1 ETH at the spot price
- If the compound score is at or below 0.04, sell our 1 ETH position
- Otherwise, no action
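The three rules can be isolated into a small decision function, which makes the thresholds easy to test and tune. This is a sketch under stated assumptions: mean_compound and decide are hypothetical names, the sample scores are invented, and the sell comparison uses "at or below" to match the script in this guide.

```python
from statistics import mean

BUY_THRESHOLD = 0.06
SELL_THRESHOLD = 0.04


def mean_compound(scores):
    """Average the 'compound' value across a list of polarity score dicts."""
    return mean(score["compound"] for score in scores)


def decide(compound, holding):
    """Return 'buy', 'sell' or 'hold' for a compound score and the current position."""
    if compound > BUY_THRESHOLD and not holding:
        return "buy"
    if compound <= SELL_THRESHOLD and holding:
        return "sell"
    return "hold"


# Invented scores standing in for the output of a polarity step
sample_scores = [{"compound": 0.5}, {"compound": -0.1}]
print(decide(mean_compound(sample_scores), holding=False))
```

Scores between the two thresholds leave the position unchanged, which gives the signal a small buffer against flip-flopping around a single cutoff.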
This logic translates to the following Python script:
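import numpy as np  # needed for np.nan

def generate_signal(price_df, coinid, count):
    # price_df is the OHLC DataFrame built from the CoinGecko data
    long_price = []
    short_price = []
    position = 0
    signal_list = []
    tweets = get_tweets(coinid, count)
    scores = polarity(tweets)
    # Average the compound score across all collected tweets
    scores_df = pd.DataFrame.from_records(scores)
    compound_score = scores_df['compound'].mean()
    # The score is a single snapshot, so it is compared against every candle.
    # Stop one row early so the next candle's open (i + 1) always exists.
    for i in range(len(price_df) - 1):
        next_open = price_df['open'].iloc[i + 1]
        if compound_score > 0.06 and position != 1:
            long_price.append(next_open)
            short_price.append(np.nan)
            position = 1
        elif compound_score <= 0.04 and position == 1:
            long_price.append(np.nan)
            short_price.append(next_open)
            position = -1
        else:
            long_price.append(np.nan)
            short_price.append(np.nan)
            # Reset to flat on the candle after an exit
            if position == -1:
                position = 0
        signal_list.append(position)
    return signal_list, long_price, short_price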
With that, we have created a signal using our pulled CoinGecko $ETHUSD price data and polarity scores.
💡 Pro-tip: Be sure to check out Shashank Vemuri’s Medium blog, covering articles concerning the Tweepy and NLTK libraries.
For Professional Traders: Unlock Daily OHLC Data
Incorporating sentiment analysis into crypto trading strategies offers valuable market insights for informed decision-making. Using the Tweepy and NLTK Vader Sentiment Analyzer packages, traders can extract sentiment data from platforms like crypto Twitter/X, generating polarity or sentiment scores. Trading signals can then be developed based on sentiment scores and coin OHLC data from CoinGecko API. However, it's important to note that results may vary based on individual parameters and market circumstances.
Professional traders looking to maximize the OHLC endpoint with a daily candle interval data granularity may consider subscribing to our Analyst API plan. This is exclusively available to paid subscribers, along with endpoints like:
- /coins/list/new – get the latest 200 coins as listed on CoinGecko
- /coins/top_gainers_losers – get the top 30 coins with largest price gain and loss, within a specific time duration
- /exchanges/{id}/volume_chart/range – get the historical volume data of an exchange for a specified date range
Disclaimer: This guide is for illustrative and informational purposes only, and does not constitute professional or financial advice. Always do your own research and be careful when putting your money into any crypto or financial asset.