Integrations

Offline Twitter data is stored in a simple and transparent way. The twitter.db file is meant to be looked at. The cool part of data ownership is being able to do what you want with it.
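
Because the file is an ordinary SQLite database, one quick way to get oriented is to list its tables and columns. The following is a minimal sketch rather than anything shipped with Offline Twitter: `list_schema` is a hypothetical helper name, and the only assumption is that you pass it the path to your own twitter.db. It opens the file read-only so the archive cannot be modified by accident.

```python
import sqlite3

def list_schema(db_path):
    """Return {table_name: [column_name, ...]} for a SQLite file (hypothetical helper)."""
    # uri=True is required for the read-only "file:...?mode=ro" form
    db = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        tables = [row[0] for row in db.execute(
            "select name from sqlite_master where type = 'table' order by name"
        )]
        schema = {}
        for table in tables:
            # Each table_info row is (cid, name, type, notnull, dflt_value, pk)
            columns = db.execute(f'pragma table_info("{table}")').fetchall()
            schema[table] = [col[1] for col in columns]
        return schema
    finally:
        db.close()

# Example use, assuming twitter.db is in the current directory:
#   for table, columns in list_schema("twitter.db").items():
#       print(table, columns)
```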

Example: Tweet-posting histogram

Here’s a Python script that uses Matplotlib’s pyplot to plot a histogram of the times of day at which a user tweets. It takes the user’s handle from the first command-line argument, loads all of that user’s tweets since January 1st, 2024, groups them into 30-minute buckets, and plots them as a bar chart.

import sys
from datetime import datetime
from matplotlib import pyplot
import sqlite3

# Composability: Offline Twitter uses SQLite, which is very composable!
db = sqlite3.connect("file:twitter.db?mode=ro", uri=True) # uri=True is needed for the read-only "file:" form

if len(sys.argv) > 1:
	user = sys.argv[1]
else:
	user = "elonmusk"

c = db.cursor()
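# Pass the handle as a bound parameter rather than formatting it into the SQL
c.execute('''
	select posted_at from tweets
	 where user_id = (select id from users where handle like ?)
	   and posted_at > strftime('%s', '2024-01-01') * 1000 -- posted_at is in milliseconds
''', (user,))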

buckets = [0] * 24 * 2 # Every 30 mins

# Create histogram data
for result in c.fetchall():
	timestamp = result[0] / 1000 # timestamps are in milliseconds
	if timestamp < 0: continue # Remove any missing data
	dt = datetime.fromtimestamp(timestamp)
	bucket = 2 * dt.hour + (dt.minute >= 30) # minutes 30-59 go in the second half-hour
	buckets[bucket] += 1

# Format buckets as 0 => "12:00 AM", 1 => "12:30 AM", etc
bucket_labels = [
	datetime(2024, 1, 1, b//2, 30 * (b % 2)).strftime("%I:%M %p").lstrip("0")
	for b in range(len(buckets))
]

# Plot a bar chart
pyplot.bar(bucket_labels, buckets)
pyplot.title(f"{user} tweeting times")
pyplot.ylabel("Tweets")
pyplot.xlabel("Time of day")
pyplot.xticks(rotation=60)
pyplot.show()

Here is the histogram for @yacineMTB:

python histogram.py yacineMTB

@yacineMTB’s tweeting times

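The script plots times in the local time zone of the machine it runs on, and I am currently on the Pacific coast (PST). @yacineMTB starts tweeting around 5am-6am and drops off around 9pm Pacific time, which is 8am-9am to midnight Eastern time, so I infer that he probably lives somewhere in the Eastern time zone.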
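
To check an inference like this, the bucketing step could take an explicit time zone instead of relying on the machine’s local one. The sketch below is one way to do that: `bucket_for` is a hypothetical helper, not part of the script above, and it assumes, as the script does, that `posted_at` values are in milliseconds. The `zoneinfo` module needs Python 3.9 or later.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def bucket_for(posted_at_ms, tz_name):
    """Map a millisecond timestamp to a 30-minute bucket (0..47) in the given zone."""
    dt = datetime.fromtimestamp(posted_at_ms / 1000, tz=ZoneInfo(tz_name))
    return 2 * dt.hour + (dt.minute >= 30)

# A made-up tweet sent at 5:45 AM Pacific time
sent = datetime(2024, 1, 15, 5, 45, tzinfo=ZoneInfo("America/Los_Angeles"))
posted_at_ms = int(sent.timestamp() * 1000)

pacific = bucket_for(posted_at_ms, "America/Los_Angeles")  # the 5:30 AM bucket
eastern = bucket_for(posted_at_ms, "America/New_York")     # the 8:30 AM bucket
```

Swapping this into the histogram loop and passing a different zone name redraws the same tweets on that zone’s clock, which makes it easier to see which zone gives the most plausible waking hours.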