SQLite is an often overlooked flavor of SQL engines. Some have suggested it is the most prolific SQL engine in existence due to its highly flexible nature and ability to run on almost any platform with limited resources. Unlike SQL engines like MySQL, PostgreSQL, MSSQL, or Oracle, SQLite runs without a server. SQLite does not rely on a data directory or a constantly running daemon: a database is encapsulated in a single file.
SQLite and iMessage
iMessage is one of the most popular messaging platforms today, largely due to being built into iOS and Mac devices. Since its release it has evolved significantly and brought many new features to messaging, but at its core it is simply an instant messaging platform. iMessage uses SQLite in the background to store relational data about messages, conversions, and their participants.
As a long time Apple user, I have backed up and transferred my iPhone data since my first time using an iPhone, which was November 10, 2009. Because I have been digitally hoarding my text data for so long, my iMessage database is nearly 1GB in size.
Until a few years ago, the built-in search feature for iMessage was very limited and buggy. Although it has recently improved significantly, it is, like nearly any end-user tool, very limited in how you can query it. Those of us who frequently work with data that is trapped behind a limited front-end often wish we could get direct access to the SQL database. Fortunately, the iMessage database is not inaccessible and very easy to access.
Finding the iMessage SQL Database
On Your Mac
If you have iMessage enabled on your Mac as well as your iPhone you have 2 different databases from which to choose.
The database on your Mac is very easy to find as it is simply under
If you do not use your Mac for iMessage, or as in my case your Mac iMessages do not go as far back,
you can extract your iPhone's database by performing a backup to your Mac.
On Your iPhone
Follow these instructions to extract your iPhone's iMessage database:
- Opening Finder and selecting your iPhone under "Locations".
- Find the "Backups" section and select "Back up all of the data on your iPhone to this Mac", then press Back Up Now to immediately create a new backup. (This process may take a while.)
- Once it is complete, you will find the SQLite file under
/Users/[username]/Library/Application Support/MobileSync/Backup/[backup name]/3d/3d0d7e5fb2ce288813306e4d4636395e047a3d28.
- If you plan to open this database with Arctype, you'll want to copy and rename the file with a
.dbextension to indicate that it is a SQLite file.
Getting Started with SQLite
Unlike most SQL servers, you do not need a connection string, host, or username to connect to a SQLite database. All you need to do is point your SQL client to the database file. Arctype makes it simple and convenient to load in SQLite databases within the same workspace as your other connections.
- Under the Connections dropdown, select "Add new data source"
- Select "SQLite"
- Find the SQLite database file. The file must have a .sqlite3 or .db extension for Arctype to open it.
More detailed instructions can be found in the Arctype Docs.
With Command Line
From a UNIX terminal, type
One of my favorite parts about Arctype is how easy it is to analyze database schema. I'm a long time user of command line
tools and old-school editors, but sometimes having a more visually interactive tool is a lifesaver.
Let's dig into the schema Apple has created for iMessage.
Today we will focus on the
handle tables, as well as a few join tables to connect related records.
Note that I have created a custom view called
handle2 which adds a field
id2 that obfuscates the phone numbers
and email addresses of my personal contacts,
and you will see this view referenced in the examples in this article.
Digging Into iMessage
Let's write some queries and makes some observations that would not be possible without direct SQL access.
Pique Your Nostalgia with Old Messages
To get started, let's begin with a simple query to view your first 50 messages. If you have chat threads that go back years and years, there is no easy way to access early messages from your iPhone or Mac.
The interface on both platforms requires you to scroll back by about 25 messages at a time. This is prohibitively time-consuming and can result in a crash or reset if the user sends you a new message while you're scrolled back.
Fortunately, we have custom SQL for the save:
select h.id2 as sender_name, m.text as message_body from message m join handle2 h on h.rowid = m.handle_id order by m.date limit 50;
handle.id represents the readable identifier for the user. It will be either a phone number or an email address.
Rate Your Friendships with SQL
Let's use SQL to find out who our best friends are. Assuming you view quality of friendship as a function of the quantity of sent text messages, this should be very accurate!
First, let's divide the amount of messages that are
from_me by those that are not to produce a reply ratio. This query shows the top 10 people
we have been messaging by the total amount of messages, as well as the reply ratio.
Multiplication by 1.0 casts to the
REAL data type to avoid integer division, which would result in 1 or 0 instead of a decimal.
Here you can see the rules for integer division in SQLite.
select h.id2, count(1) as cnt, round( sum( case when m.is_from_me then 1 else 0 end ) * 1.0 / count(1) * 100.0, 2 ) from message m join handle2 h on h.rowid = m.handle_id group by h.id order by cnt desc limit 10;
One issue with this analysis is that fewer sent messages does not necessarily imply fewer words sent. Let's add some more fields to get a better insight.
Here we can see the total amount of characters sent and received, the average length of text message sent and received, the total ratio of characters sent and received, and the reply ratio. In my case, people from whom I tend to receive more messages also send longer messages than I.
select h.id, count(1) as cnt, sum(length(m.text)) as chars, sum(length(m.text)) filter (where m.is_from_me) as chars_sent, sum(length(m.text)) filter (where not m.is_from_me) as chars_received, round(avg(length(m.text)) filter (where m.is_from_me)) as avg_length_sent, round(avg(length(m.text)) filter (where not m.is_from_me)) as avg_length_received, round((sum(length(m.text)) filter (where m.is_from_me) * 1.0 / sum(length(m.text)) filter (where not m.is_from_me)), 2) as characters_sent_ratio, round((count(1) filter (where m.is_from_me)) * 1.0 / (count(1) filter (where not m.is_from_me)), 2) as reply_ratio from message m join handle h on h.rowid = m.handle_id group by h.id order by cnt desc limit 10;
This query makes heavy use of aggregate filters.
Aggregate filters allow you to use an aggregate function on only a part of the data by specifying a
WHERE clause to filter out unwanted records.
Examining iMessage Reactions
There are 2 newer iMessage features whose implementations, in the context of their schema design, are interesting to look into. Recently it was announced that Android phones will be able to show iMessage "reactions" properly. Historically, if you send an iMessage reaction to a non-Apple device, it will show up as a textual addition instead of an icon.
With the announcement of the new compatibility with Android devices, I was curious to learn how the current implmentation of the feature works.
a few records with and without a reaction and compared the results. I discovered that the
associated_message_type column was usually set to 0, but in messages
with a reaction it was an integer value between 2000-2005. I also noticed that
associated_message_guid was present.
Apple appears to be using 2000-2005 for its 5 reaction types, 3000-3005 for when a user removed a reaction, 3 for an Apple Pay request.
From this investigation it appears that reactions are sent as iMessages with the reaction's textual equivalent appended and a foreign key relation to the parent message. This allows the messages to seamlessly be sent and received by non-Apple devices.
If the message is sent over SMS, the metadata linking the reaction to
the message it references is simply lost. If the device is iMessage capable, Apple devices will ignore the
text part of the message, find the associated
message and add the proper reaction as a visual overlay.
Note that the
message table includes both a
ROWID and a
ROWID is a typical auto-increment integer
id field which
is useful for joining on within the local database. However, the auto-incremented primary key will not be the
same for the same message across devices. The
GUID is globally unique, generated by the author of the message,
and sent to all of its recipients. This allows foreign key reference across different databases, devices, and users.
For more information about the utility of GUIDs, check out this article.
Find Your Most Popular Group Chats
Group chats are stored in the
chat table. Join tables
chat_message_join are used to associate users and messages, respectively,
with group chats. Here's a query to find out most used group chats (chat with > 1 members) and the identities of their participants.
select group_concat(distinct h.id2) as participants, count(m. "ROWID") as message_count from chat c join chat_handle_join chj on chj.chat_id = c."ROWID" join handle2 h on h. "ROWID" = chj.handle_id join chat_message_join cmj on cmj.chat_id = c."ROWID" join message m on m. "ROWID" = cmj.message_id group by c."ROWID" having count(distinct h.id) > 1 order by message_count desc limit 10
group_concat function, which is familiar from MySQL by the same name and familiar to PostgreSQL users as
string_agg, is an aggregate function that
concatenates strings together. See more on how it can be used within SQLite here.
HAVING clause is similar to a
WHERE clause but operates on aggregate functions. If you've wanted to write a query
conditional on an aggregate but are not able to inside of your
HAVING is there for you.
SQLite is a powerful tool whose prolific reach across devices and numerous use cases make it one of the most impressive software projects around. If you're curious about what's behind the scenes, SQLite's source code is well known to be well-organized and fun (well, to some of us) to peek into.
iMessage is just one of many pieces of software that rely on SQLite and are used by million of end-users. Download Arctype for free today and start exploring the databases that power the tools you use daily!