Detailed analysis of iPhone location data

After reading Pete Warden’s iPhone Tracker FAQ and the article by Alex Levinson, I wanted to extract the precise location information from the consolidated.db file from my iPhone backups to see exactly how accurate the location data were, as well as to learn how frequently the location was being sampled and saved. Unfortunately, I had a rather busy week and wasn’t able to get around to it right away.

Apple officially responded on Wednesday. While the statement itself is a masterful example of doublespeak, I was intrigued by the following paragraph:

The iPhone is not logging your location. Rather, it’s maintaining a database of Wi-Fi hotspots and cell towers around your current location, some of which may be located more than one hundred miles away from your iPhone, to help your iPhone rapidly and accurately calculate its location when requested… These calculations are performed live on the iPhone using a crowd-sourced database of Wi-Fi hotspot and cell tower data that is generated by tens of millions of iPhones sending the geo-tagged locations of nearby Wi-Fi hotspots and cell towers in an anonymous and encrypted form to Apple.

I thought it would be interesting to check the precise location information to see if the data bore out this description. Furthermore, since I had precise GPS position data from the SPOT Messenger for part of the recorded time period, I was eager to compare the coordinates to see how closely they matched.

Here is what I found:

Since I often disable Wi-Fi when I’m on the move to save battery life, the Wi-Fi location data should be sparse, so I decided to concentrate on the CellLocation table. Following the instructions in the iPhone Tracker FAQ, I extracted the CellLocation table entries. There were 13’412 points. Since I had precise GPS location data for September 3-14, 2010, I eliminated all entries outside this date range, leaving 3’161 rows.
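For readers who want to repeat this step, here is a minimal Python sketch of the extraction and date filtering. It is not the exact procedure from the iPhone Tracker FAQ. It assumes that the CellLocation table has columns named Timestamp, Latitude and Longitude, that timestamps are NSDate seconds (counted from January 1, 2001 UTC), and that the date range is expressed in UTC. The function names are my own.

```python
import sqlite3
from datetime import datetime, timezone

# Assumption: timestamps are NSDate seconds, counted from 2001-01-01 00:00:00 UTC.
NSDATE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def to_nsdate(dt):
    """Convert a timezone-aware datetime to NSDate seconds."""
    return (dt - NSDATE_EPOCH).total_seconds()


def load_cell_locations(conn, start, end):
    """Return (timestamp, latitude, longitude) rows with start <= time < end.

    Assumes a CellLocation table with Timestamp, Latitude and Longitude
    columns; adjust the names if your backup differs.
    """
    query = (
        "SELECT Timestamp, Latitude, Longitude FROM CellLocation "
        "WHERE Timestamp >= ? AND Timestamp < ? ORDER BY Timestamp"
    )
    return conn.execute(query, (to_nsdate(start), to_nsdate(end))).fetchall()
```

To use it, open the extracted file with `sqlite3.connect("consolidated.db")` (the path is hypothetical) and pass `datetime(2010, 9, 3, tzinfo=timezone.utc)` as the start and `datetime(2010, 9, 15, tzinfo=timezone.utc)` as the end. The end bound is exclusive, so all of September 14 is included.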

I first mapped the data using GPS Visualizer to compare with the output of the iPhone Tracker app. According to the app’s FAQ,

To make it [iPhone Tracker] less useful for snoops, the spatial and temporal accuracy of the data has been artificially reduced. You can only animate week-by-week even though the data is timed to the second, and if you zoom in you’ll see the points are constrained to a grid, so your exact location is not revealed. The underlying database has no such constraints, unfortunately.

A comparison of the iPhone Tracker output with the raw data shows that the two maps look remarkably similar except for the size and color of the location points, so it isn’t clear how much the accuracy of the iPhone Tracker has been “reduced” for this data set anyway. Compare the maps below for the recorded locations in Tunisia.

Raw iPhone location data (Tunisia)

iPhone Tracker output (Tunisia)

Next I compared the database output with the GPS tracks generated with the SPOT Messenger. Although I was able to superpose the tracks and the cell locations on the same map using GPS Visualizer, I was not able to make an embeddable Google Map due to the large number of data points. Here are a few screenshots showing the CellLocation data (red pins) and the GPS tracks (solid lines).

GPS and iPhone Location data near Marseille

GPS and iPhone Location data from Northern Tunisia

GPS and iPhone Location data from Southern Tunisia

To better understand the scatter in the iPhone data, I took a look at the timestamps. The vast majority of rows shared a timestamp with their neighbors, even though information such as the cellular network, cell ID and longitude/latitude differed! Assuming the timestamp format is NSDate seconds, i.e. seconds since January 1, 2001 (iPhone consolidated.db location tracking notes), I calculated the time intervals between adjacent rows. These are plotted below. Of the 3’161 rows in the sample, only 14 were recorded at intervals of over 1 hour and 44 at nonzero intervals of less than 1 hour; the rest carried the same timestamp as the preceding row. As shown, the interval between measurements is variable. I did not detect any pattern or obvious trigger for recording information with a new timestamp.

Plot of time intervals between location records

Plot of time intervals between location records showing shorter times
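The interval calculation can be sketched in a few lines of Python. This is an illustration rather than the procedure I actually used. It assumes the timestamps are NSDate seconds, and the function names and the one-hour threshold parameter are my own choices.

```python
from datetime import datetime, timedelta, timezone

# Assumption: timestamps are NSDate seconds, counted from 2001-01-01 00:00:00 UTC.
NSDATE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def nsdate_to_datetime(seconds):
    """Convert NSDate seconds to a timezone-aware UTC datetime."""
    return NSDATE_EPOCH + timedelta(seconds=seconds)


def intervals(timestamps):
    """Differences in seconds between adjacent timestamps, after sorting."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def summarize(timestamps, threshold=3600.0):
    """Count adjacent-row gaps that are zero, short (<= threshold) or long."""
    gaps = intervals(timestamps)
    return {
        "identical": sum(1 for g in gaps if g == 0),
        "short": sum(1 for g in gaps if 0 < g <= threshold),
        "long": sum(1 for g in gaps if g > threshold),
    }
```

Passing the Timestamp column of the filtered rows to `summarize` gives the three counts directly, and the list returned by `intervals` can be plotted as is.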

Since the vast majority of measurements seem to have been made at the same time, I wondered if I could test the hypothesis that these were the locations of cell towers, as suggested in Apple’s official statement. I extracted 75 points with the same timestamp, taken around the region of Lyon. These are mapped below. Consistent with Apple’s statement, some of them are over 100 km from my actual location at that time (I was on the A42 highway). I examined several of them in satellite and street view modes to see if I could detect the presence of a cell tower, but did not see anything that looked like a tower or antenna. This test is certainly not conclusive, since actual installations may be quite small and difficult to recognize from the available imagery.
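A check of this kind, grouping rows by timestamp and measuring how far each point lies from a known position, could be sketched as follows. The haversine formula is a standard great-circle approximation and was not necessarily how the distances above were estimated. The function names and the (timestamp, latitude, longitude) row layout are assumptions for illustration.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1 to guard against rounding just above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def group_by_timestamp(rows):
    """Map each timestamp to the list of (lat, lon) points that share it."""
    groups = defaultdict(list)
    for timestamp, lat, lon in rows:
        groups[timestamp].append((lat, lon))
    return groups


def distances_from(reference, points):
    """Sorted distances in km from a reference (lat, lon) to each point."""
    ref_lat, ref_lon = reference
    return sorted(haversine_km(ref_lat, ref_lon, lat, lon) for lat, lon in points)
```

With the GPS fix nearest in time as the reference, `distances_from` applied to one timestamp group shows at a glance how many of its points lie beyond a chosen distance such as 100 km.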

Although not conclusive, examination of the sample database entries was quite instructive. The iPhone Tracker FAQ suggested that the underlying data is very precise. However, comparison with very accurate GPS coordinates shows that the locations in the consolidated.db are far less accurate. The large number of differing measurements with identical timestamps is further evidence of this and lends support to Apple’s explanation of the data, although I was not able to confirm that the recorded positions actually coincided with the locations of cell phone towers. Finally, the interval between successive recordings of location information was highly variable but in most cases greater than about five minutes.

I should note here that during several portions of the trip, such as while on the ferry or in isolated regions in the Sahara, I turned the phone off due to the absence of network coverage. I don’t remember exactly when I did this, so I would have to go through the phone logs to determine the precise times. Turning the phone off certainly produces some gaps in the data coverage and explains some of the longer time intervals between location records, but otherwise the results of the present analysis should remain valid.

While writing this post, I came across another interesting analysis of the location information in the consolidated.db by Zach Brand. He came to some of the same conclusions I did.

This entry was posted in Apple, Privacy.

6 Responses to Detailed analysis of iPhone location data

  1. Pingback: iPhone Tracker tests | The well-prepared mind

  2. clarinette02 says:

    Why did I miss your excellent post?
    Now that the Article 29 Working Party (WP29) has published its opinion, what is your view on the accuracy of geolocation data as they’ve analyzed it?

    • laura says:

      I was not aware of the opinion by the Working Party on Data Protection, but have just read through it. I think they have very clearly addressed the issues. The accuracy of the geolocation data derived by a particular service provider, or data controller, to use the language of the report, depends quite closely on the details of the particular smart phone implementation. In the report, the committee has wisely adopted a position that will protect privacy in the case of very accurate localization methods.

      From what I have seen, including the present analysis of my own iPhone data, the information being collected by Apple was not this precise. I could imagine that with a large number of data points and sophisticated analysis some very personal information could be deduced. However, I think that at this time such an analysis would be impractical to perform automatically for a large number of users. That situation may change significantly in the near future.

  3. George says:

    Based on your research: Do you believe that GPS data retrieved off iPhones by police investigators is sufficient proof to place a suspect at a crime scene?

    • laura says:

      Hello George,

      Thanks for your question. I’m not in a position to give you a good answer because I’m not an expert in legal matters or law enforcement, and I haven’t looked into that aspect.

      From a purely technical standpoint and without considering questions about accuracy and reliability, the GPS data only tells you where the phone was. The presumption that the phone was in possession of the owner at the time would then be the crucial point. How the burden of proof would be handled in a court of law is another question entirely. I’d venture a guess the answer may also vary by country and jurisdiction.

  4. Pingback: The Future Of Privacy | The well-prepared mind
