Digital Content Next

Google data collection research

August 21, 2018 | By DCN

While the public has been focused on the ongoing Facebook and Cambridge Analytica scandal, Google has largely avoided public scrutiny of its data collection practices, despite having the ability to collect far more personal data about consumers across a variety of touchpoints. There have been efforts to document individual Google practices, such as its efforts to circumvent controls on Safari. More recently, an investigation by the Associated Press revealed that Google continues to track location data even after a consumer has turned off the setting. While these research efforts have been important to the public policy dialogue, no research has yet examined the full breadth and depth of the data Google collects.

In “Google Data Collection,” Douglas C. Schmidt, Professor of Computer Science at Vanderbilt University, catalogs how much data Google is collecting about consumers and their most personal habits across all of its products and how that data is being tied together.

The key findings include:
  • A dormant, stationary Android phone (with the Chrome browser active in the background) communicated location information to Google 340 times during a 24-hour period, an average of about 14 data communications per hour. In fact, location information constituted 35 percent of all the data samples sent to Google.
  • For comparison’s sake, a similar experiment found that on an iOS device with Safari but not Chrome, Google could not collect any appreciable data unless a user was interacting with the device. Moreover, an idle Android phone running the Chrome browser sends back to Google nearly fifty times as many data requests per hour as an idle iOS phone running Safari.
  • An idle Android device communicates with Google nearly 10 times as frequently as an Apple device communicates with Apple servers. These results highlight the fact that the Android and Chrome platforms are critical vehicles for Google’s data collection. Again, these experiments were done on stationary phones with no user interaction. When a phone is actively used, the amount of information Google collects increases.
  • Google has the ability to associate anonymous data collected through passive means with the personal information of the user. Google makes this association largely through advertising technologies, many of which Google controls. Advertising identifiers, which are purportedly “user anonymous” and collect activity data on apps and third-party webpage visits, can be associated with a user’s real Google identity when an Android device passes device-level identification information to Google servers.
  • Likewise, the DoubleClick cookie ID, which tracks a user’s activity on third-party webpages, is another purportedly “user anonymous” identifier that Google can associate with a user’s Google account. The association occurs when a user accesses a Google application in the same browser in which a third-party webpage was previously accessed.
  • A major part of Google’s data collection occurs while a user is not directly engaged with any of its products. The magnitude of such collection is significant, especially on Android mobile devices, arguably the most popular personal accessory now carried 24/7 by more than 2 billion people.
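
As a quick check on the arithmetic behind the first finding, the short sketch below recomputes the hourly rate from the two stated figures (340 location communications over 24 hours). It also derives the total number of data samples implied by the 35 percent share; that total does not appear in this summary and is only an inference from the two quoted numbers, so treat it as approximate. The variable names are illustrative.

```python
# Figures quoted in the key findings above.
location_reports = 340   # location communications sent to Google
hours = 24               # length of the observation period
location_share = 0.35    # location's share of all data samples sent to Google

# Average location communications per hour (the summary rounds this to 14).
per_hour = location_reports / hours

# Inferred, not stated in the summary: the total number of data samples
# that a 35 percent share would imply over the same 24 hours.
implied_total_samples = location_reports / location_share

print(f"Location communications per hour: {per_hour:.1f}")
print(f"Implied total data samples in 24 hours: about {implied_total_samples:.0f}")
```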

DCN is grateful to support Professor Schmidt in distributing this research. We offer it to the public with the permission of Professor Schmidt.
