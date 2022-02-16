Google is the ultimate Big Brother. Facebook is not far behind: with what it knows about me, it could reconstruct my life whenever it wanted. But how much does Amazon know about me?

That is what we wanted to find out, now that the company gives the option to download all the data it collects when we use its services or products such as its smart speakers. There is good and bad news. Let’s go through both.

First the bad news

Amazon collects quite a lot of data. That is the reality one can confirm by downloading everything it has on us. The process for obtaining this data is simple: just make the request following the steps indicated in the Amazon documentation.





To request the data, we have to follow the link given in that documentation, which leads to a page with a drop-down menu where we can choose which collected data to download.

You can choose any single option if you only want that specific data: for example, Alexa and Echo devices only, Kindle only, or Advertising only. I wanted to download everything, so I chose the “Request all your data” option displayed at the bottom of that drop-down menu.

From there it’s time to wait, and for a long time, because those requests take weeks to process. In my case, the email telling me that the data was ready to download arrived 22 days after I made the request.





Amazon also does not make downloading this data especially fast or comfortable: the download page shows a long list of small .ZIP files that must be downloaded individually, and nowhere is there an option to download them all at once. We have to click each “Download” button to save them to our computer.

Now for the good news: the data collection is reasonable and non-invasive.

Once those files were downloaded (around 130 in my case), I began to uncompress them. Most were very small, under 10KB, and unzipping them quickly shows that the vast majority are text files in .CSV format, prepared to be imported into spreadsheets and showing tabulated information about our use of Amazon products and services.
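With around 130 small archives, opening each one by hand gets tedious. As a rough sketch, a short Python script can walk a folder of downloaded ZIP files and list every CSV table inside along with its number of rows. The function name `list_csv_tables` and the idea that all archives sit in one local folder are assumptions made for illustration, not anything Amazon provides.

```python
import csv
import io
import zipfile
from pathlib import Path


def list_csv_tables(download_dir):
    """Return (zip name, member name, data row count) for each CSV in each ZIP.

    Hypothetical helper: assumes all downloaded archives sit in one folder.
    """
    results = []
    for zip_path in sorted(Path(download_dir).iterdir()):
        # Accept both ".zip" and ".ZIP" extensions
        if zip_path.suffix.lower() != ".zip":
            continue
        with zipfile.ZipFile(zip_path) as archive:
            for member in archive.namelist():
                if not member.lower().endswith(".csv"):
                    continue
                with archive.open(member) as raw:
                    # "utf-8-sig" also strips a byte-order mark if one is present
                    text = io.TextIOWrapper(
                        raw, encoding="utf-8-sig", errors="replace", newline=""
                    )
                    rows = list(csv.reader(text))
                # The first row is assumed to be a header, so it is not counted
                results.append((zip_path.name, member, max(len(rows) - 1, 0)))
    return results
```

Calling `list_csv_tables("amazon_data")`, where `amazon_data` is whatever folder holds the archives, gives a quick inventory of the tables before importing any of them into a spreadsheet.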

Now comes the important thing. What data does Amazon collect? After analyzing these files, we were able to verify that, with a few specific exceptions, the information it keeps about us is just what its online store uses.

Thus, Amazon saves information about our shipping addresses (in one of the few PDFs in the collection), partial data on the credit or debit cards we use to pay (last four digits, expiration, associated bank), and of course tables related to our activity on the Amazon website: purchases made, returns, wish list, etc.

It seems logical that Amazon collects and stores this data, since we use it when buying items in the Amazon online store. There is nothing alarming here, or at least not if the information downloaded in this segment is indeed all that Amazon stores.



Amazon records our use of (or engagement with) the apps that we download through its app store. I have a Kindle Fire HD tablet that my kids mostly used to play games or watch videos, and Amazon recorded the duration of those sessions. That lets Amazon know which applications or games are most popular among its users.

The rest of the files are also mostly innocuous. There is a list of the “marketplaces” we have accessed from any of our devices (for example, when downloading games), lists with entries indicating whether we have taken advantage of any promotion, and tables with information about notifications that Amazon has sent us by email.

There is also a table, derived from the registration.csv file, listing the devices on which we have used the Amazon account, though without clearly identifying them: it shows their serial numbers, the generic name that Amazon generates (for example, “Javier’s 14th Android Device”), and the dates the Amazon account was activated and deactivated on each device. Once again, this is reasonable data for Amazon to keep as part of our usage history.

Alexa doesn’t care, but Amazon is very interested in how much we read and watch on Kindle and Prime Video

Only two things struck me as relatively curious. The first is that Amazon knows what car I have. In reality that is totally normal, since I have bought spare parts there, such as windshield wipers.

The second, the one that probably concerns more users, is what kind of data Amazon collects from our use of its smart speakers or Prime Video. Its Echo family is very popular and gives access to convenient functions through voice commands, but to what extent is our privacy invaded?

Judging from the data collected, that invasion seems once again almost nil. In the downloaded files in my case there were no Alexa recordings, but the reason is simple: we use my wife’s Amazon account for this device, because hers is the one associated with Prime.

In the end, it doesn’t matter that I couldn’t download them, because everything Amazon records when we talk to Alexa is accessible in the Alexa privacy section of our Amazon account. Visiting that page shows a list of the voice commands we have given, and expanding any of them lets us play the audio clip recorded on Amazon’s servers.





When reviewing those commands, I could only find audio clips in which I controlled the speaker’s music playback or asked for the time.

There are no strange recordings of conversations caught in the background, for example, and once again the data collection here seems to behave as expected. On that same web page it is also possible to delete all the recordings, which gives the user control over those files.





Perhaps the most curious thing about all this data collection has been to see how Amazon collects a lot of information about our reading sessions in its e-book readers. There are quite a few files related to the Kindle, though most are once again innocuous.

However, some files clearly track our reading activity. The “Kindle.Devices.ReadingSession.csv” file is the most revealing: it shows the start and end time of each session, the e-book’s identifier (its ASIN code), and two even more curious data points: the time we spent reading (in milliseconds) and how many pages we turned in that session.
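To get a feel for what that file adds up to, here is a minimal Python sketch that totals reading time and page turns per book. The column names `ASIN`, `total_reading_millis` and `number_of_page_flips` are assumptions for illustration; check the header row of your own file and adjust the constants if they differ.

```python
import csv
from collections import defaultdict

# Hypothetical column names: check the header row of your own export
ASIN_COL = "ASIN"
MILLIS_COL = "total_reading_millis"
FLIPS_COL = "number_of_page_flips"


def summarize_reading(csv_file):
    """Total reading minutes and page turns per ASIN from an open CSV file."""
    totals = defaultdict(lambda: {"minutes": 0.0, "page_turns": 0})
    for row in csv.DictReader(csv_file):
        try:
            millis = int(row[MILLIS_COL])
            flips = int(row[FLIPS_COL])
        except (KeyError, ValueError, TypeError):
            continue  # skip rows with missing or non-numeric values
        entry = totals[row.get(ASIN_COL) or "unknown"]
        entry["minutes"] += millis / 60000  # milliseconds to minutes
        entry["page_turns"] += flips
    return dict(totals)


# Possible usage, assuming the file sits in the current folder:
# with open("Kindle.Devices.ReadingSession.csv", newline="",
#           encoding="utf-8-sig") as f:
#     for asin, stats in summarize_reading(f).items():
#         print(asin, round(stats["minutes"]), stats["page_turns"])
```

Grouping by ASIN mirrors how the file identifies books, so the result shows at a glance which titles took up the most reading time.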





Amazon is, of course, particularly intent on knowing which books we have read and whether they interested us. Those metrics are very similar to the ones the service keeps for Prime Video, whose CSV tables show which movie or series we watched and for how many seconds in each session.

Is that interest of Amazon’s legitimate? One could certainly argue that this data tells Amazon which books, series or movies work in its catalogs, so from that point of view collecting it seems logical.

That, and all the spreadsheets on *how* I read (page turns, highlights, reading sessions), my Whole Foods purchases, my contacts lists, my video watching, my Amazon purchases, every time I’ve entered a physical store, etc etc etc etc pic.twitter.com/3JROpVBjkO – Alina Utrata (@AlinaUtrata) January 23, 2022

Saving them at that level of detail may seem like a stretch, and that is precisely what has prompted criticism on social media about the collection of Alexa audio or of the pages we turn when reading a book on our Kindle.

The truth is that the data collected by Amazon does not seem excessive. It is basically a history of our activity on its services, and in many cases part of that data helps make the service more convenient for users: it is useful to be able to access our orders or not have to enter shipping addresses for each purchase, for example.

Suspicion can arise over how Amazon might use that data, but again, collecting usage and preference data is usually intended to improve services and recommendation systems: if Kindles or Prime Video collect data, it seems reasonable to think that (at least in large part) it is to help us choose our next book, series or movie.

It is true that our purchase history can be useful for other purposes, such as personalized advertising. Amazon itself acknowledges this type of scenario in its privacy notice and confirms that “we work with third parties such as advertisers, publishers, social networks, search engines, advertising providers and advertising companies that work on their own, to improve the relevance of the advertisements we offer.” All in all, the data collected by Amazon does not seem especially invasive or a serious threat to our privacy.