The future of data processing in driverless cars: the shift from connected to autonomous

Jun 26, 2019 12:00:00 AM | USA

The shift from ‘connected cars’ (cars communicating with their manufacturers, traffic lights, surrounding vehicles, etc.) to ‘self-driving’, ‘driverless’ or ‘autonomous cars’ will pose new challenges for GDPR compliance. Business models and use cases of such cars will change, as will controllership, processors, purposes, and the types of data being processed. In this article, we discuss the repercussions of this paradigm shift for GDPR compliance.



As we speak, large amounts of data are being ‘pumped’ into the cloud by (partially) automated vehicles. Many claim that dozens of gigabytes per hour are collected and stored for each car on the road. The question is: for what purpose? Manufacturers of smart cars appear to collect as much data as possible. From a privacy perspective, this is a very common mistake, often rooted in a misunderstanding of what actually constitutes personal data. Under the GDPR’s data minimisation principle, the collection of personal data should be limited to what is necessary for the purpose.

It’s personal data, silly!

An almost Pavlovian response of manufacturers might be that the majority of this data is not ‘personal data’, as it does not relate to an identifiable natural person. That’s not correct. From a legal perspective, it’s helpful to have a look at the categories of data that are being processed by such cars. In this blog post, without going into too much technical detail, we will distinguish vehicle health, driver health, location, and surveillance data.

First off, let’s explore vehicle health data. Any modern car will monitor its resources, such as fuel levels, engine air inflow, and emissions, in order to optimise engine performance and to flag problems in advance. Even if this information constitutes personal data, for instance because it reflects the way a particular driver handles the car, it would still be perfectly legitimate to use, since there is a clear purpose: optimising the operation of the car.

Next, let's take a look at driver health data: cars monitoring their drivers’ behaviour. This is a very common practice, and generally regarded as a safety measure. For instance, a sensor might check whether the driver is falling asleep, and if so, wake them up with an alarm. Steering behaviour or frequent abrupt braking might also be an indicator of driver health. Such data collection can be perfectly legitimate. In this case, it can be justified by the “necessity for reasons of public interest in the area of public health” (art. 9(2)(i) GDPR).

Up next is location data. Location data tells a lot about a driver’s behaviour and is therefore deemed sensitive data. Although sensitive data does not have an explicit protection regime under the GDPR, we can safely assume that supervisory authorities will closely inspect the use of such data. This category also includes speed information, which might qualify as crime-related data, with a specific protection regime under both the GDPR and national laws.

Last, let’s examine surveillance data. This type of data is collected from the ‘environment’ of the car. Proximity warning sensors, systems for communicating with cars in the direct neighbourhood for syncing speed and brake information, and video cameras monitoring the area outside the car all fall into this data category. This data might partly overlap with location data (by showing identifiable landmarks), but it might also concern other people’s personal data, for instance by capturing recognisable number plates or even people’s appearance.

From connected to autonomous cars

Today, our vehicles collect loads of data with the apparent purpose of learning from that data and using it to make cars more autonomous in the future. Leaving aside whether manufacturers will succeed in producing truly autonomous cars and deploying them in real traffic on a large scale, one of the crucial questions is whether data may be collected today for future purposes, rather than for a current purpose relating directly to the data subject affected.

Purpose shift is dealt with by articles 5(1)(b) and 6(4) GDPR. Article 5(1)(b) states that personal data should be collected for specific purposes and should not be further processed in a way incompatible with the initial purpose. Article 6(4) GDPR states that processing personal data for a purpose other than that for which the data were originally collected is allowed when one of the following conditions is met:

  1. the data subject has given their consent
  2. there is an applicable EU or member state law
  3. the new purpose is ‘compatible’ with the original purpose

The aforementioned third condition is subject to various open criteria, such as the context of data collection, the nature of the personal data, and the possible consequences of further processing for the data subject. Typically, such an open norm would allow for a lot of legal ‘creativity’, only bound by the borders of EDPB or supervisory authority guidance and jurisprudence.
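The three alternative conditions above can be sketched as a simple decision rule. This is only an illustration under stated assumptions: the class and function names are hypothetical, and the compatibility flag stands in for a legal assessment made by a person, not something software can compute.

```python
from dataclasses import dataclass


@dataclass
class FurtherProcessing:
    """Hypothetical record describing a proposed new purpose for existing data."""
    has_consent: bool        # the data subject consented to the new purpose
    has_legal_basis: bool    # an applicable EU or member state law exists
    judged_compatible: bool  # outcome of a human compatibility assessment


def may_process_for_new_purpose(case: FurtherProcessing) -> bool:
    """Return True if at least one of the three conditions is met."""
    return case.has_consent or case.has_legal_basis or case.judged_compatible


# Example: no consent and no applicable law, but the new purpose was assessed as compatible.
example = FurtherProcessing(has_consent=False, has_legal_basis=False, judged_compatible=True)
print(may_process_for_new_purpose(example))
```

The conditions are alternatives, so one is enough. In practice the difficult part is the third input, which depends on the open criteria described above.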

Of course, any ‘big data’ processing (the vast amounts of data from thousands of cars will qualify as such) has a problem with purpose limitation. Usually the purpose of big data processing will be to find unknown patterns, for example patterns that can improve the safety of autonomous cars. ‘Finding unknown patterns’ as such will not qualify as a sufficiently specific purpose. However, a purpose that ties the search for patterns to the improvement of safety might be sufficiently specific.

After the shift: is there still personal data to be processed?

Owning a car has been an important marker of middle-class prosperity in the 20th century. But how will this concept develop once autonomous cars can easily be shared? Car sharing will be much easier once the vehicles are driving autonomously. The driver will be just a passenger. How will this affect personal data processing? It is tempting to envision data processing in such cars as the next iteration of an Uber-like service.

This is tempting because Uber, among others, is also experimenting with autonomous cars, and because it is already a data-driven company in its current form. Of the types of data identified above, it is probable that, if and when driverless cars become the norm, vehicle health data, location data, and surveillance data will play an ever more important role. Insofar as vehicle health data concerns the ‘cabin’ of the car, it will contain data about passengers, and will therefore still be personal data. The same is true for location data, and surveillance data will still concern people in the ‘environment’ of the car.

However, shifts in business models will also affect personal data to be processed. The data subject, rather than being ‘just’ a driver or a passenger of an individually owned car, might become a passenger of a data-driven company that tries to make a business out of the data subject, rather than the car itself and its mere transport function. How? By personalised advertising, catering offers, entertainment options, and so on. This may seem far-fetched, but Uber is already at it (with the exception of offering food in the car).

So what does this all mean for cars and personal data in the future?

Cars are quickly transforming into data collection and processing units, being part of an ecosystem where data is being reused and analysed. Connected and autonomous cars may easily become the ideal test bed for the GDPR, especially for its principles of data minimisation, purpose limitation and privacy by design. At the same time, current approaches from manufacturers seem to focus mostly on gathering as much data as possible. 

From a legal perspective, what manufacturers should be doing instead is to actively think about their data processing activities, taking into account the following:

  • Turning connected cars into autonomous cars will not reduce the amount of personal data being processed;
  • Data subjects should be given active control of personal data about them, so that they can exercise their data subject rights and stay in control of any additional services;
  • Actively compartmentalising personal data or turning it into aggregated data should decrease the risk of breaches and of purpose creep;
  • Manufacturers should consider whether they want to be part of profiling activities for, e.g., insurance purposes.

On the other hand, this advice concerns mainly the traditional model in which a car is individually owned, and the car itself is the product. Rather than worrying only about the shift from gasoline to electricity, manufacturers should probably be rethinking their entire business model. It might be that only a small percentage of the future population still wants to own a car or will be able to afford one. The rest of us might be spending our time in a driving billboard.