Tuesday, April 23, 2024

Autonomous vehicles vs the principles of Data Protection law

1. Autonomous vehicles’ challenges

Fully or highly autonomous vehicles are expected to reach the market within the next few years, and it is estimated that by 2030 they will be able to travel on any type of road, in any weather conditions, without requiring the driver’s attention.

The European Union has recognized the need to implement, as soon as possible, appropriate regulatory frameworks that ensure the safe operation of these vehicles and provide a clear liability regime, so as to address the resulting changes, including the interaction between autonomous vehicles and the infrastructure as well as other road users[1].

In Italy, the Decree of the Ministry of Infrastructure of 28 February 2018 opened the road network to digital transformation and to the testing of autonomous vehicles[2]. The measure is intended to modernize the Italian road network and spread the use of intelligent transport systems through the implementation of the Smart Road and the introduction of self-driving cars.

However, autonomous vehicles, like other AI systems, raise many challenges and problems: road safety models across the EU; European law on driving responsibility; the lack of industry-standardized technology and tools; consumer trust and acceptance; and so on.

Furthermore, although European data protection rules apply, specific measures need to be developed to ensure IT security and the protection of the personal data that these vehicles collect and use, both to train their AI systems and to make real-time decisions once those systems are deployed[3].

It is therefore necessary to ask whether the use of autonomous vehicles is compatible with the GDPR, and which remedies companies should implement in order to comply with the EU regulation.

To do this, it is necessary to understand what autonomous vehicles are, how they collect personal data and the interaction between the main GDPR principles and their use.

2. How autonomous vehicles collect personal data

According to a study by Accenture, a single test autonomous vehicle produces as much data in one day as the Hubble Space Telescope produces in a full year[4].

Autonomous cars collect data from passengers and from sensors (Lidar[5], radar, cameras, ultrasound).

As for the passengers, self-driving vehicles can acquire a large amount of personal data about the driver and everyone else in the car, for example to verify that the driver is authorized to drive the car or to adapt the vehicle’s settings to the driver.

Autonomous cars also store data about the destination, the route followed, the speed and the duration of the journey.

As for the sensors, autonomous vehicles collect data concerning the movement of the vehicle and the environment around it.

For example, Google’s driverless car (Waymo[6]) is equipped with sensors such as cameras, radars and infrared devices that collect information about the environment outside the vehicle: the car can identify a bike and understand that a cyclist who extends an arm intends to make a maneuver; as a result, the car slows down and gives the bike enough space to do so.

Also, it should be noted that these vehicles use Wi-Fi networks, through which a huge amount of data can be transmitted.

3. The principles of data protection law in the use of autonomous vehicles: problems & remedies

Analyzing the functioning of the vehicles, it is easy to understand that data is key to autonomous vehicle technology.

Since European data protection standards also apply to the automated transport sector, it is necessary to check how the use of autonomous vehicles relates to the principles of the GDPR and whether and how fitting solutions can be found.

  • Lawfulness, fairness and transparency principle (Art. 5 (1) (a) GDPR)

The data subject may be the owner of the vehicle, the driver or a passenger.

The question of data ownership is intuitive in the case of customer-provided data, which can be linked to the subjects listed above and, as personal data, falls within the scope of the GDPR.

In the case of vehicle-generated data, instead, a distinction must be made.

If the data is personal, then it is easy to associate it with the person concerned by the processing.

When vehicle-generated data are not personal data but mere technical data, it is necessary to understand who the owner is and which legislation applies.

The prevailing opinion identifies the owner of the data as the party with the greatest interest in them and suggests introducing sector-specific legislation for the technical and new data generated by the IoT[7].

As things stand, a possible remedy is to give data subjects adequate information through portals and other publicly available, widely disseminated sources.

  • Purpose limitation (Art. 5 (1) (b) GDPR)

Autonomous vehicles are fed by a large amount of data. Like other AI systems, they are trained through machine learning and deep learning.

Machine learning uses algorithms that discover patterns in data and then adjust themselves as they are exposed to more data.

This mechanism, and the repeated use of the huge amount of data processed, challenges the principle of purpose limitation.

The purpose limitation principle prevents arbitrary re-use, but it does not create a barrier for big data repurposing: it means that an assessment of compatibility of processing purposes must be done[8].

Further uses of the data collected by the autonomous vehicle may fit closely with the initial purpose or be different. The fact that the further processing is for a different purpose does not necessarily mean that it is automatically incompatible: this needs to be assessed on a case-by-case basis.

In general, however, remedial actions could include the use of a portal to provide information about the new purpose, alongside other adequate information for data subjects.

  • Data minimization (Art. 5 (1) (c) GDPR)

Autonomous vehicles should not collect more data than is necessary.

In general, research in the field of machine learning depends on the collection and availability of large amounts of data, including personal data: during tests, vehicles generate between 4 and 6 TB of data per day, which is equivalent to the data generated by 6,200 internet users.

The amount of data needed to train an algorithm is not known in advance, so a “blind” application of this principle would affect any possible development in the field of artificial intelligence.

However, the issue regarding data minimization is not simply the amount of data being used, but whether it is necessary for the purposes of the processing, or excessive.

The GDPR states that personal data shall be “adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed”[9].

Companies must reconcile this principle with the need to store data.

For this purpose, companies should draw up a plan that balances three elements to eliminate data redundancy: the portfolio required, the urgency of collection and the available resources[10].

Efficiency in data acquisition might also include measures that deliberately limit the quality of the data: for example, in the case of external sensors, an appropriate remedy could be to blur the faces of passers-by.
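The face-blurring remedy mentioned above can be illustrated with a minimal sketch. It assumes a grayscale frame represented as nested lists and a bounding box already supplied by some face detector; the function name `pixelate_region` and its parameters are hypothetical and not taken from any vendor's pipeline.

```python
from typing import List

Image = List[List[int]]  # grayscale frame: rows of pixel intensities (0-255)


def pixelate_region(image: Image, top: int, left: int,
                    height: int, width: int, block: int = 4) -> Image:
    """Return a copy of `image` with the given rectangle coarsened.

    Each block x block tile inside the rectangle is overwritten with its
    average intensity, so fine detail such as facial features is lost.
    The bounding box is an assumption: it would come from a face detector.
    """
    result = [row[:] for row in image]  # copy, so the input frame is untouched
    bottom = min(top + height, len(image))
    right = min(left + width, len(image[0]) if image else 0)
    for tile_top in range(top, bottom, block):
        for tile_left in range(left, right, block):
            tile_bottom = min(tile_top + block, bottom)
            tile_right = min(tile_left + block, right)
            pixels = [
                image[r][c]
                for r in range(tile_top, tile_bottom)
                for c in range(tile_left, tile_right)
            ]
            average = sum(pixels) // len(pixels)
            for r in range(tile_top, tile_bottom):
                for c in range(tile_left, tile_right):
                    result[r][c] = average
    return result
```

In a setup like this, only the coarsened frame would be stored, while everything outside the bounding box stays usable for training.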

  • Accuracy (Art. 5 (1) (d) GDPR)

According to the GDPR, the data collected must be accurate and, where necessary, kept up to date.

The accuracy achieved depends strongly on the quality of the labeling process.

In many data mining applications, obtaining labels for the training data is an error-prone process for several reasons, including subjectivity, data-dependent error, inappropriate feature information used for labeling, etc[11].

In the case of autonomous vehicles this risk is amplified: even a simple traffic intersection may contain many objects to identify, which makes it necessary to create guidelines on what to annotate and how.

Given the complexity of labelling operations, the best remedy in this case would be to outsource them to third parties that specialize in improving label and model quality[12].

  • Storage limitation (Art. 5 (1) (e) GDPR)

To protect personal data and promote data quality, the GDPR requires that personal data shall be “kept for no longer than is necessary for the purposes for which the personal data are processed”, though personal data may be stored longer if it “will be processed solely for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes”.

The establishment of short retention periods and the deletion or limitation of the use of the data after the original purpose has been achieved or if requested by an individual would undermine the potential benefits arising from the use of such data for the development of autonomous vehicles[13].

The issue becomes even more complicated with regard to the personal data of unknown people captured by external sensors and to unclassified data categories.

To deal with this problem, proper data governance and management (strategy, approach, policy) is certainly a necessary remedy.

To track what the data contains, companies need to record and associate information on the source data: the locations where it was collected, which streets were covered, which intersections were recorded, and whether the data was captured by day or by night, in sun or in rain[14].
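The kind of source-data record described above could look like the following minimal sketch. The class `RecordingMetadata`, its field names and the `retention_days` policy value are all hypothetical; they only illustrate how tracking metadata and a retention check might be combined, not how any particular company does it.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class RecordingMetadata:
    """Hypothetical descriptor attached to one recorded drive."""
    recording_id: str
    collected_on: date
    location: str
    streets: List[str]
    intersections: List[str]
    time_of_day: str     # e.g. "day" or "night"
    weather: str         # e.g. "sun" or "rain"
    purpose: str         # purpose for which the data was collected
    retention_days: int  # assumed retention period set by company policy

    def expires_on(self) -> date:
        """Last day on which the recording may be kept under the policy."""
        return self.collected_on + timedelta(days=self.retention_days)

    def is_expired(self, today: date) -> bool:
        """True once the assumed retention period has elapsed."""
        return today > self.expires_on()


def recordings_due_for_review(catalog: List[RecordingMetadata],
                              today: date) -> List[str]:
    """List the recordings whose retention period has run out."""
    return [m.recording_id for m in catalog if m.is_expired(today)]
```

A recording flagged this way would not necessarily be deleted: it could instead be anonymized or assessed for a compatible further purpose, which is why the helper only reports candidates for review.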

  • Integrity and confidentiality (Art. 5 (1) (f) GDPR)

The GDPR requires that personal data shall be “processed in a manner that ensures appropriate security of the personal data, including protection against unauthorised or unlawful processing and against accidental loss, destruction or damage, using appropriate technical or organisational measures”.

With regard to autonomous vehicles, the challenge concerns, among others, the Mobile Vehicular Cloud (MVC).

The MVC is formed to improve routing utility in ad hoc networks through communication between vehicles equipped with on-board units[15]; as a result, sensitive vehicular data will be stored in the cloud.

However, back-end cloud data is vulnerable to many threats that can lead to loss or leakage of data, thus undermining the confidentiality, integrity and availability of the data.

This raises serious problems, especially since, in the absence of specific regulation, the consequence would seem to be a restriction on movement because of the MVC.

More properly, however, it would be desirable for companies to adopt forward-looking cybersecurity systems and to establish a centralised IT team before distributed teams get too far down a difficult path.
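One small building block of such a cybersecurity system is an integrity check on the records stored in the cloud. The sketch below uses Python's standard `hmac` and `hashlib` modules; the helper names `sign_record` and `verify_record`, the record fields and the demo key are hypothetical. It addresses integrity only: confidentiality would additionally require encryption and proper key management.

```python
import hashlib
import hmac
import json


def sign_record(record: dict, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_record(record: dict, tag: str, key: bytes) -> bool:
    """Return True only if the record still matches the tag it was stored with."""
    expected = sign_record(record, key)
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected, tag)


# Usage: the vehicle signs a record before upload; the back end verifies it later.
key = b"demo-key-not-for-production"  # assumption: a secret shared out of band
record = {"vehicle": "test-001", "speed_kmh": 48, "route": "A1"}
tag = sign_record(record, key)
tampered = dict(record, speed_kmh=90)
```

Here `verify_record(record, tag, key)` succeeds, while the same call on `tampered` fails, so a modification of the stored data would be detected.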

4. Conclusions

As we have seen, in the absence of specific regulation, a valid way to concretely guarantee the lawfulness of the processing could be to strengthen the principle of data protection by design, in addition to that of data protection by default (Art. 25 GDPR): companies should consider data-related processes and infrastructure needs during research and development in order to bring issues to light.

In any case, new legal categories and new rules are needed for a new subject to which traditional principles are difficult to apply, a difficulty that delays technological development.

In line with this, the European Parliament considers that the automotive sector is in most urgent need of efficient European Union and global rules, in order to ensure the cross-border development of self-driving cars, the exploitation of their economic potential and the benefits of the technology[16].





[5] S. VAN ERP, Ownership of data: the numerus clausus of legal objects, in Brigham-Kanner Prop. Rts. Conf. J., Vol. 6, 2017


[7] GDPR Article 5(1)(c)




Ariella Fonsi

Ariella Fonsi graduated in Law in 2017 from LUISS “Guido Carli” University and has been admitted to practice as a lawyer since 2021. After completing the master’s in “Diritto e Impresa” offered by the 24ORE Business School in Milan, she works on IT contracts and data law.
