Legal and Regulatory Challenges for Data Analytics
Data is the new currency, as countless business executives around the world have agreed. Companies are undergoing a massive shift toward digital transformation, which also opens up great opportunities in terms of the data they use. Data holds enormous value, and knowing how to extract that value can determine the economic growth of an enterprise. Translating raw data into valuable insights, however, comes with great complexity, both technological and regulatory. Enterprises are increasing their budgets for new technologies such as AI and machine learning to extract this value from their data. Data analytics has thus become a highly valuable asset for businesses worldwide.
At the same time, high legal and regulatory standards add further complexity. Especially in today’s context, ensuring data protection compliance proves harder than ever. Organizations no longer rely on Excel files and the fax machine in the back office. Nowadays, companies depend heavily on a shared processing landscape, with external data processing infrastructure and services, and collaboration tools used across the organization as well as with the outside world. Consequently, ensuring compliant use of personal data is not as easy as it once was. Compliance managers must now also ensure compliance on the part of any external data processors and other third parties they work with.
Furthermore, during data transfer to (unsafe) third countries, it is crucial to ensure that third parties cannot access the data even while it is being processed in the Cloud.
More than ever, IT compliance concerning data security, especially for data analytics purposes, is becoming difficult due to interconnected IT infrastructure, third-party processing, and cross-border transfers.
ECJ ruling: Schrems II
To better understand the importance of new technology measures to ensure legal and regulatory compliance for data analytics, it is important to grasp the legal changes within IT compliance.
Cross-border data transfer to the US has been a problem for decades. Already under the Data Protection Directive 95/46/EC, the EU Commission had agreed upon the so-called Safe Harbor Agreement in 2000, which offered data recipients the option of certification and subsequently permitted legally compliant data transfer. This decision was declared invalid by the ECJ in 2015 (the so-called Schrems I decision). The EU and the US responded in 2016 by agreeing on a new framework called the EU-US Privacy Shield. The European Commission has declared certain non-EU countries to provide data protection safeguards equivalent to those of the EU itself; as a result, organizations in these countries can freely receive the data of EU citizens without the need for additional security mechanisms. But the Privacy Shield met the same fate as the Safe Harbor Agreement with the Schrems II decision of the ECJ on July 16, 2020.
In its ruling, the court clarifies that even with certification under the Privacy Shield, no adequate level of protection within the meaning of Art. 45 GDPR can be assumed. It justifies this decision with the concern that US intelligence services could gain access to EU citizens’ personal data once it has been transferred.
This current legal situation poses challenges for companies of all sizes, especially when using Cloud computing. The topic is particularly pressing because major providers of Cloud computing services have their server locations or headquarters in the US. In this respect, the so-called CLOUD Act of 2018 must also be considered, which, under certain conditions, allows US authorities to access data stored in data centers outside the USA. And in practice, relying exclusively on service providers without US-based headquarters or server locations is nearly impossible. This makes it all the more urgent to develop technical and organizational measures that, in line with ECJ case law, make it possible to continue carrying out data transfers in a legally compliant manner with an appropriate level of security.
Confidential Computing as the new key driver
Overview of the solution
We already talked about what Confidential Computing is and how exactly the technology works. For a holistic overview of the solution, please check out our Confidential Computing Explained blog post.
For the purpose of this article, we can sum it up as follows:
The technical hallmark of Confidential Computing lies in the fact that it enables complete protection of the data: not only when stored (“data at rest”) and transported (“data in motion”) but also during processing (“data in use”). In particular, Confidential Computing allows data to be processed in an isolated, encrypted form. Only the user with the appropriate key can reconstruct the data.
A prerequisite for Confidential Computing is the use of computer processors (“CPUs”) with special security extensions (e.g., Intel SGX/TDX, AMD SME/SEV). Numerous providers have already launched corresponding technical solutions, including the hyperscalers Microsoft, AWS, IBM, and Google. With the help of these security extensions, a computer program and its data are subject to the exclusive control of the CPU: the data is processed in an encrypted execution environment, the so-called enclave. Only the processor can decrypt the data, process it, and store it encrypted again in memory. As a result, data processing stays isolated from the operating system and the applications running on it. During processing, neither the (Cloud) service provider nor the administrator nor a (compromising) third party has access to the data.
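The enclave mechanics described above can be sketched in a few lines of Python. This is a conceptual simulation only, not a real enclave: the `ToyEnclave` class and its hash-based stream cipher are illustrative stand-ins for a CPU-fused sealing key and hardware memory encryption (real deployments use Intel SGX/TDX or AMD SEV, not hand-rolled cryptography).

```python
import hashlib
import hmac
import secrets

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a pseudo-random keystream from key + nonce (illustrative only)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

class ToyEnclave:
    """Toy model of an enclave: the sealing key lives only inside this
    object, standing in for a key fused into the CPU."""

    def __init__(self) -> None:
        self._key = secrets.token_bytes(32)  # never exposed to callers

    def seal(self, plaintext: bytes) -> bytes:
        """Encrypt and authenticate data so it can leave the enclave."""
        nonce = secrets.token_bytes(16)
        ct = bytes(p ^ k for p, k in
                   zip(plaintext, _keystream(self._key, nonce, len(plaintext))))
        tag = hmac.new(self._key, nonce + ct, hashlib.sha256).digest()
        return nonce + ct + tag

    def process(self, sealed: bytes, fn) -> bytes:
        """Decrypt inside the enclave, apply fn, and re-seal the result.
        Tampered input aborts the computation, as a real enclave would."""
        nonce, ct, tag = sealed[:16], sealed[16:-32], sealed[-32:]
        expected = hmac.new(self._key, nonce + ct, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("integrity check failed: computation aborted")
        plaintext = bytes(c ^ k for c, k in
                          zip(ct, _keystream(self._key, nonce, len(ct))))
        return self.seal(fn(plaintext))

enclave = ToyEnclave()
sealed = enclave.seal(b"patient record 123")
result = enclave.process(sealed, lambda data: data.upper())
# `result` is again ciphertext: the caller never observed the plaintext
```

The point the sketch makes is architectural: the caller hands ciphertext in and gets ciphertext back, and any attempt to feed the enclave tampered input cancels the computation instead of leaking data.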
Another important property of Confidential Computing is the attestation of the enclave. This addresses the requirement that the user must be able to verify whether data processing takes place in an enclave or in a non-protected execution environment, especially if the Cloud infrastructure is provided by a third party. With the help of cryptographic protocols (“remote attestation”), the CPU can audit the execution of the enclave and generate proof not only that the data processing took place in an enclave, but also that the processing complies with regulations and that personal data remains anonymous. Data controllers can therefore attest that data privacy is protected in conformance with data protection regulations.
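The attestation idea can be illustrated with a minimal sketch under simplifying assumptions: the HMAC with a shared `VENDOR_KEY` stands in for the CPU vendor’s asymmetric signature and certificate chain, and `cpu_quote`/`verify_quote` are hypothetical names for illustration, not a real attestation API.

```python
import hashlib
import hmac

# Hypothetical shared secret standing in for the CPU vendor's signing key;
# real attestation uses asymmetric signatures plus a vendor certificate chain.
VENDOR_KEY = b"simulated-attestation-signing-key"

def cpu_quote(enclave_code: bytes) -> tuple:
    """Simulated CPU: measure (hash) the loaded code and sign the measurement."""
    measurement = hashlib.sha256(enclave_code).digest()
    signature = hmac.new(VENDOR_KEY, measurement, hashlib.sha256).digest()
    return measurement, signature

def verify_quote(measurement: bytes, signature: bytes, audited_code: bytes) -> bool:
    """Remote verifier: accept only if the signature is genuine AND the
    measurement matches the code the data controller has audited."""
    expected_sig = hmac.new(VENDOR_KEY, measurement, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected_sig):
        return False  # quote was not produced by the (simulated) CPU
    return measurement == hashlib.sha256(audited_code).digest()

audited = b"def analyze(records): return aggregate(records)"
m, sig = cpu_quote(audited)
assert verify_quote(m, sig, audited)           # genuine enclave code accepted
assert not verify_quote(m, sig, b"malicious")  # mismatching code rejected
```

The essential property is that the verifier does not need to trust the Cloud provider: the proof is bound to a hash of the exact code running in the enclave, so swapping in different code invalidates the quote.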
Practical Example for Data Analytics
It is reasonable to argue, then, that data analytics solutions leveraging Confidential Computing could clear these regulatory hurdles. To illustrate how Confidential Computing may help improve data protection compliance, let’s walk through the following use case:
We are a global pharmaceutical company with US-based headquarters. We want to join forces with healthcare providers and government bodies in the EU to develop a drug against cancer. For this purpose, we are bringing together different data streams for joint analysis: healthcare providers share the highly sensitive personal health data of their users, we as a pharmaceutical company share the data we have collected within our sector, and together we hope to gain valuable patient behaviour insights to proactively develop an efficient cancer treatment. Our insights will be based on new technologies such as AI algorithms.
However, before we engage in these joint data analytics, we discuss the following key regulatory challenges we will be facing:
- Data security: what kind of security measures do we need to prevent any unauthorized access to this highly sensitive data? How do we ensure data cannot be leaked to any of the insiders or outsiders involved (e.g., infrastructure providers)?
- Data privacy: how do we ensure the protection of personal data throughout the data lifecycle (i.e., while the data is at rest, in transit, or being processed)?
- Cross-border data transfer: what adequate level of data protection can we provide for the personal data while transferring it across borders to our US servers?
The Potential of Confidential Computing after Schrems II
How can we ensure that this collaboration can take place while meeting the requirements of GDPR and Schrems II? Let’s look at each of these challenges in detail.
First, we need to undertake the necessary technical measures to ensure no accidental or unauthorized access to the shared data. This refers both to the data processing within our company and to processing by third parties. Furthermore, following the EDPB’s recommendations, the supplementary technical measures used, alone or in combination with contractual or organizational measures, must be “state-of-the-art”.
For joint analytics involving multiple parties as in our example here, we would rely on a Cloud-based analytics solution. This also means that the parties within our example need to extend their trust to the provider of this data analytics solution. However, as data breaches are frequent, it is understandable that businesses are reluctant to extend their trust to Cloud-based solutions. Furthermore, since the data used is highly sensitive, the benchmark of the technically enforced protection must be very high.
A data analytics solution using Confidential Computing technology could be the answer to this lack of trust. The CPU stores the cryptographic key, ensuring the integrity of the code that processes the personal data. It keeps information away not only from Cloud or infrastructure providers but also from (compromised) external parties. If malware or unauthorized code tries to access the encryption keys, the CPU denies access and cancels the computation. In this way, sensitive data remains protected within these enclaves for the entire data lifecycle. Considering this, it is not unlikely that Confidential Computing will become a state-of-the-art data-security method in the field of data processing in the foreseeable future.
Another important aspect the actors in our practical example need to consider is how to ensure data privacy. To do so, they must first assess the likely effect of processing personal data on the rights and freedoms of the data subjects and then design the processing in such a way as to prevent or minimize the risk of interference with those rights and freedoms, from the collection of the data through to its disposal.
Specifically for our data analytics example, we need to consider the following key data protection principles:
- Purpose limitation: controllers must collect personal data only for explicit, specified, and legitimate purposes. Processing the data in a way that is incompatible with these purposes is not allowed.
- Proportionality and data minimization: the processing of personal data must be proportionate to the legitimate purpose pursued and reflect, at all stages of the processing, a fair balance between the interests concerned.
- Accountability: controllers must be able to demonstrate the compliance of their data processing with applicable data protection principles and requirements at all times.
Looking at the section above where Confidential Computing technology was briefly explained, we can determine that a data analytics solution relying on this technology fulfils all of these key data protection principles and requirements. It enforces the required principles technically and also enables remote attestation at any given moment. This allows data controllers to demonstrate compliance throughout the entire analytics cycle.
Cross-Border Data Transfer
For cross-border transfers between the EU and the US (or other non-EU countries), only the following options remain:
- Do not use the personal data of EU citizens outside of the EU
- Encrypt all personal data transferred outside the EU
- Fall into an exception to transfer data, stipulated in Article 49 of the GDPR
Article 49 of the GDPR states that data transfer from the EU to third countries can take place even in the absence of appropriate safeguards if the data subject has explicitly consented, or if the transfer is:
- necessary for the performance of a contract between the data subject and the controller
- necessary for important reasons of public interest
- necessary for legal claims
- necessary to protect the vital interests of the data subject or other persons.
But as such exceptions are not the norm, the realistic option that remains is to encrypt personal data that leaves the EU. Consequently, no government or other organization can tap into the data through surveillance or by demanding encryption keys.
In line with Articles 25 and 32 of the GDPR, the EDPB requires that supplementary technical measures be state-of-the-art. Encrypting the data before transferring it is considered one of the most important of these technical measures. Here, the EDPB also states that the encryption keys must stay within the European Economic Area (EEA).
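The “encrypt before transfer, keep the key in the EEA” approach can be illustrated as follows. This is a toy sketch: the hash-based stream cipher stands in for proper authenticated encryption (e.g., AES-GCM), and the function names are hypothetical.

```python
import hashlib
import secrets

def _stream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy keystream; a real deployment would use AES-GCM instead."""
    out, i = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + i.to_bytes(8, "big")).digest()
        i += 1
    return out[:n]

# The key is generated and held by the EEA-based controller only.
eea_key = secrets.token_bytes(32)

def export_dataset(plaintext: bytes) -> bytes:
    """Encrypt in the EEA before transfer; only ciphertext crosses the border."""
    nonce = secrets.token_bytes(16)
    ct = bytes(p ^ k for p, k in
               zip(plaintext, _stream(eea_key, nonce, len(plaintext))))
    return nonce + ct

def decrypt_in_eea(blob: bytes) -> bytes:
    """Decryption requires eea_key, which never leaves the EEA."""
    nonce, ct = blob[:16], blob[16:]
    return bytes(c ^ k for c, k in
                 zip(ct, _stream(eea_key, nonce, len(ct))))

shipped = export_dataset(b"patient cohort statistics")
# A recipient (or authority) outside the EEA holds only `shipped`;
# without eea_key the contents cannot be recovered.
```

The design point is the key boundary: as long as `eea_key` is generated and used only within the EEA, a demand for data served outside the EEA can only yield ciphertext.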
The legal uncertainty surrounding the conditions for cross-border transfers of personal data can often be the reason why data collaborations between European and non-European countries are not taking place. The healthcare providers in our example might see an issue in the participation of a global pharmaceutical company, based in the US.
However, the concern about the legally enforced powers of government agencies to access data may be addressed by a data analytics solution using Confidential Computing technology. As described above, no one has access to the source data inside the secure enclave. Healthcare providers and the global pharmaceutical company cannot see the sensitive data shared, and neither can any other passively involved parties such as the solution or infrastructure providers. If a government agency demands access to the healthcare providers’ data, none of the parties would be able to comply, as they do not have access to the data in the first place. The same goes for the decryption keys, since the CPU holds these without any outside party being able to disclose them.
Consequently, it is safe to say that Confidential Computing fulfils the data protection requirements outlined in the Schrems II ruling. It paves the way for new joint data processing across the globe, now also with actors based in countries that do not provide adequate data protection in the eyes of the ECJ.
With the technological leap made in recent years, Cloud-based solutions can become more appealing, as the lack of trust in infrastructure providers diminishes significantly. Data analytics solutions powered by a Cloud landscape are gaining more and more prominence. However, the risk of outsourcing the processing of highly sensitive data grows with the rising number of cyberattacks. Moreover, legislators across the globe impose demanding compliance and regulatory frameworks on data controllers.
In this respect, Confidential Computing offers new possibilities, especially within the area of Cloud computing. Specifically, during data transfer to third countries, this technology can ensure that outside parties cannot access the data, even in a Cloud environment. This meets the fundamental requirement of the ECJ in the Schrems II decision. Furthermore, based on the remote attestation feature, data controllers can demonstrate the compliance of their personal data processing.
Following the hardware-based encryption approach of this technology, it can be argued that Confidential Computing incorporates current state-of-the-art measures to ensure data security and privacy.
Thus, taking Confidential Computing into account for future data protection purposes within any organization is worthwhile.