Consumer Protection Tuesday: How Coinbase Safeguards PII Using MPC Encryption

By Coinbase · 5 min read

At Coinbase, protecting customer data is not just a priority; it’s a responsibility. As the world’s leading cryptocurrency platform, we handle sensitive Personally Identifiable Information (PII) such as Social Security Numbers (SSNs), names, and addresses. To keep this data private, we draw on our expertise in Multi-Party Computation (MPC), which also powers our secure signing of blockchain transactions and is showcased in our open-source MPC cryptography library.

Why Protecting PII Matters

PII is the backbone of identity in the digital age. It includes data that can uniquely identify an individual, such as SSNs, names, and addresses. If compromised, PII can lead to identity theft, fraud, and other malicious activities. At Coinbase, we understand the importance of safeguarding this information, not just to comply with regulations but to uphold the trust our customers place in us.

How Coinbase Uses MPC to Protect PII

At Coinbase, safeguarding PII requires leveraging advanced cryptographic techniques. Our CoreKMS Encryption Service uses MPC to ensure secure, deterministic, and authenticated encryption of sensitive data. 

Key Features of Coinbase’s Encryption Algorithm

  1. Distributed: Encryption keys are ephemeral. They are derived on demand using MPC and held in memory only temporarily for caching, so no single party ever has access to the complete master key or its derivation process.

  2. Deterministic: The algorithm always produces the same ciphertext for identical plaintext and encryption parameters. This enables secure querying and indexing of encrypted data.

  3. Authenticated: By using AES-GCM-SIV, the encryption provides both confidentiality and integrity, so any tampering with the ciphertext is detected on decryption.


How It Works

Step 1: Generating Master Keys with MPC

Our encryption system starts with the generation of MPC keys, referred to as master keys, using a cryptographic method called Distributed Key Generation (DKG). DKG is a collaborative process in which multiple independent parties jointly generate a cryptographic key without any single party ever knowing the complete key. The key therefore exists only as distributed shares, which makes it highly resistant to compromise.
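To make the idea of jointly generating a key more concrete, here is a minimal toy sketch in Python. It is not Coinbase's DKG protocol: the group parameters (`P`, `G`), the three-party setup, and all function names are assumptions chosen for illustration, and a real DKG adds verifiable secret sharing and protections against misbehaving parties. The sketch only shows that each party can keep its own share private while the group still agrees on a joint public value.

```python
import secrets

# Toy parameters for illustration only (assumptions, not production values).
P = 2**127 - 1   # a Mersenne prime; all arithmetic is modulo P
ORDER = P - 1    # exponents are taken modulo P - 1
G = 3            # fixed base element, chosen arbitrarily for this sketch


def generate_share() -> int:
    """Each party samples its own secret share locally and never sends it."""
    return secrets.randbelow(ORDER - 1) + 1


def commitment(share: int) -> int:
    """Public value a party publishes; it does not reveal the share itself."""
    return pow(G, share, P)


def joint_public_value(commitments: list[int]) -> int:
    """Multiply the public commitments; the result equals G**(sum of shares) mod P."""
    result = 1
    for value in commitments:
        result = (result * value) % P
    return result


# Three hypothetical parties each generate a share and publish a commitment.
shares = [generate_share() for _ in range(3)]
commitments = [commitment(s) for s in shares]
public_value = joint_public_value(commitments)
```

In this toy model the master key is implicitly the sum of the shares, but no function ever computes that sum: only the public commitments are combined.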

Step 2: Deriving Symmetric Encryption Keys

To perform deterministic encryption, symmetric encryption keys are derived from the master keys using MPC derivation. The derivation combines the master key with a secret value called a tweak. The tweak is provided by the user and acts as an additional input, so the derived encryption key is specific to the use case, data, or context being encrypted. The derivation process can be summarized as:

MasterKey + Tweak = Unique Encryption Key

Using the same master key and tweak will always produce the same derived encryption key.
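One way to picture a derivation in which the master key is never assembled is a Diffie-Hellman-style distributed pseudorandom function: each party raises a hash of the tweak to the power of its own share, and only those partial results are combined. The sketch below is an assumption-laden toy, not the CoreKMS protocol. The group parameter `P`, the helper names, and the tweak strings `ssn-field` and `address-field` are all hypothetical.

```python
import hashlib
import secrets

# Toy parameters for illustration only (assumptions, not production values).
P = 2**127 - 1   # a Mersenne prime; all arithmetic is modulo P
ORDER = P - 1    # exponents are taken modulo P - 1


def hash_to_group(tweak: bytes) -> int:
    """Map the tweak to a nonzero element modulo P (toy mapping)."""
    digest = int.from_bytes(hashlib.sha256(tweak).digest(), "big")
    return digest % (P - 2) + 2


def partial_evaluation(share: int, tweak: bytes) -> int:
    """Computed by one party using only its own share of the master key."""
    return pow(hash_to_group(tweak), share, P)


def derive_key(partials: list[int]) -> bytes:
    """Combine partial results into a 32-byte symmetric key."""
    combined = 1
    for value in partials:
        combined = (combined * value) % P
    return hashlib.sha256(combined.to_bytes(16, "big")).digest()


# Three hypothetical parties, each holding one share.
shares = [secrets.randbelow(ORDER) for _ in range(3)]

key_a = derive_key([partial_evaluation(s, b"ssn-field") for s in shares])
key_b = derive_key([partial_evaluation(s, b"ssn-field") for s in shares])
key_c = derive_key([partial_evaluation(s, b"address-field") for s in shares])
```

Here `key_a` and `key_b` are identical because the shares and tweak are the same, while `key_c` differs because the tweak changed. The shares themselves are only ever used inside `partial_evaluation`.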

Step 3: Encrypting Data with AES-GCM-SIV

Once the symmetric encryption key is derived, it is used with the Advanced Encryption Standard (AES) in Galois/Counter Mode with Synthetic Initialization Vector (AES-GCM-SIV). This mode provides both deterministic encryption and authentication, so the data is confidential and any tampering is detectable. Key features include:

  • Deterministic Encryption: Guarantees that the same plaintext combined with the same tweak always produces the same ciphertext. This is critical for use cases like searching or indexing encrypted data.

  • Authenticated Encryption: Ensures that any unauthorized modification of the ciphertext can be detected.

  • Ephemeral Keys: Encryption keys are derived on demand using MPC and are never persisted; they are held only in a temporary in-memory cache, reducing the risk of compromise.

  • Nonce Misuse-Resistant: Unlike AES-GCM, AES-GCM-SIV retains its security guarantees when a nonce is repeated, revealing only whether two plaintexts are identical. This is what makes deterministic encryption safe.

  • Field-Level Encryption: Specific fields, such as SSNs, are encrypted individually, ensuring granular protection.
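The synthetic-IV idea behind these properties can be sketched with the Python standard library alone. The code below is explicitly not AES-GCM-SIV: it swaps in a toy SIV-style construction built from HMAC-SHA256, purely to show how a tag computed over the plaintext can double as the IV, giving deterministic and authenticated output. All function names, the sample key, and the placeholder field value are assumptions, and a real system should use a vetted AES-GCM-SIV implementation.

```python
import hashlib
import hmac

TAG_LEN = 16  # bytes of the synthetic IV / authentication tag


def _subkeys(key: bytes) -> tuple[bytes, bytes]:
    """Derive separate MAC and encryption subkeys from the field key."""
    mac_key = hmac.new(key, b"mac", hashlib.sha256).digest()
    enc_key = hmac.new(key, b"enc", hashlib.sha256).digest()
    return mac_key, enc_key


def _tag(mac_key: bytes, aad: bytes, plaintext: bytes) -> bytes:
    """Synthetic IV: a MAC over the length-prefixed associated data and the plaintext."""
    message = len(aad).to_bytes(8, "big") + aad + plaintext
    return hmac.new(mac_key, message, hashlib.sha256).digest()[:TAG_LEN]


def _keystream(enc_key: bytes, tag: bytes, length: int) -> bytes:
    """Expand the tag into a keystream using HMAC in counter mode."""
    out = b""
    counter = 0
    while len(out) < length:
        block = tag + counter.to_bytes(4, "big")
        out += hmac.new(enc_key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, plaintext: bytes, aad: bytes = b"") -> bytes:
    """Same key, plaintext, and aad always give the same ciphertext."""
    mac_key, enc_key = _subkeys(key)
    tag = _tag(mac_key, aad, plaintext)
    stream = _keystream(enc_key, tag, len(plaintext))
    return tag + bytes(a ^ b for a, b in zip(plaintext, stream))


def decrypt(key: bytes, ciphertext: bytes, aad: bytes = b"") -> bytes:
    """Recover the plaintext, then reject it if the tag does not verify."""
    mac_key, enc_key = _subkeys(key)
    tag, body = ciphertext[:TAG_LEN], ciphertext[TAG_LEN:]
    stream = _keystream(enc_key, tag, len(body))
    plaintext = bytes(a ^ b for a, b in zip(body, stream))
    if not hmac.compare_digest(tag, _tag(mac_key, aad, plaintext)):
        raise ValueError("authentication failed")
    return plaintext


field_key = bytes(range(32))  # placeholder key for the example only
ciphertext = encrypt(field_key, b"000-00-0000", b"ssn")
```

Encrypting the same placeholder value twice yields the same bytes, and flipping any bit of `ciphertext` causes `decrypt` to raise `ValueError`.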

Real-World Applications: NY State Data-Match Reporting using Snowflake

One of the first use cases for Coinbase’s encryption service was NY State Data-Match Reporting, which required integration with Snowflake to securely handle sensitive user data.

The Process

Coinbase uses an internal ETL (Extract, Transform, Load) process to periodically transfer user data, including SSNs, to Snowflake for secure storage and analysis. As part of this process, SSNs are encrypted using the CoreKMS Encryption Service before being transferred. Because the encryption is deterministic, identical plaintext SSNs always produce the same ciphertext when encrypted with the same parameters.

On the other side, Coinbase receives lists from the NY Tax Authorities containing requests for information based on SSNs. Fulfilling these requests means querying Snowflake for matching records. However, since the SSN field is stored in encrypted form in Snowflake, the plaintext SSNs on the list cannot be matched against it directly.

The Solution

To address this, Coinbase implemented a Snowflake API integration that User-Defined Functions (UDFs) call to encrypt the SSNs. Here’s how it works:

  1. The UDF in Snowflake calls Coinbase’s encryption service API to encrypt the SSN from the list.

  2. The encrypted SSN is then used to query Snowflake for matching records.

  3. Deterministic encryption ensures that the ciphertext generated by the UDF matches the ciphertext stored in Snowflake, enabling accurate data retrieval.

This approach ensures that sensitive data remains encrypted throughout the process, maintaining privacy and security while fulfilling regulatory requirements.
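The matching flow described above can be sketched in a few lines of Python. This is a simplified in-memory model, not the Snowflake UDF or the CoreKMS API: `encrypt_ssn` is a hypothetical stand-in that returns a deterministic HMAC-SHA256 token rather than a real, reversible ciphertext, the warehouse is a plain dictionary, and the user IDs and SSN values are made-up placeholders.

```python
import hashlib
import hmac


def encrypt_ssn(key: bytes, ssn: str) -> str:
    """Hypothetical stand-in for the encryption-service call.

    Returns a deterministic token: same key and SSN, same output.
    """
    return hmac.new(key, ssn.encode("utf-8"), hashlib.sha256).hexdigest()


def build_warehouse(key: bytes, users: dict[str, str]) -> dict[str, str]:
    """ETL step: store only the encrypted SSN, mapped to the user id."""
    return {encrypt_ssn(key, ssn): user_id for user_id, ssn in users.items()}


def match_requests(key: bytes, warehouse: dict[str, str],
                   requested_ssns: list[str]) -> list[str]:
    """Encrypt each requested SSN and look up the matching stored record."""
    matches = []
    for ssn in requested_ssns:
        token = encrypt_ssn(key, ssn)
        if token in warehouse:
            matches.append(warehouse[token])
    return matches


key = bytes(range(32))  # placeholder key for the example only
warehouse = build_warehouse(key, {"user-1": "000-00-0001", "user-2": "000-00-0002"})
matched = match_requests(key, warehouse, ["000-00-0002", "000-00-0009"])
```

In this model `matched` contains only `user-2`, and the warehouse never holds a plaintext SSN. The lookup works solely because the encryption step is deterministic.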
Read more about this on the Snowflake blog.

Why MPC Encryption is a Game-Changer

By leveraging MPC, Coinbase raises the bar for data security in the crypto industry. Here’s why:

  • Security Preservation: The cryptographic shares of the master key are never combined or exposed, reducing the risk of compromise.

  • Scalability: The derived encryption keys can be tailored to specific use cases and arbitrary data types without requiring new master keys. 

  • Compliance: Deterministic encryption ensures that sensitive data can be securely queried and indexed, meeting regulatory requirements while reducing risks to users.


Coinbase’s Commitment to Consumer Protection

At Coinbase, we believe that innovation and security go hand in hand. By integrating MPC encryption into our CoreKMS service, we’re not just protecting PII, we’re reinforcing our commitment to consumer protection. This is part of our broader mission to create a safer and more resilient crypto ecosystem.

Looking Ahead: Join Us in Shaping the Future of Data Security

We’re constantly innovating to push the boundaries of cryptographic security. Our expertise in MPC encryption is not only protecting sensitive PII today but is paving the way for future use cases, such as data-shredding for secure deletion and lifecycle management of sensitive information. We invite developers, researchers, and security enthusiasts to explore our open-source MPC cryptography library and contribute to advancing the crypto ecosystem. If you’re passionate about building cutting-edge security solutions, consider joining our team to help shape the future of privacy and blockchain security in the digital age.

