The PCI Council today released the long-awaited tokenization guidelines (Information Supplement: PCI Tokenization Guidelines), which give merchants specific guidance on using tokenization to reduce PCI scope and assessment costs. Those costs have been running large merchants upwards of $500,000 per year. While the guidance is strong, it leaves one area of ambiguity around controls for so-called high-value tokens, which we will examine in some depth later in this blog. First, let’s dig into the guidance itself:
Out of Scope Considerations
The main thrust of the guidance defines a scoping principle and then out-of-scope considerations. The scoping principles state that all components of the tokenization system are considered part of the cardholder data environment (CDE). They refine this further to include the token generation and de-tokenization components, as well as any system that has access to the tokenization system or is part of the CDE.
I thought it would be helpful to analyze the guidelines in terms of a real-world use case: credit card information stored in a data-warehouse application used for post-payment customer loyalty tracking. Let’s assume that our fictional merchant has decided to receive new entries as format-preserving tokens using Intel(R) Expressway Tokenization Broker (ETB). In this example, the gateway is the tokenization engine and is the only entity with access to a secure token vault. We can assume that the data-warehouse application has no need for live PANs, and that if PAN data is required for charge-backs or refunds, it is retrieved by a different application (which would be in full PCI scope).
For specific out-of-scope considerations, the document states the following requirements, each of which is examined against our example after the list:
- Recovery of the PAN value associated with a token must not be computationally feasible through knowledge of only the token, multiple tokens, or other token-to-PAN combinations.
- PAN cannot be retrieved even if the token and the systems it resides on are compromised
- System components are segmented (isolated) from any application, system, process, or user with: (i) The ability to submit a de-tokenization request for that token and retrieve the PAN, (ii) Access to the tokenization system, data vault, or cryptographic keys for that token, (iii) Access to token input data or other information that can be used to de-tokenize or derive the PAN value from the token
- System components are not connected to the tokenization system or processes, including the data vault, or cryptographic key storage
- System components do not store, process, or transmit cardholder data or sensitive authentication data through any other channel
- System components that previously stored, processed, or transmitted cardholder data prior to implementation of the tokenization solution have been examined to ensure that all traces of cardholder data have been securely deleted.
ETB generates format-preserving tokens based on true random numbers or pseudo-random numbers with a protected seed. This means that there is no mathematical relation between the token and the PAN, so the first requirement appears to be met. It is interesting to note that the guidance would technically allow a token generated from the PAN through an encryption or hash function to count, but the burden would be to show that this transformation is computationally infeasible to reverse.
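To make the "no mathematical relation" point concrete, here is a minimal sketch of random, format-preserving token generation. It is not ETB's implementation or API; the `TokenVault` class and its method names are hypothetical, and the in-memory dictionaries stand in for a vault that a real product would encrypt and isolate.

```python
import secrets


class TokenVault:
    """Toy in-memory token vault (hypothetical; not ETB's API).

    A real vault would encrypt stored PANs and live on a segmented
    system. Plain dictionaries are used here only for illustration.
    """

    def __init__(self):
        self._token_to_pan = {}
        self._pan_to_token = {}

    def tokenize(self, pan: str) -> str:
        """Return a random all-digit token with the same length as the PAN."""
        if not pan.isdigit():
            raise ValueError("PAN must contain only digits")
        # Reuse the existing token so one PAN maps to one token.
        if pan in self._pan_to_token:
            return self._pan_to_token[pan]
        while True:
            # Every digit comes from the OS CSPRNG; the PAN is never an
            # input to the generator, so nothing about it can be derived.
            token = "".join(secrets.choice("0123456789") for _ in range(len(pan)))
            # Retry on the rare collision with the PAN or an existing token.
            if token != pan and token not in self._token_to_pan:
                break
        self._token_to_pan[token] = pan
        self._pan_to_token[pan] = token
        return token

    def detokenize(self, token: str) -> str:
        """Look up the PAN; only the vault's mapping can answer this."""
        return self._token_to_pan[token]
```

Because the token is a lookup key rather than a function of the PAN, the only way back to the PAN is through the vault's mapping. A token derived by encryption or hashing would instead have to be argued safe on cryptographic grounds.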
Expressway Tokenization Broker (ETB) stores PAN values encrypted in the secure token vault using AES-256. Even if the vault were compromised, the attacker would still have to execute a brute-force attack against AES-256. The same goes for the data-warehouse application, as only tokens would be available there. Gaining access to only the token list would provide no value either, as there is no mathematical relation between the tokens and the PANs. Note again that this may not be the case when the token is generated using an encryption or hash function.
ETB supports both proper segmentation and access control. In our example, the broker would be sending tokens to the data-warehouse application rather than live PANs and would be segmented from it by at least one network hop. Further, a specific security policy on the broker would deny de-tokenization access to the data-warehouse application. The broker itself is also segmented from the secure vault, and only the broker can access the vault, using 2-way protected TLS communication. In addition, an external identity store such as LDAP, Active Directory, Siteminder, Oracle Access Manager, IBM Tivoli Access Manager and others can be used to authenticate de-tokenization requests. A requesting application can approach the broker through an authenticated REST or SOAP API call to retrieve a token or provide an entire document or message to be tokenized.
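The access-control side of this can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Broker` class, the caller identifiers, and the allowlist are hypothetical and do not reflect ETB's actual policy format. The caller is also assumed to have been authenticated already, for example against an external identity store.

```python
class DetokenizationDenied(Exception):
    """Raised when a caller is not permitted to de-tokenize."""


class Broker:
    """Toy broker enforcing a de-tokenization allowlist (hypothetical)."""

    def __init__(self, vault, detokenize_allowed):
        # vault: mapping of token -> PAN, standing in for the secure vault.
        self._vault = dict(vault)
        # Identities allowed to request de-tokenization.
        self._allowed = frozenset(detokenize_allowed)

    def detokenize(self, caller_id, token):
        # caller_id is assumed to be an already-authenticated identity.
        if caller_id not in self._allowed:
            raise DetokenizationDenied(f"{caller_id} may not de-tokenize")
        return self._vault[token]


# Example policy: only the refund service may recover PANs.
broker = Broker(
    vault={"8305471926650341": "4111111111111111"},
    detokenize_allowed={"refund-service"},
)
```

With this policy, a call such as `broker.detokenize("data-warehouse", token)` raises `DetokenizationDenied`, while the refund application, which stays in full PCI scope, can still recover the PAN.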
Here, the system component (the data-warehouse application) would only be able to receive tokens from ETB. It would not be connected to the tokenization or de-tokenization process and would not have access to the vault or cryptographic key storage. It is an open question whether this ability to receive only tokens will be classified as connectedness.
This item would be met by our example as the data-warehouse application would only be receiving format-preserving tokens.
This is more of a process issue. It makes sense that before our data-warehouse application can be taken out of scope, the existing PAN data must be thoroughly scrubbed out.
High-Value Tokens
Section four of the PCI Tokenization Guidelines poses an interesting question regarding high-value tokens. Just what is a high-value token? Here, the guidelines are referring to tokens that can be used to generate transactions. While they are not themselves PAN values, they can be used as a true surrogate for the PAN. The intuitive example given is the use of a room number to pay for a meal at a hotel restaurant. In this example, the token (the room number) along with a last name (ostensibly a secret) generates a transaction against a credit card. Under this view, the PCI Council seems to be suggesting that a list of compromised last names and room numbers falls under PCI DSS! In other words, if tokens begin to look like and act like PANs, they bear the same compliance burden as PANs. This seems like overreach, and more clarity is needed.
Fortunately, the guidelines give an additional caveat in the key phrase “additional controls”, and this is where more guidance is needed. In other words, the loss or compromise of a token doesn’t expose anything, as long as the proper controls are in place. What could these controls look like? It seems reasonable that tokens that work just like PANs (let’s call them uncontrolled high-value tokens) definitely contribute to the risk of a breach. Here is a stab at some types of controls for high-value tokens that generate transactions.
High-Value Token Controls
- Single-Use: Should be single-use, where a PAN maps to a different token each time. This may reduce the risk of an attacker collecting tokens for a concentrated attack in the future
- Short Lifetime: Should be given a constrained lifetime for additional protection
- Authentication: Should only generate a transaction in connection with additional protected authentication data that defines the entity attempting to make the transaction (person or system)
- Authorization: Is only authorized for a specific transaction and not others.
- Unpredictability: Valid tokens should not be predictable. In other words, an attacker should not be able to anticipate valid token values
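
As a rough illustration of how the five controls above could fit together, here is a minimal sketch. It is not taken from the PCI guidelines or from any product; the `HighValueTokenIssuer` class, its parameters, and the idea of binding a token to a transaction ID and a shared authentication secret are all assumptions made for the example.

```python
import hmac
import secrets
import time


class HighValueTokenIssuer:
    """Toy issuer applying the five controls (hypothetical design)."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        # token -> (pan, transaction_id, auth_secret, expires_at)
        self._live = {}

    def issue(self, pan, transaction_id, auth_secret):
        # Unpredictability: 128 bits from the OS CSPRNG.
        token = secrets.token_urlsafe(16)
        # Short lifetime: record an expiry time with the token.
        expires_at = self._clock() + self._ttl
        self._live[token] = (pan, transaction_id, auth_secret, expires_at)
        return token

    def redeem(self, token, transaction_id, auth_secret):
        """Return the PAN if every control passes, otherwise None."""
        # Single-use: the token is removed on the first attempt, so even
        # a failed attempt burns it (a deliberate choice in this sketch).
        record = self._live.pop(token, None)
        if record is None:
            return None
        pan, bound_txn, bound_secret, expires_at = record
        if self._clock() >= expires_at:
            return None
        # Authorization: valid only for the transaction it was issued for.
        if bound_txn != transaction_id:
            return None
        # Authentication: constant-time comparison of the caller's secret.
        if not hmac.compare_digest(bound_secret.encode(), auth_secret.encode()):
            return None
        return pan
```

A stolen token on its own is of little use here: it expires quickly, works once, and only in combination with the right transaction and authentication data. Whether controls like these would satisfy an assessor is exactly the open question raised above.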