CVS Health Records for 1.1 Billion Customers Exposed
A vendor exposed the records, which were accessible with no password or other authentication, likely because of a cloud-storage misconfiguration.
More than 1 billion records for CVS Health customers were left in the database of a third-party, unnamed vendor – exposed, unprotected, online. Researchers said the data points revealed could be strung together to create an extremely personal snapshot of someone’s medical situation.
The glitch is likely due to human error, security researcher Jeremiah Fowler said in a post on WebsitePlanet on Thursday. In other words, it’s probably yet another instance of the rampant misconfiguration that’s plaguing cloud-based storage, leading to the exposure of sensitive data on an internal network.
According to Fowler’s post, researchers at WebsitePlanet – a portal for web developers and internet marketers – found the database on March 21. It had no password protection or any other form of authentication in place to prevent unauthorized entry. The researchers coordinated with Fowler to document their discovery and contacted CVS Health; that same day, the naked database was closed off from public view.
CVS Health is the parent company behind multiple household brands, including the CVS Pharmacy retail pharmacy chain; CVS Caremark, a pharmacy benefits manager; and Aetna, a health insurance provider.
A CVS spokesperson confirmed the researchers’ findings, saying that CVS Health had been notified of the exposure of a publicly accessible database that contained non-identifiable CVS Health metadata. Upon investigation, they determined that the database was hosted by a third-party vendor, whose name the company didn’t disclose. The database didn’t contain any personally identifiable information (PII) of customers, members or patients, the company said in a statement, and the database was quickly taken down.
“As the researcher’s report indicates, there was no risk to customers, members or patients, and we worked with the vendor to quickly take the database down. We’ve addressed the issue with the vendor to prevent a recurrence and we thank the researcher who notified us about this matter,” CVS Health said in its statement.
What Was in That CVS Cache of Data?
Fowler said in his post that there was in fact enough information to derive customers’ PII, including their email addresses. The total size of the database was 204 GB, according to the researchers. It held 1.1 billion records, or, to be precise, 1,148,327,940. They were labeled “production” and included information typed into search bars. The data types listed were “add to cart,” “configuration,” “dashboard,” “index-pattern,” “more refinements,” “order,” “remove from cart,” “search” and “server.”
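For a sense of scale, the two reported figures can be divided to estimate the average record size. This is a back-of-the-envelope sketch only: it assumes “GB” means decimal gigabytes (10^9 bytes) and that the 204 GB figure covers nothing but these records, neither of which the advisory confirms.

```python
# Figures as reported by the researchers.
record_count = 1_148_327_940

# Assumption: 204 GB is decimal gigabytes (10**9 bytes), not GiB.
total_bytes = 204 * 10**9

avg_bytes_per_record = total_bytes / record_count
print(f"{avg_bytes_per_record:.0f} bytes per record")  # prints "178 bytes per record"
```

An average in that range would be consistent with short log-style events rather than full customer files, though the estimate says nothing about what any individual record held.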
The records also exposed fields called Visitor ID, Session ID and device information, such as whether customers were using an iPhone, an Android, an iPad or a desktop PC. The team noted that by stringing together the data, they could reveal emails that could be targeted in a phishing attack, in social engineering, or “potentially used to cross-reference other actions.”
The files also gave a “clear understanding of configuration settings, where data is stored and a blueprint of how the logging service operates from the backend,” according to the advisory.
In looking for PII, the researchers ran several search queries for common email domains, such as Gmail, Hotmail and Yahoo, they said. Each query returned results from the dataset, indicating that the records did in fact contain email addresses. Fowler said that, given how many personal email addresses are formatted using portions or all of the user’s name, he was able to identify “a small sampling of individuals by simply searching Google for the publicly exposed email address.”
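The kind of check the researchers describe can be approximated with a simple pattern scan. The sketch below is illustrative only and is not the researchers’ actual tooling: the domain list, the `find_emails` helper and the sample records are all hypothetical, and the regular expression is a deliberately loose match for email-like strings.

```python
import re

# Hypothetical list of consumer email domains to look for.
COMMON_DOMAINS = ("gmail.com", "hotmail.com", "yahoo.com")

# Loose pattern for email-like strings; group 1 captures the domain.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")

def find_emails(records, domains=COMMON_DOMAINS):
    """Return {domain: sorted addresses} for addresses found in free-text records."""
    hits = {d: set() for d in domains}
    for text in records:
        for match in EMAIL_RE.finditer(text):
            domain = match.group(1).lower()
            if domain in hits:
                hits[domain].add(match.group(0).lower())
    return {d: sorted(found) for d, found in hits.items()}

# Synthetic stand-ins for logged search-bar input.
sample_records = [
    "jane.doe@gmail.com",
    "flu shot near me",
    "J.Smith@Yahoo.com refill",
]

results = find_emails(sample_records)
print(results)
```

A site operator could run the same sort of scan over its own search logs to find out whether users are typing email addresses into fields that were never meant to collect them.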
The Visitor ID and Session ID fields appeared alongside the items that visitors searched for, including medications, COVID-19 vaccines and other CVS products. All of this data, strung together, could have created a snapshot of private details about individuals’ health, Fowler said.
“Hypothetically, it could have been possible to match the Session ID with what they searched for or added to the shopping cart during that session and then try to identify the customer using the exposed emails,” he said in the advisory.
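The correlation Fowler describes amounts to grouping events by session and checking which sessions contain both an email address and other activity. The following is a minimal sketch on synthetic data, intended to show why a session identifier can turn “non-identifiable” log entries into identifying ones. The field names (`session_id`, `type`, `query`) and both helper functions are assumptions; the advisory does not publish the real schema.

```python
from collections import defaultdict

# Synthetic events; field names are assumed, not taken from the exposed data.
events = [
    {"session_id": "s1", "type": "search", "query": "allergy medication"},
    {"session_id": "s1", "type": "add to cart", "query": "antihistamine 24ct"},
    {"session_id": "s1", "type": "search", "query": "pat.example@gmail.com"},
    {"session_id": "s2", "type": "search", "query": "vaccine appointment"},
]

def group_by_session(events):
    """Split each session's events into email-like input and other activity."""
    sessions = defaultdict(lambda: {"emails": [], "activity": []})
    for event in events:
        bucket = sessions[event["session_id"]]
        if "@" in event["query"]:  # crude email check, enough for a sketch
            bucket["emails"].append(event["query"])
        else:
            bucket["activity"].append((event["type"], event["query"]))
    return dict(sessions)

def linkable_sessions(sessions):
    """Keep only sessions where an email and other activity appear together."""
    return {sid: s for sid, s in sessions.items() if s["emails"] and s["activity"]}

linked = linkable_sessions(group_by_session(events))
for sid, session in linked.items():
    print(sid, session["emails"], session["activity"])
```

In this example only session `s1` is linkable, because it is the only one where an address was typed into the search bar. That matches the hypothetical nature of the risk as Fowler frames it: the link exists only where a visitor supplied the identifying piece themselves.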