Check out my conversation with Aaron Rinehart on chaos security engineering…
Some common threads...
News broke this morning of a long-running North Korean campaign targeting cybersecurity researchers on Twitter, roughly playing nice and engaging until they could build enough trust to pass on files containing malicious code.
There is often lateral (sideways) movement in cybersecurity attacks - you might want to get into one computer to get leverage on another. In this example, as in SolarWinds, it looks like the right place to start breaking down cybersecurity is the cybersecurity community itself...
https://blog.google/threat-analysis-group/new-campaign-targeting-security-researchers/amp/
A video worth rebroadcasting...
by Alexander Mueller
There is much that is topical in this story: the ups and downs of entrepreneurship viewed from inside and outside a company, as well as some of the big issues of the moment - the economic impact of Covid-19, geopolitical tensions with China, and just what sort of media the current technological climate can support.
On the surface level this fellow is asking for money. I will say I have watched his channel quite a bit, and I was initially attracted to it exactly because of the mission he describes - trying to broadcast people's stories to a broader audience that might not otherwise get to hear them.
November 2020 in Data Privacy: Schrems Guidance
For us here at Capnion, one of the most notable stories to track this past November was the ongoing regulatory situation in the European Union following the Schrems ruling. Quick review for the uninitiated: earlier this year, the European Union's highest court ruled (in response to a suit filed by Maximilian Schrems) to invalidate a legal framework called Privacy Shield that Facebook and other companies had been using to determine adequate safeguards around wholesale movement of consumer personal data out of the EU. (The specific objection was that Privacy Shield did not do enough to protect EU citizens from surveillance by the United States government.) The present situation is something of a lingering, awkward limbo for companies that depended on Privacy Shield, as there is not yet any clear successor.
There is some recent guidance from the European Data Protection Board this past month, though, on how to approach the issue and it is very favorable to Capnion and Ghost PII’s mode of approaching these problems. (The document itself is available here.) In several places this guidance draws a distinction between standards for tasks which require data to be in the clear (not encrypted or anonymized) vs. tasks that do not. Naturally, one has considerably more and better options for exporting personal data out of the EU if that data can stay encrypted the entire time.
This sort of thing is of course exactly what Capnion was founded to help you with: letting you accomplish more mission-critical tasks without need of data in the clear!
Hyperconvergence and the "Why?" of the Cloud
The rapid success of the cloud has created a sometimes confusing jungle of wrinkles around just how larger businesses should approach this trend... Is your organization in a position to go whole hog into the cloud? Or do you still need to keep some systems, but not all, on-prem, leading you to a "hybrid" cloud? Just why did you want to be in the cloud in the first place, and in all this jargon, what are the buzzwords that actually refer to the practices that will concretely benefit your business? "Hyperconvergence" is a big, intimidating term that could probably use a bit of demystification anyway, and unpacking it provides a great window into the actually quite different reasons that different organizations might be interested in cloud computing.
The DoD JEDI cloud computing contract was an enlightening case study for me (and will likely continue to be). The military has some very conventional reasons it might want to stay on-prem, including unique security concerns and the scale necessary to do a good job. The JEDI contract, for those that enjoy argument, was almost a contract to help DoD build on-prem and not in the "cloud" at all. Why the interest in the cloud? The answer I got asking around is in part that the military is interested in adopting other technologies like machine learning consistently across a large and sprawling organization. It is the software layer that comes along with the cloud that really enables this, and not any issue of where the data is hosted and who owns the physical server.
If there is something "hyper" in hyperconverged infrastructure that didn't get the name for marketing reasons, it is the hypervisor that creates and runs the virtual machines that then host the databases, servers, etc. more familiar to most engineers. Moving this task to a software layer, and not a hardware one, is a big part of what makes the cloud especially scalable, cheap, and agile. Through this lens, you might say that cloud providers were important innovators in hyperconverged infrastructure, and what we called the cloud in the past was outsourcing of this new infrastructure. You might ask, though, if what you really wanted was to be on-prem but using the new software, and this is implicitly the sort of decision that the DoD made with the JEDI contract.
It seems like many large organizations drawn by the hype into hybrid cloud situations are having this kind of revelation. If you adopt the technologies used by the cloud providers for your own data center, particularly those around virtualization and the emphasis on the software layer, you might find that this is a better way to get at the value you thought "the cloud" was supposed to provide. It also makes your hybrid cloud environment much more workable when people inevitably need to collaborate across the true-cloud and on-prem components.
Hygiene for your favorite chat app...
Do you use a chat app like WhatsApp, Signal, or Telegram? You might benefit from a few moments focused on how the security features in these apps work and how various settings impact your privacy and security posture.
Many common apps, including WhatsApp, Signal, and Telegram, have a feature called end-to-end encryption. This feature has a reputation for being "unhackable," and whether this is true or false, it may have lulled even some powerful and sophisticated people into a false sense of security.
The idea of end-to-end encryption is that your chats are encrypted on your phone before they leave and they stay encrypted until they get to their intended destination. Hopefully, this encryption keeps anyone who might be snooping in the middle from learning about your conversation - in particular, the server that drives the app only needs to handle this encrypted data and lacks the power to snoop on you. Such a system, however powerful, eliminates just one of the many ways an interloper might try to spy on you.
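To make the idea concrete, here is a toy sketch of what "the server only handles ciphertext" means. This is a one-time-pad XOR, not the actual Signal or WhatsApp protocol, and the key, message, and names are invented for illustration:

```python
import secrets

def encrypt(message: bytes, key: bytes) -> bytes:
    # XOR each message byte with a key byte (a classroom one-time pad,
    # standing in for the real, far more elaborate protocol)
    return bytes(m ^ k for m, k in zip(message, key))

decrypt = encrypt  # XOR is its own inverse

# The two endpoints share a key; the relay server never sees it
key = secrets.token_bytes(32)
plaintext = b"meet at noon"
ciphertext = encrypt(plaintext, key)

# The server in the middle handles only this ciphertext...
assert ciphertext != plaintext
# ...and only the intended recipient recovers the message
assert decrypt(ciphertext, key) == plaintext
```

The point of the sketch is the shape of the system, not the cipher: encryption and decryption happen at the endpoints, so the machine in the middle has nothing useful to snoop on.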
The problem is that there is much that can go wrong on your phone before anything gets encrypted.
For example, many of these chat apps will back up your conversations to the cloud, and these backups are typically not encrypted at all. It is not uncommon that an app backs up to the cloud by default and thus does so without the knowledge of many users.
While not well known among users, it is not a secret that these backups exist and that they’re not encrypted. The history is that they’re not encrypted in part because law enforcement complained that encryption would make these backups difficult to use in investigations.
The next layer is that a bad actor could conceivably put malicious software on your phone that affects how these backups are handled (or otherwise changes the function of the app). This is approximately how the notorious gangster El Chapo was brought to justice, and recently it was revealed that Jeff Bezos was hacked when he opened a malicious link sent to him by, of all people, the Crown Prince of Saudi Arabia.
If you're privacy conscious, please be aware of how your phone is storing data from your chat apps, if you are backing up this data to the cloud, and be very careful about opening any links that anyone might send you (even links from foreign royalty).
Socioeconomic Class & Automation
Everyone and Yang are talking about a new wave of automation and the danger we might all be thrown jobless out on the street. The new wave of automation is real and there will be consequences, but a systematic understanding of the past, present, and future history of automation is available to us, and we are presently willfully misunderstanding the nature of those consequences. The missing puzzle piece is the banal yet always amazingly taboo topic Americans always use to misunderstand the world: social class.
Perhaps there is a wave in the sense that things are moving faster and more perceptibly than usual, yet replacement of jobs with automation is a process that has been going on continuously for 150 years. Vast changes in economies all over the world have occurred as artisans were put out of work to become factory workers, then factory workers were put out of work to move to the service industry. There are still people you might call artisans, and there are certainly still factory workers, yet today Americans overwhelmingly work service jobs in a way that would be shocking to a time traveler. Many of us are already busying ourselves in jobs a previous generation might find to be surreal make-work.
This is not to say, however, that this journey was a good time for everyone. Rivers of ink have been spilled examining the consequences (economic, social, cultural, religious, everywhere!) of the disintegration of the artisan class, especially in Europe. Many felt that the new lifestyle technology had handed them was less dignified, and more tangibly that it induced a degrading level of social hierarchy (and inequality of wealth and income) that they would be better off without. People were angry, and like today there was a vitriolic populist politics to express their anger. We will always struggle to really empathize with that shift because we have only known the world they regarded as a plummet into disaster.
What is different today? The answer again revolves around social class. In the past, those who were most affected had a disproportionately small voice in public discourse. Those with jobs requiring education - jobs intimately requiring advanced literacy and numeracy - were relatively safe. These are also the jobs that grant privilege in public conversation. Journalism, which you might say has been disruptively (if partially) automated by information technology, is an excellent prototype.
People will keep inventing new work for other people and people will keep finding ways to re-purpose their skills. What will change is the nature of our socioeconomic hierarchy… past history suggests it will become steeper and more stratified, and it is a whole other line of common public conversation that we are quite deep into this trend already. There is nothing new here, but an ancient thing that we perceive selectively.
Blockchain vs. GDPR
Bad news: there are fundamental conflicts between blockchain, at least in its original form, and data privacy regulations like GDPR. Good news: these conflicts provide a good opportunity to learn about what is actually contained in both.
The core property of blockchain that is novel and has been important for non-scam applications is immutability. If you have a network that is working properly, a bit of information you put on the blockchain will stay there unchanged. You will not be able to take it back, and no one else will be able to tamper with it either. This was important for the prototypical blockchain application, cryptocurrency, as it was necessary to hold people’s feet to the fire regarding their transactions - if I can change or retract what I put on the blockchain, I can undo my spending after I have run off with the goods and then double-spend my coins.
A few newer brands of blockchain (EOS comes to mind) have methods for retracting transactions, but these are controversial. For human and technical reasons, take-backsies on blockchain transactions may always range from slow and difficult to impossible.
On the GDPR side, a key privacy provision is the right to be forgotten - you can write to Google or Facebook and tell them to delete all the information they have about you. This clause has proven influential and has been incorporated into newer laws like California’s CCPA, and it will likely be imitated again in the future. From a concrete information technology standpoint, forgetting someone means going into your database to delete all the records related to them and their relationship with your business. A blockchain is really just an immutable database, so if you are using a blockchain in your technology stack, the right to be forgotten vs. immutability is a real headache. Immutability is explicitly a “real pain-in-the-ass to delete things” property, and this is a problem if you are legally required to delete things on request.
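To see the conflict concretely, here is a minimal hash-chained ledger sketch - an illustration of the immutability property itself, not any production blockchain, with invented records. Each block commits to the previous block's hash, so "forgetting" someone by altering their records breaks verification of the whole chain:

```python
import hashlib
import json

def block_hash(record: dict, prev_hash: str) -> str:
    # Hash the record together with the previous block's hash,
    # chaining every block to all of its predecessors
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode()).hexdigest()

def build_chain(records):
    chain, prev = [], "0" * 64
    for rec in records:
        h = block_hash(rec, prev)
        chain.append({"record": rec, "prev": prev, "hash": h})
        prev = h
    return chain

def verify(chain):
    prev = "0" * 64
    for blk in chain:
        if blk["prev"] != prev or block_hash(blk["record"], prev) != blk["hash"]:
            return False
        prev = blk["hash"]
    return True

chain = build_chain([{"user": "alice", "tx": 5},
                     {"user": "bob", "tx": 7},
                     {"user": "alice", "tx": 2}])
assert verify(chain)

# "Forgetting" alice means altering her records - and now the chain fails
chain[0]["record"] = {"user": "REDACTED", "tx": 5}
assert not verify(chain)
```

This is exactly the feature that makes double-spending hard, doing double duty as the feature that makes a deletion request hard to honor.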
There are some things that can be done to mitigate these problems but at the end of the day it is actually the essential, novel property of blockchains that is the problem.
The slow, alienated death of blockchain
As a preface to this, there are good people doing great things with blockchain in good faith and this is not an essay targeting these people but a polemic in their defense. There are too many people, though, lost in fog about what blockchain is and not enough people telling the truth about what it is not.
It would be crazy to say you won’t hear about blockchain again, and perhaps the worst blockchain fatigue is ahead of us. Rather, we have arrived at an inflection point where the emerging viral big idea is that blockchain has lost its way, been applied to problems for which it has no utility, carelessly dropped as a buzzword with no meaning, incorporated into all manner of scams, not only over-hyped but radically mis-hyped, and warped by these trends into something that will now struggle to fulfill its original real potential.
The amazing power of blockchain has been its power purely as a word - a word about which one can say anything, arbitrarily disingenuous or poorly informed, and profit from the statement presuming the claim is sufficiently grandiose, intimidating, and FOMO-inducing. There is apparently vast power for “disruption” in using blockchain to record supply chain information where a centralized server would be demonstrably superior. Apparently, blockchain has a unique relationship with quantum computing despite being composed of very conventional classical cryptography. Apparently, blockchain can save your business by “decentralizing” when your business was really an effort at centralization by its very nature and no one can even explain in plain English what decentralization means and why it is good business. Apparently, blockchain can hugely improve your information security by reproducing your sensitive data across many servers each with the same vulnerabilities as any other centralized server. By etymology, to say something is apparent suggests it has appeared, but we are still waiting and we will continue to wait forever because these claims range from optimistic distortions to outright lies.
There are a great many lies that have become so pervasive that they have been widely repeated by honest people. This is a real tragedy, and this essay is not a polemic targeting these people but a polemic in their defense.
Blockchain came into the world in step with “decentralization” and if we actually maintain discipline about what these words mean, this is absolutely sensible and correct. The problem is much of the wealth and power in our society is centralized and the power of blockchain as a word is too useful a tool in chasing it. Corporations are centralized organizations - decentralization is what the Department of Justice does to your firm if it decides you don’t have enough competition to treat consumers decently. Governments are centralized organizations - decentralization is what happens when people decide they’ve had enough and find ways to be governed less and more locally. Venture capital firms are centralized organizations - if you take the money of a group of wealthy investors and centralize it in one place to invest all at once, you’re on the way to founding a venture capital firm. One can’t say that blockchain will never have anything to offer these groups, but many promises made could never have been anything but empty because decentralization is contrary to the nature of these organizations. And neither centralization nor decentralization is intrinsically good or bad.
Too many organizations have been presented with blockchain the magical spell, the voodoo word for inciting fear of falling behind the times and missing out. Blockchain has been presented to many in bad faith not as information technology but as psychological manipulation. It is well attested that companies that have merely added blockchain to their name have seen their stock price soar, in some cases criminally (in the literal sense) absent any effort to implement any version of the real technology at all. Any application you might pitch to an investor that involves a database might as well involve a blockchain, and such is the power of the word that many can’t resist. But why did your application need to be on the blockchain? It didn’t, and it may have been a poor architecture decision that it was.
Experimentation in blockchain architecture continues, and while much of it is interesting and valuable, there is a sector that strongly resembles efforts to find a centralized server that resembles a blockchain enough to avoid lawsuits. Why be bothered to work with the challenging, real technology when one can work with the awesome persuasive power of the word alone?
Blockchain is not dying in the sense that it will disappear tomorrow. It is dying in the sense that it is mutating carelessly towards no constructive end, wasting time and money and human intellect and human emotion as it does. The real tragedy is that it is dying not because it has no potential but because no one can resist the potential it does not have.
Approaching "Big P.I.I."
Is your pet’s name P.I.I. (personally identifiable information)? In an era where data is bought, sold, and reaggregated, the answer has to be “Yes!” If you find yourself asking whether something is P.I.I. then you should probably treat it as such.
A few years ago, it would have been silly to take a stand saying that your pets’ names are your P.I.I. Perhaps what we all didn’t quite anticipate was how the data we share over here and the data we share over there would likely be reunited in a single file by one of the many firms that aggregate and resell data. There are many bits of information about you that seem innocuous, certainly not so obviously compromising as an SSN or credit card number, but the reality is that when all the available information is collected about you, every little bit matters.
The world of 2019 has a lot more people than names, and when you think about it, we end up using odd bits of information in maintaining our identity. Your bank may ask you where you met your partner, the names of your pets, etc., and I am sure that each reader can come up with some more examples of their own. This is in part an effort to avoid further circulation of more intrinsically dangerous information like government ID numbers, but the result is that all these little facts then become and stay sensitive. If you could tell Facebook your cat’s name and Robinhood where you bank without them comparing notes behind your back, the situation would not be so bad, but this sort of behind-the-scenes aggregation is a pervasive part of the data economy and it is unlikely to go anywhere soon.
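A minimal sketch of the "comparing notes" step - all services, names, and fields here are invented for illustration - shows how two innocuous fragments, joined on a quasi-identifier, become one sensitive profile:

```python
# Two services each hold an innocuous-looking fragment of data...
social = [{"email": "jo@example.com", "cat_name": "Whiskers"}]
finance = [{"email": "jo@example.com", "bank": "First National"}]

def reaggregate(left, right, key):
    # Join two datasets on a shared quasi-identifier,
    # the basic move of any data broker "comparing notes"
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]

profile = reaggregate(social, finance, "email")[0]
# The combined record now answers a typical bank security question
assert profile["cat_name"] == "Whiskers"
assert profile["bank"] == "First National"
```

Neither fragment is dangerous alone; the join is what makes the cat's name worth guarding.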
To change resolution a bit, there will always be a lot of other people out there with personal information that looks a bit like yours - same name, similar address, and so on - and there will always be someone digging a bit deeper for that last piece of data that identifies you the human uniquely. There are all manner of organizations working on this all the time, with motives both friendly and hostile to you, and they are eager to get that marginal value out of the names of your pets. And it should be noted that there have already been notable breaches of large aggregated datasets of information on millions of consumers with details including information on pets and more.
Until there is a major change in the regulatory climate, or some other seismic shift, definitions of P.I.I. must become more and more expansive. If you can use it to tell the difference between you and someone else that shares your name, someone will use it for this purpose, and anyone you share it with might decide to hoard it until it becomes interesting later - interesting to them, or interesting to someone else who wants to buy it.
CapitalOne and the cloud's shades of gray
Much of the propaganda for taking your data into the cloud has a “the cloud is the same, but cheaper” flavor - while this is not a terribly inaccurate four words, the CapitalOne breach does expose some wrinkles that the cloud creates for security governance. What follows is an attempt at spotlighting what these wrinkles are and what business leaders should know about them in a heuristic, minimally technical way.
Talking about “inside vs. outside” is a good lens for thinking about the extra administration burdens of the cloud. If you keep your data on-prem, operating your own data center and using it only for the data of your business, you might be blessed with a very cut-and-dried inside and outside. Your data is in a particular building, not mixed together with anyone else’s, probably protected by a firewall with a footprint more or less identical to the footprint of that building. There are fewer shades of gray on-prem, and it is likely you created them yourself.
If you keep your data in the cloud, there are some more potential shades of gray about inside and outside, and errors in managing those shades of gray are how CapitalOne was left vulnerable. Your cloud environment is in a building with other people’s cloud environments, and maybe your cloud environment is actually a virtual server on a computer that is also simulating the cloud environments of other businesses. There might be, in principle, multiple firewalls in play (perhaps for the data center, a given physical server, and a virtual server it simulates) and there might be multiple different sets of rules governing how different servers, virtual and real, interact with these various firewalls. You can talk about inside vs. outside your virtual machine, your physical machine, a data center, a platform… This might sound a little trite and silly absent detail, and detail is unfortunately something the CapitalOne story can provide.
The CapitalOne breach exploited a misconfigured firewall, and in particular a misconfiguration that wrongly allowed CapitalOne’s servers to talk to a back-end resource inside Amazon Web Services. This is the inside vs. outside heuristic starting to break down - the AWS resource was outside CapitalOne but still inside of Amazon. CapitalOne was secure relative to the all-the-way outside world, but assigned too much permission to an intermediate layer of systems designed to be friendly to users inside AWS, and there are in fact all kinds of actors using AWS with all kinds of motivations. One of them decided to steal data.
To return to the metaphor a moment, CapitalOne’s configuration treated AWS like a safe “inside” space, and the reality is that the cloud exposes you to systems that are neither so safe as the pure “inside” nor so dangerous as the pure “outside” of an on-prem data center. Secure use of the cloud requires recognition of shades of gray and the security risks that each presents.
The DAG of Counter-Party Risk
Many of the risks attached to sensitive consumer data are directional in an important sense. If I start with data and give it to you then I retain some liability for your bad behavior but there is no similar way for me to get you in trouble. Risk generated by sharing data is thus asymmetric. It is both true that public relations liability happens to work out this way and that this has been the design of data privacy regulations new and old. This is perhaps only a manifestation of our everyday common sense, as we are often upset with those who leaked our secrets and not with those who happened to learn them.
An excellent case study recently in the newspaper was the data breach at laboratory company Quest Diagnostics. One might put “at Quest Diagnostics” in quotation marks because this was how the headlines were typically written, yet the breach actually occurred at the debt collection agency AMCA. Quest had hired this firm, handed over data to them to enable their work, and it was actually AMCA that lost the data.
It is really little consolation to Quest, though, that it was a partner who stumbled. Quest is a much larger company, directly consumer-facing and visible, that patients are aware is holding personal data about them. Not much of anyone has heard of AMCA, and it is inevitable that the lion’s share of the public relations damage is heaped upon Quest. Furthermore, Quest is explicitly liable under HIPAA regulations for AMCA’s failures.
To be a firm that generates regulated data, like Quest, is tough sledding in 2019. Such firms have many legitimate reasons to share data, and each generates risk that the partner does not equally share and is not likely to “feel” as acutely as one would hope. You might visualize the difficulty of this situation using a common concept in computer science and mathematics: the directed acyclic graph - a bunch of dots and arrows that point in some direction but do not form loops. Data propagates along the arrows and risk propagates back, and to be a company with all arrows out (lab testing companies, hospital networks, and any business that generates medical data are roughly in this boat) is to carry all the risk without much ability to manage it… at least so far.
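A toy version of this picture (the company names echo the story above, the extra partners are invented, and counting downstream recipients is only a crude proxy for risk) makes the asymmetry visible: data flows along the arrows, and every downstream node adds to the originator's exposure.

```python
from collections import defaultdict

# Edges point in the direction data flows: sharer -> recipient
edges = [("Quest", "AMCA"), ("Quest", "Insurer"), ("Insurer", "Analytics")]

def exposure(edges):
    # Each node's exposure = number of parties, direct or indirect,
    # that it has shared data with (risk propagates back up the arrows)
    children = defaultdict(set)
    for src, dst in edges:
        children[src].add(dst)

    def reach(node, seen=None):
        seen = seen if seen is not None else set()
        for nxt in children[node] - seen:
            seen.add(nxt)
            reach(nxt, seen)
        return seen

    nodes = {n for edge in edges for n in edge}
    return {n: len(reach(n)) for n in nodes}

risk = exposure(edges)
# The data originator carries exposure to every downstream partner...
assert risk["Quest"] == 3
# ...while a leaf recipient has shared with no one
assert risk["AMCA"] == 0
```

All arrows out, all exposure in: the node that generates the data bears risk for every party downstream, which is exactly Quest's position.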
Technologies like homomorphic encryption and zero-knowledge proof have great potential to change this situation, as they remove the need for partners to truly hold the data. Quest might have shared an encrypted dataset with AMCA and then used a cryptographic process to regulate the information they extracted from it - presumably only the information AMCA needed. It is debatable whether a breach of encrypted data is really a breach, especially when it is unlikely anyone will be able to do anything useful with that encrypted data. There is much to be gained, as there is not only the risk of actions taken to consider but the analytics value lost to data siloing undertaken avoiding that risk.
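As a hedged illustration of the general idea - this is textbook RSA's multiplicative homomorphism, a classroom toy, not Ghost PII's method and not secure as written (real deployments use padding and far larger keys) - computation really can happen on ciphertexts alone:

```python
# Toy "textbook" RSA with tiny primes, for illustration only
p, q, e = 61, 53, 17
n = p * q                           # public modulus
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (Python 3.8+)

def enc(m: int) -> int:
    return pow(m, e, n)

def dec(c: int) -> int:
    return pow(c, d, n)

a, b = 7, 6
# Multiply the two ciphertexts without ever decrypting them...
product_cipher = (enc(a) * enc(b)) % n
# ...and the decryption is the product of the plaintexts
assert dec(product_cipher) == a * b
```

The party holding only ciphertexts computed something meaningful about the plaintexts without learning them, which is the shape of the arrangement Quest and AMCA could have had.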
Data privacy regulations past and future: GDPR, CCPA, and beyond
What will data privacy regulations look like a few years down the line? There is much insight in examining the history of data privacy regulations so far and the role that geopolitical boundaries have played in that history. The big questions revolve around how data is shared over the internet; this sharing easily crosses national boundaries, so any data privacy regulation inevitably affects companies outside the normal jurisdiction of the country passing the law. Economies of scale in information technology in turn make it inexpensive to obey a law uniformly, even in interactions with customers not covered by the law. This has created strong incentives for new regulators to model their laws on the stricter of those regulations already out there - an emerging and strong continuity trend. On the other hand, there are many regulators who are just getting started - legislators in several US states, for example - and the potential for regulatory complexity and headache to pile up is there even if each law closely (but not totally) resembles those which have come before.
GDPR and CCPA are the prototypical examples. The United States has trailed behind Europe in interest in data privacy, and GDPR was the first exposure of many American companies to vigorous data privacy regulation. A company with a web store, for example, previously might have had customers from all over the world with little need to discriminate, but post-GDPR the data of the European nationals is regulated differently and requires new care. Invariably, the IT systems required for this care can be applied to all customers without much additional marginal cost, and this is one reason that CCPA was constructed in the model of GDPR. Businesses often see incentives to apply the sternest standard uniformly, so a natural legislative model that serves both businesses and consumers is to closely mimic the most demanding laws out there so far.
CCPA is also an example in the other direction, in that it introduces new wrinkles that will be a headache for some businesses. It specifically protects data on households, for example, rather than GDPR’s explicit emphasis on individual data. It directly targets only firms doing business in California, yet given the influence of California firms on information technology it is still likely to have cross-border impact in ways that are yet to be seen.
A look backward at the road to GDPR in Europe can provide some insight into how data privacy might play out in the U.S. as more states contemplate data privacy regulations. Germany, for example, has a much longer history of data privacy regulation going back to 1990, and the development of GDPR embodies the same continuity vs. conflict narrative. Each new piece of legislation keeps much of what came before, yet also introduces new conflicts while reconciling old ones. This dynamic continues to this day as firms and national governments continue to work out the contradictions in national vs. international laws and regulatory powers.
The United States has no national data privacy law, in effect or in the works, but many states are considering such laws. A national law, presuming one comes, will be in response to and in continuation of actions taken by the states. Hawaii is considering a law with no legal definition of a “business” from a data standpoint, another in Massachusetts is heavily focused on biometric information, while several others aspire to tweak the relatively well-known and popular “right to be forgotten” provision. All borrow heavily from CCPA.
The U.S. national data privacy law will be defined again by continuity vs. conflict, perhaps coming early to correct a burdensome mess of disagreement among state laws or coming late to formalize a de facto international law created by continuity between German federal data privacy, GDPR, CCPA, and then a panoply of states.
What is a great collaborator?
What makes a great collaborator? Is it different than being great on your own? Does one come at the expense of the other? I have a couple examples from popular music I think about frequently...
Eric Clapton is about as celebrated as anyone in modern popular music, but many of his most famous songs are covers or collaborations with other luminaries. He recorded "Layla" with the immortal Duane Allman, several of his hits were covers of songs by the excellent but more obscure J.J. Cale, and even his early group Cream was so named for being a super-group.
Run the Jewels provides a different sort of case study because both its members have maintained vigorous parallel solo careers. This is only my opinion, but I think the two members are much better together than alone: Killer Mike is prone to bogging down in southern rap idiom while El-P gets lost in grumpiness and misanthropy. These vices seem to vanish when they are together.
What do you think? Is there such thing as a great collaborator?
Moving Parts Unknown
It is a major challenge for businesses of all sizes, and one that will only loom larger and larger: information technology is increasingly complex, essential, and opaque. One can read almost every day about a firm that got both more and less than it bargained for from an IT contractor. These must be only the tip of the iceberg, as you are able to read about them in the media only where they boil over into lawsuits (like Hertz v. Accenture recently) or when they are intrinsically public (as in the case of Healthcare.gov).
A recent example involving Siemens, an independent contractor, and some subsequent criminal trouble is a great case study in these challenges.
The short story is that a contractor (allegedly) hid a bit of sabotage in his own code in hopes of generating more demand for follow-on work. Siemens noticed they had a problem, had no easy time figuring out what it was, and were greatly displeased when they did. This is all public only because of the ensuing criminal complaint against the contractor.
The idea of hiring a contractor for certain purposes, at least in spirit, is that you need some standard functionality and you don’t want to distract everyone in your organization with the details of how it is getting done under the hood. This presents some danger and requires some trust, however, as it leaves room for malicious action that will be quite difficult to detect - the metaphor “under the hood” only takes us so far and many of us are better equipped to recognize an extra widget bolted to our car engine than we are to sniff out malicious surplus code.
This is an introduction to a subtle, structural challenge in cybersecurity: there are administrative and economic pressures driving decentralization in how code is generated, yet the end product can be very opaque and difficult to audit. And every indication is that these are trends which will continue for a while…
The "why" of our acronyms: PII vs. PHI
by Alexander C. Mueller
You might have a medical diagnosis you find embarrassing or just plain don’t want to talk about. If someone had your medical records, they would certainly find out… but if they had just some, any medical records, without reference to you or any other particular person, then your privacy is secure.
This silly fact highlights the difference between PII and PHI, and why PII is important. PHI, or protected health information, might be described as medical records with enough data to tie them to particular people in the real world. That connecting data is PII, or personally identifiable information, pretty much by definition. With your PII attached, a chart becomes your medical record and fits the definition of PHI; without it, it is just some medical record. This extra layer of privacy is what motivates interest in de-identifying datasets.
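The relationship between PHI and PII can be sketched in a few lines of code. This is a minimal toy illustration, not a compliant de-identification procedure; the field names are hypothetical and real standards (such as HIPAA's Safe Harbor rule) enumerate many more identifiers.

```python
# A minimal sketch of de-identification: strip the PII fields from a
# record so the remaining health data no longer points at one person.
# Field names here are made up for illustration.

PII_FIELDS = {"name", "phone", "address", "ssn"}

def deidentify(record: dict) -> dict:
    """Return a copy of the record with PII fields removed."""
    return {k: v for k, v in record.items() if k not in PII_FIELDS}

chart = {
    "name": "Jane Doe",
    "phone": "555-0123",
    "diagnosis": "hypertension",
    "blood_pressure": "150/95",
}

print(deidentify(chart))
# {'diagnosis': 'hypertension', 'blood_pressure': '150/95'}
```

The chart that comes out is still a medical record, but it is no longer *your* medical record, which is exactly the distinction drawn above.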
PHI and PII overlap in usage a ton because PII is a key part of what makes PHI a privacy concern.
ABCs of PII
by Alexander C. Mueller
What is personally identifiable information, abbreviated P.I.I. or PII, and why is it important?
It’s easiest to break down backwards. First, it is Information, and typically the information so discussed is held by a large corporation or a government agency. Second, it Identifies some individual Person apart from the others. The term PII can sometimes refer by law to specific types of data, but it is also used broadly for the whole category of data about everyday people that large organizations commonly end up storing.
Your name is the ultimate everyday example of PII. If you are standing next to someone else, a person who wanted your attention would say your name and not theirs - they’ve just used a small piece of information (your name) to identify you as one person apart from another.
Phone numbers are a bit more interesting. They do have a practical purpose, but they are also a good way to keep two people with the same name from getting confused in your database. Often, a business that collects this information on you is doing it for this sort of reason and not to actually try and call you. Phone number is thus another example of PII, information used to identify one person apart from another.
Thinking about data in this way is valuable because there are many white collar crimes and other misdeeds for which this sort of information is absolutely necessary to get started. Identity theft is the obvious and familiar example. However, there are many more scams you can only begin after you have enough information to target specific individuals and not groups of people. Imagine you are a foreign spy agency looking to recruit informants. Which is more helpful to you: 1) knowing that there are indebted people living in a particular city 2) a list of names, addresses, and phone numbers of indebted people in a particular city?
A Tale of Two Breaches
by Alexander Mueller
Much of our public conversation around cybersecurity and data loss in particular imagines one organization, usually a business, trying to defend its castle full of goodies from the barbarian hackers outside. The reality is that data gets passed around quite a bit, and in 2019 it is lost more often because of mistakes and bad practices around how it was circulated. The public has limited visibility into this circulation, and differences in regulation create drastic differences in who hears about what breach, what firms can be held liable for, and then inevitably in their information security practices and level of care.
On one end of the spectrum, industries without any regulation of their data are almost certainly breached more often than is public and more often than they know themselves. The damage from a breach typically falls on consumers and not directly on the company breached, so there is a perverse incentive not to discover breaches if you believe no one else will discover them either. This dynamic is particularly egregious around data collaboration with business partners: in principle, if I give my data to you and you lose it doing something stupid, then I am liable as well, but in practice no one wants to maintain records of who has what just to become a liability in court.
This may sound a bit jaded and conspiratorial, but the reality is that for many breaches no one can even say where the data came from originally. These breaches are also lightly publicized because there isn’t much constructive to say about them. One illicitly traded database of information on 200 million Americans has no clear provenance: many believe Experian lost this data originally, but this is disputed, and to the knowledge of the author Experian has not been proven liable or held accountable in any way. Databases with huge amounts of personal information are often found derelict in the cloud (often with no password!) by security researchers, and invariably it is impossible to find owners for them. One database of medical information found unprotected is just one of many examples.
At the other end of the spectrum, firms holding regulated data are in a really painful position because of the data they must share for unavoidable business needs and the difficulty of ensuring that 100% of their data partners are responsible. Good regulations often require firms to maintain records on to whom data is given (HIPAA requires this for example). It is becoming increasingly burdensome for many firms to find enough responsible partners - the nature of your business requires you to share data with partners and if someone else loses it, you are still liable. Cybersecurity in one organization is hard enough!
A great example from just the past few weeks was the data breach at Quest Diagnostics, or perhaps we should say the breach at AMCA. The first breach of the affair to be announced was at the laboratory testing company Quest, but many articles left unmentioned, or buried deep, that the breach had actually occurred at AMCA, a collections agency Quest employed. Days later, with considerably less publicity, a larger story emerged about the many firms caught up in the breach that centered on AMCA. Yet it will still be true going forward that Quest will get a big share of attention related to the incident, as they are the largest firm involved, the most visible, and the one who originally collected the data from consumers.
At one end of the market, more regulation is sorely needed. At the other end, we must confront the unique and subtle challenges of securing data not just in a firm but across an ecosystem of many firms that must share data as an essential part of their operations. At Capnion we believe that emerging technologies like homomorphic encryption and zero-knowledge proofs are a hand-in-glove solution to helping this latter group of firms collaborate - don’t share more than you need, don’t share anything in the clear, set up a system with just enough information in it for your business process and nothing else.
Save the Deal!
by John Senay
The modern business development manager’s greatest frustration? The inability to share data with a customer.
You spent months looking for a new prospect that could benefit from your company’s product.
Your company has used internal resources at great cost to design and provide the ultimate solution for the prospect to turn them into a high margin customer.
Both your company and your soon-to-be high margin customer see the value of the relationship and need to move forward.
Let’s get the deal done!!
To get the deal done, a great amount of data needs to be shared, exchanged, and tracked between your company and the new customer. To complicate the workflow, the high-margin customer has stated that for the deal to work, certain data has to be shared with 3 different partners in the supply chain, with accompanying security and compliance issues.
The above scenario is all too familiar to the business development manager. In this day and age, business to business sales are complicated by the requirements of sharing of data. What information do your partners have to have access to? And who is going to control what, where, and how the 3 different partners use the data?
This situation is becoming the norm for contract acceptance and completion.
To get the contract signed, someone has to find a way to provide the data needed under the contract terms.
Is there a way to provide the required data in a safe, secure manner for all parties involved in the contract, one that all the companies’ IT groups can agree upon?
Yes there is!!
Capnion has a suite of cutting-edge encrypted data-in-use tools that allow specific, agreed-upon data to be exchanged with all parties involved. Using our specially generated Answer Keys, the appropriate parties can verify or analyze specific data without any need to decrypt it. At no time does the data ever need to be in plaintext!
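To make the general idea of "verifying data without decrypting it" concrete, here is a toy sketch. This is emphatically not how Ghost PII or its Answer Keys work internally, which is not public; it only illustrates one simple, well-known way two parties can check for matching values without exchanging plaintext, using a keyed hash (HMAC) under a shared key. The key and email addresses are invented for the example.

```python
# Toy illustration of matching values without exchanging plaintext:
# both parties blind their values with a keyed hash and compare digests.
import hmac
import hashlib

SHARED_KEY = b"agreed-upon-key"  # hypothetical; real systems negotiate keys securely

def blind(value: str) -> str:
    """Replace a plaintext value with a keyed digest."""
    return hmac.new(SHARED_KEY, value.encode(), hashlib.sha256).hexdigest()

# Each party blinds its own records and shares only the digests.
ours = {blind(v) for v in ["alice@example.com", "bob@example.com"]}
theirs = {blind(v) for v in ["bob@example.com", "carol@example.com"]}

# The overlap can be counted without either side revealing plaintext.
print(len(ours & theirs))  # 1
```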
Please contact sales@capnion.com for more information on how to meet contract clauses for data-sharing obligations.
Get that deal signed today!!
Thanks for reading.
The importance of being Random
by John Senay
What is all the fuss about Random Numbers and how they are generated? What do Random Numbers provide anyway? I know that’s how they pick the winning Lotto numbers.
Cryptography and Encryption use Random Numbers as their most basic building block. Without Random Numbers, Encryption would not be possible. Ghost PII would not be possible, and that would be very bad!
How can you get Random Numbers for the cryptography used in Ghost PII?
Well… you could use an algorithm to create “random” numbers, but research has shown that in certain instances these algorithms can be attacked and cracked. If researchers have done it once, you know they will do it again, so Capnion does not rely on algorithms alone to generate Random Numbers!
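You can see the weakness of purely algorithmic randomness in a few lines of Python. A minimal sketch: Python's default `random` module is a deterministic algorithm (the Mersenne Twister), so anyone who learns the seed can replay the entire stream, while the `secrets` module draws from the operating system's entropy pool instead.

```python
# Seeded algorithmic "randomness" is perfectly reproducible.
import random
import secrets

rng1 = random.Random(42)
rng2 = random.Random(42)

# Same seed, identical stream: fine for simulations, fatal for cryptography.
print([rng1.random() for _ in range(3)] == [rng2.random() for _ in range(3)])  # True

# secrets pulls unpredictable bytes from the OS entropy pool instead.
token = secrets.token_hex(16)  # 16 random bytes as 32 hex characters
print(len(token))  # 32
```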
What about hardware Random Number generators that utilize the “white noise” a PC produces while running? That is a possibility, but there are humans involved. We have all heard about the attempted and successful backdoors put into hardware by various unscrupulous parties. Not good enough for Ghost PII!!
Then how can you create true Random Numbers? One way is to use the white noise created by the earth’s atmosphere. That’s a great idea! Turn on the radio and feed the static (white noise) into the sound card and create Random Numbers. Maybe, but the earth is so finite. Not good enough for Ghost PII.
All of my life, I have wondered what is out “THERE.” You go out on a clear night, look up in any direction, and you are looking at infinity. Look through a telescope and what you see is a piece of the infinite. Wow! To this day I still cannot fully comprehend the infinity of the Universe. It’s big, but it’s home!
How can we use this infinity to generate Random Numbers?
Capnion goes out to a Top Secret location and takes high resolution pictures of the night sky. Capnion calls this process Starlight.
The night sky is always changing due to changes in the atmosphere and even in the light that is arriving from the stars in the sky. The furthest star you can see with the naked eye is V762 Cas in Cassiopeia at 16,308 light-years away! When you look at V762, some of that twinkle you see is 16,308 years old!!
The high resolution pictures are digitized and the Random Numbers that Ghost PII uses for Encryption are generated from the tiny changes in this data.
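A simplified sketch of that last step: condensing noisy digitized data into random bytes by hashing it, so that tiny, unpredictable fluctuations in the input spread across every output bit. The actual Starlight pipeline is not public; the "frames" below are random stand-ins for digitized photos of the sky, not real image data.

```python
# Whitening noisy digitized data into usable random bytes by hashing it.
import hashlib
import os

def entropy_from_image(pixel_bytes: bytes) -> bytes:
    """Condense noisy pixel data into 32 random-looking bytes."""
    return hashlib.sha256(pixel_bytes).digest()

# Stand-ins for two exposures of the night sky taken moments apart.
frame_a = os.urandom(1024)
frame_b = os.urandom(1024)

# Tiny differences between frames yield completely different outputs.
print(entropy_from_image(frame_a) != entropy_from_image(frame_b))  # True
print(len(entropy_from_image(frame_a)))  # 32
```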
Using Starlight to generate Random Numbers for Ghost PII is out of this world; it’s truly COSMIC!!
Look for some photos on the website for a Starlight event.
Thanks for reading!!