By Rebecca Herold
Following the release of the CMU SSN report, I've had some very interesting discussions with privacy and information security folks, and I've been pretty amazed at some of the reactions to the study.
Yes, many of us in the information security and privacy professions have probably known for a very long time that SSNs could be guessed because of the way they are constructed, using simple items such as city and birth date.
However, over the years, when I discussed this topic with executives, the question always arose: "Well, how easy is it really?"
I never could provide an answer supported by actual research.
The CMU report provides this answer: successful SSN discovery percentages are comparatively low in densely populated areas but comparatively high in sparsely populated areas, and they also depend upon the year the SSN was issued, with later years being most vulnerable to discovery.
It also shows that it is relatively easy to run a computer program that takes a partial SSN (such as those often found in customer and employee IDs) and determines the full SSN.
I agree with the many opinions that the CMU report itself, which necessarily meets rigorous academic standards for method and documentation, is not an easy read for the general public, or even for business leaders who must consider and make decisions about authentication, identity validation, and other security controls involving various types of personally identifiable information (PII).
But it truly does point out the ease with which valid, whole SSNs can be determined by someone who knows just geography and birth date, and how much easier it becomes when a portion of the SSN is also known.
Businesses should use these findings to take a critical look at how they actually use SSNs.
As a very simplified bit of background, Social Security numbers (SSNs) have historically been, and currently are, created using an algorithm based largely upon geographical information and birth date.
Until the Internet came into wide use, and before the wholesale posting of large amounts of personal information to various websites, primarily social networking and other "Web 2.0" sites, the data items used to generate SSNs were not easily found.
To determine a person's SSN, someone would need to go, often physically, to different locations to gather the different items, or remember to collect the items as he or she happened upon them over time.
As I explain a bit more in my blog posting, it now takes a matter of seconds to find these items of information online for most folks, and by using computers to apply a comparatively simple formula, valid SSNs can be discovered.