Forecast: Cloudy with Eventual Challenges

01.26.12

Understanding data sources well enough to acquire and integrate their data is rarely a slam dunk. After all, each data source is like an employee in a company: each comes to the table with similar content and behavior, but there are underlying differences, sometimes very subtle, that indicate the person is from somewhere else. My favorite example: the way the word coffee is enunciated across the US gives a little more insight into where a person may be from, says the Jersey girl. Unlike accents, where you understand the word and carry on with meaningful communication, data sourcing nuances define the phrase "the devil is in the details." Deltas vs. full pulls, whether the file needs interception for cleanup, SLAs/OLAs, upstream sources, downstream consumers… oh my. Not to mention the sensitive task of data classification, PII (Personally Identifiable Information) vs. non-PII, which data owners and integrators need to collaborate closely on.

 
In response to the important task of classifying information, companies have developed PII programs that protect the sensitive information they hold about you and me and control who gets access to it. If you have been to the doctor lately, you will surely have gotten HIPAA paperwork (covering what is considered HSPII, Highly Sensitive PII) indicating your information is secure. Source owners are the first step in the governance program, as ultimately data source owners decide who can have access to their content through the downstream applications. Here are the PII classification categories and some examples:
 
  1. Low Risk:  Gender, Age, Zip Code
  2. Medium:  Name, Address, IP address
  3. High:  Phone number, SSN, Financial information, aggregated data (e.g., Email + Name + Address)

 

The above categories relate to the impact on the company of not securing this information. Hence, one piece of information may not be High Risk on its own, but aggregated with another piece of information it can be.
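
To make the aggregation point concrete, here is a minimal Python sketch of how the three categories listed above might be applied to the columns of a table. The field names, the `Risk` enum, and the `classify` function are hypothetical names made up for illustration. The list above does not say how Email ranks on its own, so treating it as Medium here is purely an assumption.

```python
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Per-field risk, taken from the category list above.
# Field names are hypothetical; adapt them to your own schema.
FIELD_RISK = {
    "gender": Risk.LOW,
    "age": Risk.LOW,
    "zip_code": Risk.LOW,
    "name": Risk.MEDIUM,
    "address": Risk.MEDIUM,
    "ip_address": Risk.MEDIUM,
    "phone_number": Risk.HIGH,
    "ssn": Risk.HIGH,
    "financial_information": Risk.HIGH,
    # Assumption: the list does not rank Email by itself; Medium is a guess.
    "email": Risk.MEDIUM,
}

# Combinations that become High Risk once they appear together.
HIGH_RISK_COMBINATIONS = [frozenset({"email", "name", "address"})]


def classify(fields):
    """Return the overall risk level for a set of field names."""
    present = {f.lower() for f in fields}
    unknown = present - set(FIELD_RISK)
    if unknown:
        # Unclassified fields need a decision from the data owner.
        raise ValueError(f"unclassified fields: {sorted(unknown)}")
    # Aggregation rule: a known combination is High even if no member is.
    if any(combo <= present for combo in HIGH_RISK_COMBINATIONS):
        return Risk.HIGH
    # Otherwise the riskiest single field decides (empty input counts as Low).
    return max((FIELD_RISK[f] for f in present), default=Risk.LOW)


if __name__ == "__main__":
    print(classify(["name", "age"]).name)
    print(classify(["email", "name", "address"]).name)
```

Raising an error on unknown fields is a deliberate choice in this sketch: an unclassified column is a question for the data source owner, not something to default silently.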

 
PII is likely to put technical demands on cloud solutions, requiring not only secure transmission but also access security and governance for reviewing and validating who can access what data. There are definitely companies out there that are finding the cloud quite a suitable solution. For those with sensitive data, a hybrid solution that provides front-end Web components in the cloud and keeps data on-prem is working. However, even when the bulk of the data stays safe on the corporate network, whatever does get transmitted to and sits in the cloud will still need to be secured with encryption. Cloud encryption solutions need to consider the data not only while it is in transit but also while it is sitting at rest in the cloud.
 
As always, performance is a key component of an application’s success. The need for sophisticated encryption may have implications for performance. So let’s tally the workflows generated by moving data into the cloud: 1) classify data: review database tables; 2) governance access: an ongoing workstream to ensure data is made available only to the people who require access; 3) acquire/integration profiling: best to identify issues as early as possible; 4) PII vs. non-PII vs. HSPII; 5) encrypt?; 6) engage the Security Department to get their approval; and 7) make sure securing your data does not cause you to take a hit in performance.
 
Now, that is not a final list by any means, but it is a good start. At first the task of data classification does not seem too daunting, but depending on the data content and the solution you need, the tasks can pile up quite quickly. Having good PMs who will get the team over the speed bumps that every project encounters will be key to keeping your schedule on track so you can be a ‘cloud’ star!
 
 
