A data governance and compliance expert explains how AI capabilities and new regulations push the collection of personal information to a higher level of risk.
By Sabine Vollmer, August 2019
The expanding use of artificial intelligence (AI) and machine learning challenges the finance function to take on responsibility for personally identifiable information that technology can create, says Cindy Maike, CPA, vice-president, industry solutions at US-headquartered software company Cloudera and a data governance and compliance expert.
One driving factor is data privacy protection regulation in effect in the EU and Japan and set to take effect next year in Brazil, South Africa, and the US (the California Consumer Privacy Act takes effect in January 2020). Ethical data governance, particularly in the case of cyberattacks and data breaches, is a second factor.
Maike talked with FM magazine about what the finance function can do to become a better data steward:
What kind of personally identifiable information can we create by connecting disparate data?
Maike: The techniques we’re using with artificial intelligence and machine learning are no different from techniques and methods we have used in the past for link analysis and forensic analysis for various types of pattern recognition. Traditionally, we’ve had statistical models, but it was always with limited amounts of data that fit nicely into columns and rows. Now we’ve got large amounts of data in new forms, shapes, and sizes. The other aspect is we’re able to join data we’ve never been able to join before. I’ll refer to it as alternate data [data from nontraditional sources] and derived data [data that is reused, mixed, or computed from different sources].
We’re using all these data points to look for intersections. That’s done through neural network analysis methods, which allow me to research, plot, and identify new data correlations and patterns. Historically, we could do it, pulling different pieces together, but it took forever. Now, we have the ability to do it much faster and with more data.
We have to be concerned about some of the derived data values. One aspect is the fact that we can do it. The other aspect is what we are going to do with the data once we discover a pattern. We need to think about it from a control, risk and compliance, accounting, and audit perspective.
You have to tag your data. You have to know more and more about the metadata, or the data about the data. From an audit perspective, you need to think about data lineage and data sources. Because AI and data science are allowing us to do things we may not have done before, I’m seeing situations where people may not understand the outcomes of the “derived data”. Take, for instance, geospatial information: What can it tell me once I combine it with other data sources? I’ve seen groups such as the National Association of Insurance Commissioners launch a big data working group. They’ve started to become aware of personal data and of what, from an insurance industry perspective, you can bring together to learn about customers.
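The tagging and lineage idea can be made concrete with a minimal Python sketch. It is an illustration only, not a description of any particular product: the `TaggedDataset` structure, the `derive` helper, and the dataset and tag names are all hypothetical. The point it shows is that a derived dataset should carry forward both the tags and the original sources of everything it was built from, so an auditor can trace it back.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedDataset:
    """A dataset described by metadata only (hypothetical structure)."""
    name: str
    tags: set                                     # what kind of data it holds
    sources: list = field(default_factory=list)   # original datasets it was built from


def derive(name, *parents):
    """Build a derived dataset that inherits its parents' tags and lineage."""
    tags = set().union(*(p.tags for p in parents)) | {"derived"}
    sources = []
    for parent in parents:
        # A parent with no recorded sources is itself an original source.
        for source in parent.sources or [parent.name]:
            if source not in sources:
                sources.append(source)
    return TaggedDataset(name, tags, sources)


# Illustrative, made-up datasets.
geo = TaggedDataset("geospatial_points", {"geolocation"})
customers = TaggedDataset("customer_accounts", {"customer_id"})
profile = derive("customer_locations", geo, customers)
print(profile.tags, profile.sources)
```

Because `derive` flattens lineage back to the original sources, a dataset derived from `profile` would still list `geospatial_points` and `customer_accounts` among its sources.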
Give me an example.
Maike: Think of a person’s driving data or data coming off a vehicle. You have sensors and devices. You can start locating cars. Then, think about aerial imagery. You can take a geospatial location, that is, longitude and latitude, and that will help you go to anything related to Google Maps. From this, you can get more data as to what homes look like, what businesses look like. Link that to publicly available property records in the US and ask, “Who owns that property?” You can get mortgage information. “Is the mortgage paid off?” These are public filings.
And then you think, “Is this OK? Should I have all that information?” As a big data person, I’m thinking, “This is cool to have all this information.” But then I go, “Yeah, but I want to make sure that I don’t abuse this information.”
Absent regulation, you might establish some ethical rules for yourself, but that’s voluntary.
Maike: Right. The other aspect is, if we do end up with data breaches, everybody wonders why it wasn’t caught. Data has to be properly protected. It has to be governed. You have to know who has it and what they have done with it.
What it means from a CFO perspective, and from a compliance and risk perspective, is that you had better know what data you have and what you’ve done with it. Data governance is critical when it comes to the implementation of AI. People typically call data an asset, but when does it become a liability? You’ve got to be able to balance these two.
The liability comes from operating in a global economy, where there are regulations outside of the US.
Maike: Exactly. In fact, I was in a meeting the other day where an organisation was concerned about videoconferencing services, about storing the recordings in a different geography. What if that information is taken? Just for compliance reasons, they realised that they need to store that data in the same country where it originates. Think about that. There were people from different countries who participated. How do you keep track of that? Where’s the governing law?
Are there any processes that the finance function can implement to identify data created by AI or machine learning and address potential privacy issues?
Maike: As a finance and accounting function, we have to be the ones who make sure that we’ve got governance programmes in place. We may not always be the owner of data governance. It could be a chief risk officer, the accounting officer, the finance officer, but someone has to own it.
Especially if you’re a publicly traded organisation. It’s a risk and exposure that needs to be recorded. I think it needs to be disclosed. What are your data privacy practices? It may even be an honour rule we get into. What are your principles?
If you have proper data governance and metadata tagging, you can say, “This data and this data may not actually constitute private data, but when this data and this data are joined with this data, it becomes private.” Somewhere within the organisation, rules and processes are established. It’s up to your data governance and your data stewards to say, “You can’t join this kind of data.” It gets into security processes and controls. I don’t think everything should be dependent on the IT team. IT can be a custodian of data, but I believe the business needs to own data and what’s done with data.
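A rule of the kind Maike describes, where data becomes private only once it is joined, could be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `PRIVATE_WHEN_JOINED` policy, the function names, and every tag name are hypothetical, and a real governance programme would define its own rules and enforce them in its data platform.

```python
# Hypothetical policy set by data stewards: each entry is a combination of
# metadata tags that is treated as private once the data is joined.
PRIVATE_WHEN_JOINED = [
    frozenset({"geolocation", "property_record"}),
    frozenset({"vehicle_id", "geolocation"}),
]


def violated_rules(*tag_sets):
    """Return the restricted tag combinations present after joining these datasets."""
    combined = frozenset().union(*tag_sets)
    return [rule for rule in PRIVATE_WHEN_JOINED if rule <= combined]


def join_is_allowed(*tag_sets):
    """A join is allowed only if the result contains no restricted combination."""
    return not violated_rules(*tag_sets)


# Illustrative, made-up tag sets for three datasets.
telemetry_tags = {"geolocation", "timestamp"}
property_tags = {"property_record", "mortgage_status"}
weather_tags = {"weather", "timestamp"}

print(join_is_allowed(telemetry_tags, weather_tags))   # True
print(join_is_allowed(telemetry_tags, property_tags))  # False
```

Keeping the policy as plain data rather than logic reflects the point that the business, not IT, decides which joins are off-limits: data stewards can review and change the list without touching the check itself.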
Who should own that responsibility?
Maike: It can go a number of different ways, depending upon the size of the enterprise. It’s definitely a risk and compliance function responsibility. As a result, I believe it falls under the CFO or, depending on your reporting structure, your chief risk officer.
— Sabine Vollmer (Sabine.Vollmer@aicpa-cima.com) is an FM magazine senior editor.