How Data Lineage Helps Manage Chain of Custody for Healthcare Records

Jason Hall

Published On: April 6, 2020

Categories: Data Lineage, Documentation

I'm not a doctor or nurse, nor am I presently working in healthcare. However, I do work with a lot of customers in either healthcare or software systems meant to support healthcare. You can read about one of them in this case study. One thing I know is that HIPAA is always at the front, center, and back of their minds.

HIPAA compliance is a top concern for systems in healthcare, and it is also a moving target. Mergers and acquisitions often bring in new and different systems along with hundreds, sometimes thousands, of records and databases. Achieving HIPAA compliance in the first place is an accomplishment. Maintaining it through change is a triumph.

Importance HIPAA Places on Chain of Custody

Chain of custody, or being able to say who touched or saw records, is an important concept for HIPAA compliance. If it were necessary, I would like to know who can see or has seen my medical records. I also want to be able to trust institutions to keep that list tight when it is not necessary, which is most of the time. HIPAA compliance is, of course, about privacy, but from a business point of view, there are hefty fines to consider as well.

Providing chain of custody isn't easy. Think for a minute about how you might design chain of custody for data that is stored in a database, displayed across several reporting and transactional systems, and readily accessed or updated by any number of devices within medical facilities. I'm sure you've got something in mind, but I'm willing to bet it isn't as simple as microwaving popcorn.

The methods used are diverse across organizations, but one solution involves collecting detailed audit data at every access point and storing this in a central location. This sounds straightforward, but consider that healthcare records cross the boundaries of many new and legacy systems during their lifecycle. Also consider cases where information from a record, but not the full record, is projected or aggregated in reporting scenarios. Understanding all the points of access for PHI data can become daunting quickly.
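To make the "audit every access point, store it centrally" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `AccessEvent` and `AuditStore` names, the fields, and the in-memory list are hypothetical, and a real store would need to be durable, access-controlled, and tamper-evident.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class AccessEvent:
    """One audit entry: who did what to which record, through which system."""
    record_id: str
    user: str
    system: str
    action: str  # for example "read" or "update"
    at: datetime


@dataclass
class AuditStore:
    """Hypothetical central store that every access point reports to."""
    events: List[AccessEvent] = field(default_factory=list)

    def log(self, record_id: str, user: str, system: str, action: str,
            at: Optional[datetime] = None) -> AccessEvent:
        # Default to the current UTC time when the caller gives no timestamp.
        event = AccessEvent(record_id, user, system, action,
                            at or datetime.now(timezone.utc))
        self.events.append(event)
        return event

    def chain_of_custody(self, record_id: str) -> List[AccessEvent]:
        """Return every event for one record, oldest first."""
        matching = [e for e in self.events if e.record_id == record_id]
        return sorted(matching, key=lambda e: e.at)


# Two different systems report access to the same (made-up) record.
store = AuditStore()
store.log("patient-123", "analyst_b", "reporting", "read",
          datetime(2020, 4, 2, 14, 30, tzinfo=timezone.utc))
store.log("patient-123", "nurse_a", "ehr", "update",
          datetime(2020, 4, 1, 9, 0, tzinfo=timezone.utc))

for event in store.chain_of_custody("patient-123"):
    print(event.at.isoformat(), event.system, event.user, event.action)
```

The hard part is not the store itself. It is making sure every system that touches the record actually calls something like `log`, which is where the rest of this post is headed.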

Given the total landscape of systems that touch healthcare data, along with the need to maintain a clear and accurate chain of custody, a new problem surfaces: how can we simplify the process of knowing where data is stored, where it appears, and where it is modified?

Data Lineage and Chain of Custody

Data lineage refers to mechanisms, processes, and output pertaining to the origin, locations, and transformation of data across systems and over time. Data lineage relates to chain of custody at a meta level.

Consider a document like your birth certificate. You began life with only one original copy of this important document. Over time, you may have needed it to get a driver's license or a marriage license. You may have gathered copies of it for several occasions. In however many years you've been around, do you know how many folders, file cabinets, databases, printers, and screens have housed or displayed information from your birth certificate? Being able to see the path the data has taken from inception, and how it may have changed across all of these locations and systems, is its lineage.

We've established that chain of custody is a challenge. We've also established how data lineage is similar to chain of custody, except it deals with metadata rather than actual data access. The two are related in that having a clear view of data lineage greatly reduces the effort needed to establish chain of custody. If you know the lineage of the data, it is far easier to narrow down the points at which you must gather information for chain of custody.
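One way to picture this narrowing is to treat lineage as a directed graph and walk it from the source of a record. The sketch below is only an illustration under assumptions: the system names in `LINEAGE` are made up, and the dictionary stands in for whatever lineage metadata a real tool would collect.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the systems listed as its value.
LINEAGE = {
    "ehr_db.patients": ["etl.nightly_load"],
    "etl.nightly_load": ["warehouse.dim_patient"],
    "warehouse.dim_patient": ["report.readmissions", "report.billing"],
    "hr_db.employees": ["report.staffing"],  # unrelated to patient data
}


def downstream(lineage, source):
    """Breadth-first walk returning every system reachable from source."""
    reached = []
    visited = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in visited:  # skip systems already recorded
                visited.add(child)
                reached.append(child)
                queue.append(child)
    return reached


# Only these systems need chain-of-custody auditing for patient records.
audit_points = downstream(LINEAGE, "ehr_db.patients")
print(audit_points)
```

In this made-up example, `report.staffing` never shows up in the result, so it can be left out of the chain-of-custody effort for patient records.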

Again, data can be in many systems. Those systems can be database servers in your data center, reporting systems used by analysts, temporary staging locations used in processing, or any number of other systems an institution may need. Is there a solution that can collect and visualize data lineage across systems and automatically track how it changes over time?

SentryOne Document and Data Lineage

SentryOne Document is a SaaS solution designed to help easily document data sources and data lineage. It currently supports several sources from the Microsoft Data Platform and a few others. The latest addition is Salesforce.

SentryOne Document Data Lineage and impact analysis

As records move from source databases to ETL processes, data warehouses, Power BI reports, and beyond, the data lineage function in SentryOne Document will help users, administrators, and (yes) auditors discover which systems view or modify the data. This same lineage feature will provide a foundation for assembling the chain of custody for this data.

An added benefit for analysts, executives, and data scientists is that when they inevitably question the validity of the data they are seeing, lineage lets them pinpoint where data can change or transform. They can then validate or invalidate what they see quickly and without a heavy investigation that ties up the data platform team for hours or potentially days.
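As a rough sketch of that kind of validation, the Python below walks backward from a report field and lists each transformation along the way. The field names and transformation notes are invented for the example, and it assumes each field has a single upstream parent, which real lineage often does not.

```python
# Hypothetical lineage metadata: field -> (upstream field, transformation applied).
UPSTREAM = {
    "report.avg_stay_days": ("warehouse.fact_stay.stay_days", "AVG per department"),
    "warehouse.fact_stay.stay_days": ("staging.stays.discharge_ts",
                                      "discharge_ts minus admit_ts, in days"),
    "staging.stays.discharge_ts": ("ehr_db.encounters.discharged_at", "copied as-is"),
}


def trace_upstream(upstream, field_name):
    """Follow a field back toward its origin, listing each hop and its transformation."""
    steps = []
    seen = {field_name}
    current = field_name
    while current in upstream:
        parent, transform = upstream[current]
        steps.append((current, parent, transform))
        if parent in seen:  # stop if the metadata contains a cycle
            break
        seen.add(parent)
        current = parent
    return steps


for child, parent, transform in trace_upstream(UPSTREAM, "report.avg_stay_days"):
    print(f"{child} <- {parent}: {transform}")
```

Each printed hop is a place where a questioned number could have changed, which is a much shorter list to check than every system in the building.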


Chain of custody for medical data is necessary for HIPAA compliance. Tracking data lineage, among other benefits, reduces time and risk in establishing chain of custody. SentryOne Document provides data lineage tracking plus many additional enterprise features in a convenient SaaS format.

You can try SentryOne Document free for 30 days right now, so what are you waiting for?

Jason has worked in technology for over 20 years. He joined SentryOne in 2006 having held positions in network administration, database administration, and software engineering. During his tenure at SentryOne, Jason has served as senior software developer and founded both Client Services and Product Management. His diverse background with relevant technologies made him the perfect choice to build out both of these functions. As SentryOne experienced explosive growth, Jason returned to lead SentryOne Client Services, where he ensures that SentryOne customers receive the best possible end-to-end experience in the ever-changing world of database performance and productivity.