Manually Add a Metadata Source in SentryOne Document
SentryOne Document supports multiple metadata sources, and we plan to expand available providers in the future. But what if you want to add a metadata source that isn’t currently supported?
This is where SentryOne Document’s Custom Metadata Import, available in both the Software and Cloud edition, can help. Leveraging this provider, you can manually insert any metadata source—DB2, REST API, or others. You can dictate the structure and contents of the custom metadata, including linking the relationships to and from any objects in Lineage, so you can understand the dependencies between all your metadata sources.
In this blog post, we’ll walk through Custom Metadata Import basics and how to obtain a sample set of a custom metadata source. We’ll wrap up with a Q&A at the end to discuss the most common questions.
Obtaining a Custom Metadata Import Example
The Custom Metadata Import leverages three Excel files manually written in a simple format, which will surface your customized metadata in the Documentation and Lineage tabs of the product, respectively. You can obtain an example set of these files at any time once you have installed the Remote Agent software. The example files represent a manually built SQL Server data source.
After you’ve installed the Remote Agent, you can download the example files discussed below by configuring a solution and selecting a source type of Custom metadata import. You can find the steps by navigating to this link, clicking Custom Metadata on the top navigation bar in SentryOne Document, and then following the steps listed to “Save a set of example files.”
Custom Metadata Import Files Explained
Three files are responsible for inserting your customized metadata into SentryOne Document. Let’s look at each of the files in detail.
The Objects File
The Objects file is the first file that should be built because it’s responsible for creating the hierarchy and drill-down functionality of the documentation as it’s seen and explored within SentryOne Document’s Documentation tab. There are five columns in this file, some of which tie into the Lineage Links and Properties files. The screenshot below shows the correlation between the Objects file and the structure of the documentation.
Column A: ObjectID
The ObjectID column contains a unique identifier. Each row in the Objects file is required to have a unique ObjectID.
Column B: ParentObjectID
The ParentObjectID column is responsible for nesting an entry under another entry by using the ObjectID of the parent. If an entry has a ParentObjectID of 0, the entry will remain at the root level of the documentation. You can use a ParentObjectID of 0 to have multiple custom metadata sources within a single Objects file if desired. For example, Database1 has a ParentObjectID of 2, which means Database1 will be nested under the object with the ObjectID of 2.
Column C: ObjectName
The ObjectName column is used to display the name of the object within the documentation.
Column D: ObjectType
This feature is currently disabled and might be coming in a future update based on customer demand. You can insert any nominal value here but don’t delete the ObjectType column. I recommend you use this as a descriptor of what the object is, such as Table or ServerName.
Column E: EndpointName
If you don’t plan to use Lineage for your custom metadata source, you can leave this column empty.
This column is leveraged within the Lineage Links file to build the dependencies and connections between all objects within Lineage. Note, not all rows in the Objects file need to have a value for EndpointName—only the rows you want to appear within the Lineage graph. For example, it makes sense to include specified Tables and Columns, but the TableCollection grouping your Tables has no value in the Lineage diagram.
This field must be comprised of the string used in the ObjectName column. Use period separation to indicate nesting within the Lineage Object Explorer.
Below is an example of period separation in the Objects file for MyServer.Database1.Table1 and how this is surfaced in the Lineage drill-down menu. To see Table1 in the Lineage view, I must drill into MyServer, then Database1, and I can then select Table1.
The Lineage Links File
Note: If you don’t plan to build Lineage connections for your custom metadata source, you can load an empty Excel .csv file.
The Lineage Links file is responsible for surfacing the dependencies between your custom metadata source objects in Lineage. This includes how your custom metadata source displays interconnections between objects within itself and how those objects connect to the rest of the supported metadata sources such as SQL Server databases, tables, columns, etc. There are three columns for each entry.
SourceEndpointName and TargetEndpointName
The SourceEndpointName and TargetEndpointName columns follow similar rules and formatting. The SourceEndpointName column is leveraged when determining the starting point of a Lineage connection. The TargetEndpointName value is leveraged when determining the ending point of a lineage connection. The directional arrow will point toward this object. This does mean that setting the Directional Dependency filter in the Lineage diagram, if used, will change what is or isn’t displayed in your result set.
If your source or target endpoint will use a manually created entry from the Objects file, meaning you have created the object manually, this value must use the value inserted in the EndpointName column of the Objects file. In the example shown in the screenshot below, you can see View1 connects to Table1 using the string generated in the Objects file.
If you want the source or target endpoint connection to link to a native object, you can locate the EndpointName by browsing to the object in Lineage and hovering over the object. (Native object is defined as any metadata source’s object that isn’t built manually through Custom Metadata Import, such as a SQL Server database, SQL Server Analysis Services, Power BI, etc.)
In the screenshot below, the EndpointName of my Address table is SERVERNAME.AdventureWorks2012.Person.Address. I can use this string in either the SourceEndpointName or TargetEndpointName columns to bridge my custom metadata into natively supported data sources within Lineage.
The LinkType column will display the type of link from source to target in the Lineage diagram. This relates to the Filter by Link Type in the Lineage diagram to determine when to display or hide an object in your result set based upon your criteria. Valid values are Object, Column, and ForeignKey. The list below provides definitions of each of these Link Type values and how they apply to the Lineage graph.
- Object: This represents a relationship created by object dependencies. If the Object filter in the Lineage diagram is enabled, you’ll see this in your result set.
- Column: This represents a relationship created by the dependencies on another object’s data, meaning the flow of data between objects. If the Data Lineage filter in the Lineage diagram is enabled, you’ll see this in your result set.
- ForeignKey: This represents a relationship created by key dependencies, such as a foreign key dependency between two schema/tables. If the Key filter in the Lineage diagram is enabled, you’ll see this in your result set.
In some rare cases, one lineage connection might have multiple Link Types. You can apply this by using the format LinkType,LinkType,LinkType in the LinkType column. For example, ForeignKey,Object,Column.
Keep in mind, this will rarely be used, as most connections will only have one Link Type per dependency direction. For example, a SQL Server Reporting Services report does SELECT * FROM Table. Column would go from the table to the report because the data is flowing from the table to report. Object would go from the report to the table because the definition of the report depends on the table. This means you would have two separate entries in the Lineage Links file using a single LinkType each to display communication to and from the report.
Make sure you use the correct LinkType so filtering the result set of Lineage will return the expected result set.
The Properties File
The Properties file is responsible for inserting customized information about your various Metadata objects into the documentation and is completely optional. The Properties file will be useful for inserting whatever custom information needs to be present within the written documentation (e.g., displaying data types or inserting security compliance or developer notations). You can be as descriptive and verbose, or as simple and minimalistic, as your use case demands.
Column A: ObjectId
This directly maps to the ObjectID column in the Objects file, which is the unique identifier for the object. It determines where your customized information will appear within the written documentation. In the example, the Properties file has added information into ObjectId 7, which maps to Column1 in the Objects file. This means our entries of int and 4 will appear in the documentation when we click Column1.
Column B: PropertyKey
This column is a description field and is typically used to describe the purpose of the entry. You can insert any value here, although we recommend you keep them consistent with your documentation requirements. For example, if you want to write developer notations about the given object, you could name this Developer Notes.
Column C: PropertyType
This column is used to describe the type of entry. Typically, you can leave this as System.String or String if you’re going to write out information.
Column D: PropertyValue
This is where your value will be inserted into the documentation. For example, you can write developer notations about the object in question or input a value for a data type or data length.
Run a Snapshot
Once you’ve written and saved the three files, you’re ready to run a snapshot. Keep in mind, if you don’t want to see Lineage connections to your custom metadata, you can ignore writing out the Lineage Links file and import a blank CSV.
Once your snapshot is complete, your custom metadata will appear in the Documentation and Lineage tabs of the solution. The granularity of Custom Metadata is highly flexible—you have control over how deep and complex or simple the documentation and lineage is. If you ever want to make changes to your Custom Metadata, all you need to do is add/remove/update the changes to the files, save the changes, and run a new snapshot.
Q: If I only want documentation, do I need to build the Lineage Links file?
A: No. If you only want documentation without lineage for your custom metadata source, you can fill out only the Objects and Properties files, respectively. The output can be as simple or as detailed as your use case demands.
Q: Should we backup our Custom Metadata Import files?
A: Absolutely. We recommend backing up these Excel files, so you don’t lose your work.
Q: Do the order of entries in the files matter? Are they sequential? And do the entries need to be numeric or contiguous?
A: No, the order/numbering/sequence doesn’t matter. I do recommend grouping them in a logical format as it will help you to stay organized.
Q: Can we have multiple custom sources in one set of custom metadata files?
A: Yes, you can. If you’re working with many custom metadata sources in one set of files, we recommend grouping them together in the files to stay organized. However, that isn’t required.
Q: Does Custom Metadata Import require a license?
A: Yes, leveraging Custom Metadata Import will consume a license for a metadata source.
If you aren't already using SentryOne Document, sign up for a free trial to see for yourself how SentryOne Document can support your metadata sources.
As a Solutions Engineer, Jon leverages technical knowledge with consulting skills to provide advice, demonstrations, and support on SentryOne products by being a key point of contact for existing and potential clients. He also works with development teams to provide insight into future development through testing and customer feedback, among other responsibilities.