Faced with ever-expanding volumes and types of data, organizations increasingly need a central catalog for their data assets. Dataplex Catalog, Google Cloud’s next-generation data asset inventory platform, provides a unified inventory for all your metadata, whether your resources are in Google Cloud or on-premises. Today, it is generally available.

What can Dataplex Catalog help you with? 

With Dataplex Catalog, you can search and discover data across your organization, understand its context well enough to judge whether it suits a given consumption need, apply data governance to your data assets, and enrich those assets with additional business and technical metadata that captures the context and knowledge around your data.

Here’s how Dataplex Catalog helps answer your daily data discovery and governance questions:

  • As a data analyst or a business analyst, you can search data resources across the organization and explore associated metadata.
  • As a data producer or governor, you can annotate your data resources, capturing additional technical, semantic and business metadata. 
  • As a data owner, steward or governor, you can bring consistency into your metadata by defining the standards for annotation and custom resources.
  • As a data engineer, you have a unified inventory for your resources, including Google Cloud resources (automatically harvested by Dataplex Catalog) and resources from third-party systems (harvested by you and ingested into Dataplex Catalog).

Dataplex Catalog provides a robust metamodel and a single, easy-to-use API. Here are a few of the benefits of using Dataplex Catalog:

  • An expressive metadata structure allows you to store and work with a wide range of metadata types, including complex structures like lists, maps, and arrays.
  • You can self-configure the metadata structure for your custom resources for efficient ingestion and consistency.
  • You can interact with all the metadata associated with an entry through single atomic CRUD operations, and fetch multiple metadata annotations per entry in search or list responses.
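
To make these three points concrete, here is a minimal in-memory sketch in Python. It is not the Dataplex Catalog API: `AnnotationTemplate`, `Entry`, `InMemoryCatalog`, and the `ownership` and `quality` templates are hypothetical names invented for illustration. It only shows the general idea of defining a metadata structure up front, storing nested values such as lists and maps, and writing all of an entry's annotations in one all-or-nothing operation.

```python
import copy
from dataclasses import dataclass, field
from typing import Any


# Hypothetical model for illustration only; none of these names come from
# the Dataplex Catalog API or client libraries.
@dataclass
class AnnotationTemplate:
    """A standard for one kind of annotation: field name -> expected type."""
    name: str
    fields: dict[str, type]

    def validate(self, values: dict[str, Any]) -> None:
        unknown = set(values) - set(self.fields)
        if unknown:
            raise ValueError(f"{self.name}: unknown fields {sorted(unknown)}")
        for key, expected in self.fields.items():
            if key not in values:
                raise ValueError(f"{self.name}: missing field '{key}'")
            if not isinstance(values[key], expected):
                raise TypeError(
                    f"{self.name}: field '{key}' must be {expected.__name__}"
                )


@dataclass
class Entry:
    """A catalog entry plus every annotation attached to it."""
    name: str
    annotations: dict[str, dict[str, Any]] = field(default_factory=dict)


class InMemoryCatalog:
    def __init__(self, templates: list[AnnotationTemplate]) -> None:
        self._templates = {t.name: t for t in templates}
        self._entries: dict[str, Entry] = {}

    def upsert_entry(self, name: str, annotations: dict[str, dict[str, Any]]) -> Entry:
        # Validate every annotation before storing anything, so a bad
        # annotation leaves the existing entry untouched (all-or-nothing).
        for template_name, values in annotations.items():
            if template_name not in self._templates:
                raise KeyError(f"no template named '{template_name}'")
            self._templates[template_name].validate(values)
        entry = Entry(name, copy.deepcopy(annotations))
        self._entries[name] = entry
        return entry

    def get_entry(self, name: str) -> Entry:
        return self._entries[name]

    def search(self, text: str) -> list[Entry]:
        # Each result carries all of the entry's annotations.
        needle = text.lower()
        return [e for e in self._entries.values() if needle in e.name.lower()]


# Templates define the annotation standard once; entries must conform to it.
ownership = AnnotationTemplate("ownership", {"owner": str, "stewards": list})
quality = AnnotationTemplate("quality", {"checks": dict})
catalog = InMemoryCatalog([ownership, quality])

# One write carries both annotations, including a list and a map.
catalog.upsert_entry(
    "sales.orders",
    {
        "ownership": {"owner": "data-team", "stewards": ["ana", "raj"]},
        "quality": {"checks": {"row_count": "passed"}},
    },
)
```

In this sketch, validation runs before anything is stored, which is one simple way to get atomic behavior in a single process. A real catalog service would enforce this on the server side.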

There are no charges for basic API operations (create, read, update, delete) or for searches performed against individual Dataplex Catalog resources. However, there is a charge for metadata storage. Dataplex Catalog is available in the console, through the API, and through the `gcloud` CLI, and it has full Terraform provider support.

Google Cloud strives to make it easy for partners to integrate with Google, amplifying our collective value. Google works closely with a diverse range of partners to extend its data management capabilities into hybrid and multi-cloud environments. Today, Dataplex Catalog is integrated with Collibra, so customers that use both Dataplex and Collibra can streamline governance across their cloud, on-premises, self-managed, and edge locations. Stay tuned for announcements about additional partnerships that will further enhance Google's data management capabilities and deliver even greater value to Google customers.