Announcing vector support in PostgreSQL services to power AI-enabled applications
Generative AI is ushering in a new wave of experiences that transform how we interact with information, brands, and one another. Google believes that almost every application will eventually be AI-driven, and that developers will need the right tools to build these new experiences into their applications.
These experiences will leverage application state, user inputs, and operational data to provide relevant context to better serve users’ needs. And since operational databases are already at the heart of some of these applications, they will be critical to enabling new generative AI user experiences.
Today, Google is announcing support for storing and efficiently querying vectors in Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL, empowering you to unlock the power of generative AI in your applications. Now you can use your database to store and index vector embeddings generated by large language models (LLMs) via the popular pgvector PostgreSQL extension, efficiently find similar items using exact and approximate nearest neighbor search, and leverage relational database data and features to further enrich and process the data.
Vector embeddings are numerical representations typically used to transform complex user-generated content like text, audio, and video into a form that can be easily stored, manipulated, and indexed. These representations are generated by embeddings models such that, if two pieces of content are semantically similar, their respective embeddings are located near each other in the embedding vector space. Vector embeddings are then indexed and used to efficiently filter data based on similarity.
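To make the idea of "near each other in the embedding vector space" concrete, here is a minimal Python sketch of an exact nearest-neighbor lookup using cosine similarity. The three-dimensional vectors and item names are made up for illustration; a real embeddings model produces vectors with hundreds of dimensions, and the database would do this ranking for you.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings (assumption: not output from any real model).
EMBEDDINGS = {
    "red cotton t-shirt": [0.9, 0.1, 0.0],
    "crimson cotton tee": [0.8, 0.2, 0.1],
    "stainless steel kettle": [0.0, 0.1, 0.9],
}


def most_similar(query_name, embeddings, k=1):
    """Exact search: score every other item against the query, keep the top k."""
    query_vec = embeddings[query_name]
    scored = [
        (cosine_similarity(query_vec, vec), name)
        for name, vec in embeddings.items()
        if name != query_name
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]
```

Here the two shirt descriptions point in nearly the same direction, so each is the other's nearest neighbor, while the kettle sits far away. Approximate nearest neighbor indexes trade a small amount of accuracy to avoid scoring every row.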
For example, as a clothing retailer, you might want to surface product recommendations that are similar to the items in a user’s cart. Today, you can use Vertex AI’s pre-trained text embeddings model to generate embeddings based on product descriptions. The pgvector extension allows you to store and index these embeddings in your Cloud SQL or AlloyDB database, and easily query for similar products. Because these embeddings are stored with your operational data, you can further tailor the result based on what you know about the user and the product. For example, you can filter on structured data like price or category, or join the results with real-time operational data like user profile and inventory data to make sure that you’re only showing items that are in stock in the user’s size.
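The retailer scenario above might look roughly like the following sketch, written as SQL held in Python strings so it can be passed to a PostgreSQL driver such as psycopg. The table and column names (`products`, `inventory`, `quantity`, and so on), the 768-dimension vector size, and the `lists = 100` index setting are all assumptions made for illustration; the `vector` type, the `ivfflat` index method, and the `<=>` cosine distance operator come from pgvector.

```python
# Hypothetical schema: table names, columns, and dimensions are assumptions.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS products (
    id        bigserial PRIMARY KEY,
    name      text NOT NULL,
    category  text NOT NULL,
    price     numeric NOT NULL,
    embedding vector(768)  -- must match the embeddings model's output size
);

-- Approximate nearest neighbor index using cosine distance.
CREATE INDEX IF NOT EXISTS products_embedding_idx
    ON products USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);
"""

# Combine the vector ordering with ordinary filters and a join on a
# hypothetical inventory table, so only in-stock items in the user's
# size are returned.
SIMILAR_IN_STOCK_SQL = """
SELECT p.id, p.name, p.price
FROM products AS p
JOIN inventory AS i ON i.product_id = p.id
WHERE p.category = %(category)s
  AND p.price <= %(max_price)s
  AND i.size = %(size)s
  AND i.quantity > 0
ORDER BY p.embedding <=> %(query_embedding)s::vector
LIMIT %(limit)s;
"""


def to_pgvector_literal(embedding):
    """Format a sequence of numbers in pgvector's text form, e.g. '[0.1,2.0]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_params(cart_item_embedding, category, max_price, size, limit=5):
    """Build the named parameters for SIMILAR_IN_STOCK_SQL."""
    return {
        "query_embedding": to_pgvector_literal(cart_item_embedding),
        "category": category,
        "max_price": max_price,
        "size": size,
        "limit": limit,
    }
```

With a driver connection in hand, the query would be run with something like `cursor.execute(SIMILAR_IN_STOCK_SQL, build_params(embedding, "shirts", 40, "M"))`, where `embedding` is the vector for an item in the user's cart.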
Vector embeddings can also play a critical role in helping developers leverage pre-trained LLMs, which have taken the world by storm in the past year. LLMs are trained on vast amounts of data and can be applied to use cases such as translation, summarization, question answering, and creative writing. They’re useful for ML-driven application development because they can be customized to an application’s specific needs without requiring ML expertise or custom model training. You can customize the output of an LLM by strategically crafting a prompt, and the prompt then allows you to ground the LLM using application-specific contextual data like documentation, user profiles, and chat history.
One thing to note about LLMs is that they have no concept of state. But as every chatbot user knows, the history of the chat is needed in order to keep up the conversation and provide relevant responses. Since models have strict input token limits, it isn’t always possible to provide the full context in the prompt. Embeddings allow you to store large contexts such as documentation or long-term chat histories in your database and filter them to find the most relevant information. You can then feed the most relevant pieces of chat history or documentation to the model to simulate long-term memory and business-specific knowledge.
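As a rough illustration of simulating long-term memory, the Python sketch below ranks stored snippets by similarity to the current question and keeps as many as fit in a token budget before building the prompt. Everything here is an assumption made for the example: the snippets and two-dimensional embeddings are invented, the token count is a crude word count rather than a real tokenizer, and in practice the ranking step would be a similarity query against the database.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def estimate_tokens(text):
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


def select_context(query_embedding, memory, token_budget):
    """Pick the most relevant snippets that fit within token_budget.

    memory is a list of (text, embedding) pairs. Snippets are visited in
    order of decreasing similarity; any that would overflow the budget
    are skipped.
    """
    ranked = sorted(
        memory,
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    selected, used = [], 0
    for text, _ in ranked:
        cost = estimate_tokens(text)
        if used + cost > token_budget:
            continue
        selected.append(text)
        used += cost
    return selected


def build_prompt(question, snippets):
    """Place the retrieved snippets ahead of the user's question."""
    context = "\n".join(f"- {s}" for s in snippets)
    return f"Context:\n{context}\n\nQuestion: {question}"
```

The model itself stays stateless; the application retrieves the relevant history on each turn and includes it in the prompt.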
Key benefits
With vector support built directly into our Cloud SQL and AlloyDB databases, you can create AI-enabled experiences that are aware of application and user state. In addition to basic vector support, Cloud SQL and AlloyDB provide:
- Enterprise-grade serving stacks: Cloud SQL and AlloyDB are fully managed services that can power low-latency serving applications. Both databases offer automatic provisioning and maintenance to reduce database costs, as well as enterprise capabilities such as high availability, data protection, and built-in security backed by our 24/7 SRE team.
- Ease and familiarity: Vector support in PostgreSQL means you can use your existing operational database to power AI-enabled experiences and leverage your existing PostgreSQL skills, along with everything that the PostgreSQL ecosystem has to offer.
- Tight integration with operational data: We believe the best AI-enabled experiences will be deeply integrated with the application, leveraging real-time transactional data to enhance user experiences. AlloyDB and Cloud SQL for PostgreSQL make that possible by embedding AI directly into your operational database, supporting powerful queries across structured and unstructured operational data and combining vector predicates with standard PostgreSQL filters.
When you run your enterprise workloads on AlloyDB or Cloud SQL for PostgreSQL, you can use the power of vector embeddings together with your fully managed operational database.
Integrating with Vertex AI
Cloud SQL and AlloyDB’s vector support is particularly powerful when paired with generative AI services on Vertex AI, including pre-trained foundational and embeddings models across text and images. And with AlloyDB, you can call custom Vertex AI models directly from the database, for high-throughput, low-latency augmented transactions. Together, these provide a toolkit to integrate large language models into applications.
You can also use vector support in conjunction with Vertex AI Matching Engine, the industry’s leading high-scale low-latency vector database, to store embeddings and perform vector similarity matching. Now Cloud SQL and AlloyDB offer similar functionality for using vector embeddings directly in your application, leveraging real-time operational data and relational database features to create richer experiences on the trusted PostgreSQL database.