From 3373fa42fef2c8cc798d0a28e4887ce568feecbe Mon Sep 17 00:00:00 2001
From: Marco Minerva
Date: Thu, 7 Nov 2024 09:42:27 +0100
Subject: [PATCH] Update README.md and modify prompt logic in ChatService

README.md: Updated links to 'sql' branch and corrected property name.
ChatService.cs: Changed prompt separators for better clarity.
---
 README.md                                       | 13 +++++--------
 SqlDatabaseVectorSearch/Services/ChatService.cs |  4 ++--
 2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/README.md b/README.md
index b3397ab..5f15ee1 100644
--- a/README.md
+++ b/README.md
@@ -1,21 +1,18 @@
 # SQL Database Vector Search Sample

 A repository that showcases the native VECTOR type in Azure SQL Database to perform embeddings and RAG with Azure OpenAI.

-> [!IMPORTANT]
-> Usage of this application requires the Vector support feature in Azure SQL Database or Managed Instance, currently in EAP. [See this blog post](https://devblogs.microsoft.com/azure-sql/announcing-eap-native-vector-support-in-azure-sql-database/) for more details.
-
 The application is a Minimal API that exposes endpoints to load documents, generate embeddings and save them into the database as Vectors, and perform searches using Vector Search and RAG. Currently, only PDF files are supported. Vectors are saved and retrieved using direct SQL queries with [Dapper](https://github.com/DapperLib/Dapper). Embedding and Chat Completion are integrated with [Semantic Kernel](https://github.com/microsoft/semantic-kernel).

 > [!NOTE]
 > If you prefer to use Entity Framework Core, check out the [master branch](https://github.com/marcominerva/SqlDatabaseVectorSearch/tree/master).
-![SQL Database Vector Search](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/master/SqlDatabaseVectorSearch.png)
+![SQL Database Vector Search](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/sql/SqlDatabaseVectorSearch.png)

 ### Setup

 - [Create an Azure SQL Database](https://learn.microsoft.com/en-us/azure/azure-sql/database/single-database-create-quickstart) on a server that has the Vector Support feature enabled
-- Execute the [Scripts.sql](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/master/Scripts.sql) file to create the tables needed by the application
-  - You may need to update the size of the [`VECTOR`](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/master/Scripts.sql#L17) column to match the size of the embedding model. Currently, the maximum allowed value is 1998.
-- Open the [appsettings.json](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/master/SqlDatabaseVectorSearch/appsettings.json) file and set the connection string to the database and the other settings required by Azure OpenAI
-  - If your embedding model supports shortening, like **text-embedding-3-small** and **text-embedding-3-large**, and you want to use this feature, you need to set the [`Dimension`](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/master/SqlDatabaseVectorSearch/appsettings.json#L17) property to match the value you have used in the SQL script. If your model doesn't provide this feature, or do you want to use the default size, just leave the [`Dimension`](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/master/SqlDatabaseVectorSearch/appsettings.json#L17) property to NULL. Keep in mind that **text-embedding-3-small** has a dimension of 1536, while **text-embedding-3-large** uses vectors with 3072 elements, so with this latter model it is mandatory to specify a value (that, as said, must be less or equal to 1998).
+- Execute the [Scripts.sql](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/sql/Scripts.sql) file to create the tables needed by the application
+  - You may need to update the size of the [`VECTOR`](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/sql/Scripts.sql#L17) column to match the size of the embedding model. Currently, the maximum allowed value is 1998.
+- Open the [appsettings.json](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/sql/SqlDatabaseVectorSearch/appsettings.json) file and set the connection string to the database and the other settings required by Azure OpenAI
+  - If your embedding model supports shortening, like **text-embedding-3-small** and **text-embedding-3-large**, and you want to use this feature, you need to set the [`Dimensions`](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/sql/SqlDatabaseVectorSearch/appsettings.json#L17) property to match the value you have used in the SQL script. If your model doesn't provide this feature, or do you want to use the default size, just leave the [`Dimensions`](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/sql/SqlDatabaseVectorSearch/appsettings.json#L17) property to NULL. Keep in mind that **text-embedding-3-small** has a dimension of 1536, while **text-embedding-3-large** uses vectors with 3072 elements, so with this latter model it is mandatory to specify a value (that, as said, must be less or equal to 1998).
 - Run the application and start importing your PDF documents.
diff --git a/SqlDatabaseVectorSearch/Services/ChatService.cs b/SqlDatabaseVectorSearch/Services/ChatService.cs
index f6b1628..8b07b2c 100644
--- a/SqlDatabaseVectorSearch/Services/ChatService.cs
+++ b/SqlDatabaseVectorSearch/Services/ChatService.cs
@@ -44,18 +44,18 @@ public class ChatService(IMemoryCache cache, IChatCompletionService chatCompleti

         var prompt = new StringBuilder("""
             Using the following information:
-            ---

             """);

         // TODO: Ensure that chunks are not too long, according to the model max token.
         foreach (var text in chunks)
         {
-            prompt.Append(text);
             prompt.AppendLine("---");
+            prompt.Append(text);
         }

         prompt.AppendLine($"""
+            =====
             Answer the following question:
             ---
             {question}