14 Commits

Author SHA1 Message Date
Marco Minerva fad66a2fbf Update package versions in SqlDatabaseVectorSearch.csproj
Updated Swashbuckle.AspNetCore.SwaggerUI from 7.1.0 to 7.2.0.
Updated TinyHelpers.AspNetCore from 4.0.5 to 4.0.6.
These updates likely include bug fixes, performance improvements,
or new features.
2024-12-11 10:01:30 +01:00
Marco Minerva 33c8fcb9dc Switch to OpenAPI and hybrid caching mechanism
Updated Program.cs to use TinyHelpers.AspNetCore.OpenApi and Microsoft.Extensions.Caching.Hybrid. Refactored ChatService.cs to use HybridCache for chat history management. Removed MessageLimit property from AppSettings.cs and appsettings.json. Updated SqlDatabaseVectorSearch.csproj to include new caching package and replace Swagger with Swagger UI.
2024-12-10 11:57:37 +01:00
Marco Minerva 09cd5cb9c7 Update TinyHelpers.AspNetCore to version 4.0.5
The version of the `TinyHelpers.AspNetCore` package has been updated from `4.0.4` to `4.0.5` in the `SqlDatabaseVectorSearch.csproj` file. This update likely includes bug fixes, improvements, or new features provided in the newer version of the package.
2024-12-04 15:35:03 +01:00
Marco Minerva 2b669c191e Update package versions and add new TinyHelpers reference
Updated Microsoft.SemanticKernel to 1.31.0, Swashbuckle.AspNetCore to 7.1.0, and TinyHelpers.AspNetCore to 4.0.4. Added new package reference for TinyHelpers.AspNetCore.Swashbuckle version 4.0.5.
2024-12-04 10:58:41 +01:00
Marco Minerva 2c5c164098 Update Microsoft.SemanticKernel to 1.30.0
Updated the Microsoft.SemanticKernel package from version 1.27.0 to 1.30.0 in the SqlDatabaseVectorSearch.csproj file. This update may include bug fixes, new features, or other improvements.
2024-11-21 17:52:03 +01:00
Marco Minerva aadab97133 Update code style, ChatService, VectorSearch, and .NET 9.0
Updated .editorconfig with new code style preferences.
Enhanced ChatService prompt string with a new directive.
Modified VectorSearchService using directives and tuple order.
Upgraded SqlDatabaseVectorSearch to target .NET 9.0 and updated packages.
2024-11-21 17:51:35 +01:00
Marco Minerva 3373fa42fe Update README.md and modify prompt logic in ChatService
README.md: Updated links to 'sql' branch and corrected property name.
ChatService.cs: Changed prompt separators for better clarity.
2024-11-07 09:42:27 +01:00
Marco Minerva 6c423fb306 Rename 'result' to 'text' in foreach loop for clarity
The variable name `result` in the `foreach` loop has been changed to `text` for better clarity and consistency. This change affects the loop that appends chunks to the `prompt` variable.
2024-11-06 17:27:47 +01:00
Marco Minerva 29b8ebe283 Change ChatService to singleton, update package version
- Changed ChatService registration in Program.cs to singleton.
- Reformatted ChatHistory initialization in ChatService.cs.
- Modified prompt construction to avoid new lines after chunks.
- Updated Microsoft.SemanticKernel package to version 1.27.0.
2024-11-06 17:23:12 +01:00
Marco Minerva 091f76e0c6 Update Microsoft.SemanticKernel to v1.26.0
The Microsoft.SemanticKernel package reference in the SqlDatabaseVectorSearch.csproj file has been updated from version 1.25.0 to 1.26.0.
2024-11-05 17:07:43 +01:00
Marco Minerva c8c989b42c Update README: clarify use of Dapper and EF Core
Revised the application description in README.md to specify the use of direct SQL queries with Dapper for saving and retrieving Vectors. The note about using Entity Framework Core has been moved and rephrased for better clarity.
2024-11-05 11:24:48 +01:00
Marco Minerva 017dda0785 Fix hyperlink to master branch in README.md
Corrected the URL in README.md to point to the master branch for using Entity Framework Core, ensuring users are directed to the correct branch.
2024-10-31 15:26:55 +01:00
Marco Minerva 6c5292d6c7 Clarify Vector support requirements in README.md
Expanded the Vector support requirement to include both Azure SQL Database and Managed Instance, both currently in EAP. Improved wording for clarity in the note about using direct SQL queries with Dapper.
2024-10-31 15:26:01 +01:00
Marco Minerva 1fc6d3c945 Update README: Add note on vector storage with Dapper
The README.md file has been updated to include a new note about how vectors are saved and retrieved using direct SQL queries with Dapper. Additionally, it provides a link to the master branch for those who prefer to use Entity Framework Core instead. This addition helps clarify the technologies used and offers options for different preferences.
2024-10-31 15:25:06 +01:00
67 changed files with 325 additions and 3051 deletions
-8
@@ -22,7 +22,6 @@ dotnet_style_operator_placement_when_wrapping = beginning_of_line
dotnet_style_object_initializer = true:suggestion
dotnet_style_coalesce_expression = true:suggestion
dotnet_style_collection_initializer = true:suggestion
dotnet_style_prefer_collection_expression = when_types_loosely_match:suggestion
dotnet_style_prefer_simplified_boolean_expressions = true:suggestion
dotnet_style_prefer_conditional_expression_over_assignment = false:silent
dotnet_style_prefer_conditional_expression_over_return = false:silent
@@ -82,7 +81,6 @@ csharp_style_prefer_local_over_anonymous_function = true:silent
csharp_style_prefer_extended_property_pattern = true:suggestion
csharp_style_implicit_object_creation_when_type_is_apparent = true:silent
csharp_style_prefer_tuple_swap = true:silent
csharp_style_prefer_simple_property_accessors = true:suggestion
# Field preferences
dotnet_style_readonly_field = true:suggestion
@@ -298,9 +296,3 @@ dotnet_diagnostic.IDE0010.severity = none
# IDE0072: Add missing cases
dotnet_diagnostic.IDE0072.severity = none
# IDE0305: Simplify collection initialization
dotnet_diagnostic.IDE0305.severity = none
# CA1873: Avoid potentially expensive logging
dotnet_diagnostic.CA1873.severity = none
-84
@@ -1,84 +0,0 @@
## General
- Make only high confidence suggestions when reviewing code changes.
- Always use the latest version of C# and its features (currently C# 14).
- Write code that is clean, maintainable, and easy to understand.
- Only add comments rarely to explain why a non-intuitive solution was used. The code should be self-explanatory otherwise.
- Don't add the UTF-8 BOM to files unless they have non-ASCII characters.
- Never change global.json unless explicitly asked to.
- Never change package.json or package-lock.json files unless explicitly asked to.
- Never change NuGet.config files unless explicitly asked to.
## Code Style
### Formatting
- Apply code-formatting style defined in `.editorconfig`.
- Use primary constructors where applicable.
- Prefer file-scoped namespace declarations and single-line using directives.
- Insert a newline before the opening curly brace of any code block (e.g., after `if`, `for`, `while`, `foreach`, `using`, `try`, etc.).
- Ensure that the final return statement of a method is on its own line.
- Use pattern matching and switch expressions wherever possible.
- Prefer using collection expressions when possible
- Use `is` pattern matching instead of `as` and null checks
- Use `nameof` instead of string literals when referring to member names.
- Prefer `?.` if applicable (e.g. `scope?.Dispose()`).
- Use `ObjectDisposedException.ThrowIf` where applicable.
- Use `ArgumentNullException.ThrowIfNull` to validate input parameters.
- If you add new code files, ensure they are listed in the csproj file (if other files in that folder are listed there) so they build.
### Nullable Reference Types
- Declare variables non-nullable, and check for `null` at entry points.
- Always use `is null` or `is not null` instead of `== null` or `!= null`.
- Trust the C# null annotations and don't add null checks when the type system says a value cannot be null.
## Architecture and Design Patterns
### Asynchronous Programming
- Provide both synchronous and asynchronous versions of methods where appropriate.
- Use the `Async` suffix for asynchronous methods.
- Return `Task` or `ValueTask` from asynchronous methods.
- Use `CancellationToken` parameters to support cancellation.
- Avoid async void methods except for event handlers.
- Use `ConfigureAwait(false)` only in library code that may be consumed by apps with a `SynchronizationContext` (e.g., classic ASP.NET, WPF, WinForms); it is generally unnecessary in ASP.NET Core.
### Error Handling
- Use appropriate exception types.
- Include helpful error messages.
- Avoid catching exceptions without rethrowing them.
### Performance Considerations
- Be mindful of performance implications, especially for database operations.
- Avoid unnecessary allocations.
- Consider using more efficient code that is expected to be on the hot path, even if it is less readable.
### Implementation Guidelines
- Write code that is secure by default. Avoid exposing potentially private or sensitive data.
- Make code NativeAOT compatible when possible. This means avoiding dynamic code generation, reflection, and other features that are not compatible with NativeAOT. If not possible, mark the code with an appropriate annotation or throw an exception.
## Documentation
- Include XML documentation for all public APIs. Mention the purpose, intent, and 'the why' of the code, so developers unfamiliar with the project can better understand it. If comments already exist, update them to meet the aforementioned criteria if needed. Use the full syntax of XML Doc Comments to make them as awesome as possible, including references to types. Don't add any documentation that is obvious even to novice developers reading the code.
- Add proper `<remarks>` tags with links to relevant documentation where helpful.
- For keywords like `null`, `true` or `false` use `<see langword="*" />` tags.
- Include code examples in documentation where appropriate.
- Overriding members should inherit the XML documentation from the base type via `/// <inheritdoc />`.
## Markdown
- Use Markdown for documentation files (e.g., README.md).
- Use triple backticks for code blocks, JSON snippets and bash commands, specifying the language (e.g., ```csharp, ```json and ```bash).
## Testing
- When adding new unit tests, strongly prefer to add them to existing test code files rather than creating new code files.
- We use xUnit SDK v3 for tests.
- Do not emit "Act", "Arrange" or "Assert" comments.
- Use NSubstitute for mocking in tests.
- Copy existing style in nearby files for test method names and capitalization.
- When running tests, if possible use filters and check test run counts, or look at test logs, to ensure they actually ran.
- Do not finish work with any tests commented out or disabled that were not previously commented out or disabled.
+13 -336
@@ -1,341 +1,18 @@
# SQL Database Vector Search Sample
A repository that showcases the native VECTOR type in Azure SQL Database to perform embeddings and RAG with Azure OpenAI.
[![.NET 10](https://img.shields.io/badge/.NET-10-blue)](https://dotnet.microsoft.com/en-us/download/dotnet/10.0)
The application is a Minimal API that exposes endpoints to load documents, generate embeddings and save them into the database as Vectors, and perform searches using Vector Search and RAG. Currently, only PDF files are supported. Vectors are saved and retrieved using direct SQL queries with [Dapper](https://github.com/DapperLib/Dapper). Embedding and Chat Completion are integrated with [Semantic Kernel](https://github.com/microsoft/semantic-kernel).
[![Minimal API](https://img.shields.io/badge/Minimal%20API-Available-green)](https://dotnet.microsoft.com/apps/aspnet/apis)
[![Blazor](https://img.shields.io/badge/Blazor-WebApp-purple)](https://dotnet.microsoft.com/apps/aspnet/web-apps/blazor)
A Blazor Web App and Minimal API for performing RAG (Retrieval Augmented Generation) and vector search using the native VECTOR type in Azure SQL Database and Azure OpenAI.
## Table of Contents
- [Overview](#overview)
- [Screenshots](#screenshots)
- [Prerequisites](#prerequisites)
- [Project Structure](#project-structure)
- [Setup](#setup)
- [Supported Features](#supported-features)
- [How to Use](#how-to-use)
- [Limitations & FAQ](#limitations-faq)
- [Contributing](#contributing)
- [License](#license)
---
## Overview
This application allows you to:
- Load documents (PDF, DOCX, TXT, MD)
- Generate embeddings and save them as vectors in Azure SQL Database
- Perform semantic search and RAG using Azure OpenAI
- Interact via a Blazor Web App or programmatically via Minimal API
Embeddings and chat completion are powered by [Semantic Kernel](https://github.com/microsoft/semantic-kernel).
## Screenshots
### Web App
![SQL Database Vector Search Web App](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/master/assets/SqlDatabaseVectorSearch_WebApp.png)
### Web API
![SQL Database Vector Search API](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/master/assets/SqlDatabaseVectorSearch_API.png)
## Prerequisites
- [.NET 10 SDK](https://dotnet.microsoft.com/en-us/download/dotnet/10.0)
- [Azure SQL Database](https://learn.microsoft.com/en-us/azure/azure-sql/database/single-database-create-quickstart)
- Azure OpenAI resource and API keys
## Project Structure
- `SqlDatabaseVectorSearch/` - Main Blazor Web App and API
- `Components/` - Blazor UI components
- `Data/` - EF Core context, migrations, and entities
- `Endpoints/` - Minimal API endpoints
- `Services/` - Business logic and integration services
- `TextChunkers/` - Text splitting utilities
- `Settings/` - Configuration classes
## Setup
1. Clone the repository
```bash
git clone https://github.com/marcominerva/SqlDatabaseVectorSearch.git
```
2. Configure the database and OpenAI settings
- Edit `SqlDatabaseVectorSearch/appsettings.json` and set your Azure SQL connection string and OpenAI settings.
- **Important**: The `ModelId` values for both `ChatCompletion` and `Embedding` are used for token counting via `Microsoft.ML.Tokenizers`. These values must be valid model identifiers supported by the tokenizer library (e.g., `gpt-4o`, `gpt-4`, `gpt-3.5-turbo`, `text-embedding-3-small`, `text-embedding-3-large`, `text-embedding-ada-002`). The `ModelId` may differ from the actual deployment name you're using in Azure OpenAI. For example, for gpt-4.1 and gpt-5 models set the `ModelId` to `gpt-4o` for proper token counting.
- If using embedding models with shortening (e.g., `text-embedding-3-small` or `text-embedding-3-large`), set the `Dimensions` property accordingly. For `text-embedding-3-large`, you must specify a value <= 1998.
- If you change the VECTOR size, update both the [ApplicationDbContext](SqlDatabaseVectorSearch/Data/ApplicationDbContext.cs) and the [Initial Migration](SqlDatabaseVectorSearch/Data/Migrations/00000000000000_Initial.cs).
3. Run the application
```bash
dotnet run --project SqlDatabaseVectorSearch/SqlDatabaseVectorSearch.csproj
```
4. Access the Web App
- Navigate to `https://localhost:5001` (or the port shown in the console)
## Supported features
- **Conversation History with Question Reformulation**: This feature allows users to view the history of their conversations, including the ability to reformulate questions for better clarity and understanding. This ensures that users can track their interactions and refine their queries as needed.
- **Information about Token Usage**: Users can access detailed information about token usage, which helps in understanding the consumption of tokens during interactions. This feature provides transparency and helps users manage their token usage effectively.
- **Response Streaming**: This feature enables real-time streaming of responses, allowing users to receive information as it is being processed. This ensures a seamless and efficient flow of information, enhancing the overall user experience.
- **Citations**: The application provides citations for the sources used to justify each answer. This allows users to verify the information and understand the origin of the content provided by the system.
## How to Use
- **Web App**: Use the Blazor interface to upload documents, search, and chat with RAG.
- **API**: Import documents via `POST /api/documents` and ask questions via `POST /api/ask` or `POST /api/ask-streaming`.
#### Example API Request
```
POST /api/ask
Content-Type: application/json
{
"conversationId": "3d0bd178-499d-433a-b2bc-c35e488d9e2c",
"text": "Why is Mars called the red planet?"
}
```
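As a rough illustration, a client could build the request above as in the following Python sketch. Only the `/api/ask` path and the `conversationId` and `text` fields come from the example; the helper name `build_ask_request` and the base URL `https://localhost:5001` are assumptions made for illustration, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_ask_request(base_url: str, conversation_id: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST /api/ask request with a JSON body."""
    payload = {"conversationId": conversation_id, "text": text}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/ask",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumption: the application listens on https://localhost:5001.
request = build_ask_request(
    "https://localhost:5001",
    "3d0bd178-499d-433a-b2bc-c35e488d9e2c",
    "Why is Mars called the red planet?",
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```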
#### Example API Response
```json
{
"originalQuestion": "why is mars called the red planet?",
"reformulatedQuestion": "Why is the planet Mars called the red planet?",
"answer": "Mars is called the Red Planet because its surface has an orange-red color due to being covered in iron(III) oxide dust, also known as rust. This iron oxide gives Mars its distinctive reddish appearance when observed from Earth and is the origin of its well-known nickname",
"streamState": "End",
"tokenUsage": {
"reformulation": {
"promptTokens": 812,
"completionTokens": 11,
"totalTokens": 823
},
"embeddingTokenCount": 10,
"question": {
"promptTokens": 31708,
"completionTokens": 227,
"totalTokens": 31935
}
},
"citations": [
{
"documentId": "b1870ad7-4685-42a3-576a-08ddb01159d5",
"chunkId": "749aba1e-0db5-4033-cfa6-08ddb0115da3",
"fileName": "Mars.pdf",
"quote": "surface of Mars is orange-red because it is covered in iron(III) oxide",
"pageNumber": 1,
"indexOnPage": 0
},
{
"documentId": "b1870ad7-4685-42a3-576a-08ddb01159d5",
"chunkId": "215e7197-513f-4fbe-cfa8-08ddb0115da3",
"fileName": "Mars.pdf",
"quote": "Martian surface is caused by ferric oxide, or rust",
"pageNumber": 3,
"indexOnPage": 0
}
]
}
```
### How response streaming works
When using the `/api/ask-streaming` endpoint, answers will be streamed as with the typical response from OpenAI. The format of the response is as follows:
```json
[
{
"originalQuestion": "why is mars called the red planet?",
"reformulatedQuestion": "Why is the planet Mars known as the red planet?",
"answer": null,
"streamState": "Start",
"tokenUsage": {
"reformulation": {
"promptTokens": 541,
"completionTokens": 12,
"totalTokens": 553
},
"embeddingTokenCount": 11,
"question": null
},
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": "Mars",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " is",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " known",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " as",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " the",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " red",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " planet",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " because",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " its",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " surface",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " is",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " covered",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " in",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " iron",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
/// ...
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": null,
"streamState": "End",
"tokenUsage": {
"reformulation": null,
"embeddingTokenCount": null,
"question": {
"promptTokens": 30949,
"completionTokens": 221,
"totalTokens": 31170
}
},
"citations": [
{
"documentId": "b1870ad7-4685-42a3-576a-08ddb01159d5",
"chunkId": "749aba1e-0db5-4033-cfa6-08ddb0115da3",
"fileName": "Mars.pdf",
"quote": "surface of Mars is orange-red",
"pageNumber": 1,
"indexOnPage": 0
},
{
"documentId": "b1870ad7-4685-42a3-576a-08ddb01159d5",
"chunkId": "215e7197-513f-4fbe-cfa8-08ddb0115da3",
"fileName": "Mars.pdf",
"quote": "red-orange appearance of the Martian surface is caused by ferric oxide, or rust",
"pageNumber": 3,
"indexOnPage": 0
}
]
}
]
```
- The first piece of the response has the following characteristics:
- The *streamState* property is set to `Start`.
- It contains the question and its reformulation (if not requested, *reformulatedQuestion* will be equal to *originalQuestion*).
- The *tokenUsage* section holds information about tokens used for reformulation (if done) and for the embedding of the question.
- Then, there are as many elements for the actual answer as necessary:
- Each one contains a token.
- The *streamState* property is set to `Append`.
- *originalQuestion*, *reformulatedQuestion*, *tokenUsage* and *citations* are always `null`.
- The stream ends when an element with *streamState* equals `End` is received. This element contains token usage information for the question and the whole answer, and the list of citations.
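The rules above can be sketched as a small client-side reducer. This is a minimal Python sketch, not the application's own code: it assumes the stream elements have already been parsed into dictionaries, and the names `assemble_stream` and `_non_null` are hypothetical helpers introduced for illustration.

```python
from typing import Any, Dict, Iterable, Optional


def _non_null(usage: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Keep only the token-usage entries that are not null in this element."""
    return {key: value for key, value in (usage or {}).items() if value is not None}


def assemble_stream(elements: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Fold already-parsed stream elements into one complete response."""
    response: Dict[str, Any] = {
        "originalQuestion": None,
        "reformulatedQuestion": None,
        "answer": "",
        "tokenUsage": {},
        "citations": [],
    }
    for element in elements:
        state = element.get("streamState")
        if state == "Start":
            # First element: the question, its reformulation, and early token usage.
            response["originalQuestion"] = element.get("originalQuestion")
            response["reformulatedQuestion"] = element.get("reformulatedQuestion")
            response["tokenUsage"].update(_non_null(element.get("tokenUsage")))
        elif state == "Append":
            # Each Append element carries one token of the answer.
            response["answer"] += element.get("answer") or ""
        elif state == "End":
            # Last element: token usage for the answer and the citations.
            response["tokenUsage"].update(_non_null(element.get("tokenUsage")))
            response["citations"] = element.get("citations") or []
            break
    return response


stream = [
    {
        "originalQuestion": "why is mars called the red planet?",
        "reformulatedQuestion": "Why is the planet Mars known as the red planet?",
        "answer": None,
        "streamState": "Start",
        "tokenUsage": {
            "reformulation": {"promptTokens": 541, "completionTokens": 12, "totalTokens": 553},
            "embeddingTokenCount": 11,
            "question": None,
        },
        "citations": None,
    },
    {"answer": "Mars", "streamState": "Append"},
    {"answer": " is", "streamState": "Append"},
    {
        "answer": None,
        "streamState": "End",
        "tokenUsage": {
            "reformulation": None,
            "embeddingTokenCount": None,
            "question": {"promptTokens": 30949, "completionTokens": 221, "totalTokens": 31170},
        },
        "citations": [],
    },
]

complete = assemble_stream(stream)
print(complete["answer"])  # prints "Mars is"
```

Merging only the non-null token-usage entries lets the `Start` and `End` elements each contribute their part without overwriting the other's.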
## Limitations & FAQ
- **VECTOR column size**: Maximum allowed is 1998. For `text-embedding-3-large`, set `Dimensions` <= 1998.
- **Supported file types**: PDF, DOCX, TXT, MD.
- **Known Issues**: See [Issues](https://github.com/marcominerva/SqlDatabaseVectorSearch/issues)
## Contributing
Contributions are welcome! Please open issues or pull requests. For major changes, discuss them first via an issue.
## License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
---
> [!NOTE]
> If you prefer to use straight SQL, check out the [sql branch](https://github.com/marcominerva/SqlDatabaseVectorSearch/tree/sql).
> If you prefer to use Entity Framework Core, check out the [master branch](https://github.com/marcominerva/SqlDatabaseVectorSearch/tree/master).
![SQL Database Vector Search](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/sql/SqlDatabaseVectorSearch.png)
### Setup
- [Create an Azure SQL Database](https://learn.microsoft.com/en-us/azure/azure-sql/database/single-database-create-quickstart) on a server that has the Vector Support feature enabled
- Execute the [Scripts.sql](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/sql/Scripts.sql) file to create the tables needed by the application
- You may need to update the size of the [`VECTOR`](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/sql/Scripts.sql#L17) column to match the size of the embedding model. Currently, the maximum allowed value is 1998.
- Open the [appsettings.json](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/sql/SqlDatabaseVectorSearch/appsettings.json) file and set the connection string to the database and the other settings required by Azure OpenAI
- If your embedding model supports shortening, like **text-embedding-3-small** and **text-embedding-3-large**, and you want to use this feature, you need to set the [`Dimensions`](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/sql/SqlDatabaseVectorSearch/appsettings.json#L17) property to match the value you have used in the SQL script. If your model doesn't provide this feature, or you want to use the default size, just leave the [`Dimensions`](https://github.com/marcominerva/SqlDatabaseVectorSearch/blob/sql/SqlDatabaseVectorSearch/appsettings.json#L17) property set to NULL. Keep in mind that **text-embedding-3-small** has a dimension of 1536, while **text-embedding-3-large** uses vectors with 3072 elements, so with the latter model it is mandatory to specify a value (which, as noted above, must be less than or equal to 1998).
- Run the application and start importing your PDF documents.
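
The sizing rule for `Dimensions` can be checked before running the application. The following Python sketch is only an illustration of the rule described above, not part of the project: the function name `effective_dimensions` and the constants are hypothetical, while the values 1536, 3072 and 1998 are taken from the setup notes.

```python
from typing import Optional

# Maximum VECTOR column size stated in the setup notes.
MAX_VECTOR_SIZE = 1998

# Default embedding sizes mentioned in the setup notes.
DEFAULT_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def effective_dimensions(model: str, dimensions: Optional[int]) -> int:
    """Return the vector size that would be stored, or raise if it cannot fit."""
    size = dimensions if dimensions is not None else DEFAULT_DIMENSIONS.get(model)
    if size is None:
        raise ValueError(f"No known default size for '{model}'; set Dimensions explicitly.")
    if size <= 0:
        raise ValueError("Dimensions must be a positive integer.")
    if size > MAX_VECTOR_SIZE:
        raise ValueError(
            f"'{model}' would produce {size} elements, above the VECTOR limit of {MAX_VECTOR_SIZE}; "
            "set Dimensions to a smaller value."
        )
    return size


print(effective_dimensions("text-embedding-3-small", None))  # prints 1536
print(effective_dimensions("text-embedding-3-large", 1024))  # prints 1024
```

Leaving `Dimensions` unset for **text-embedding-3-large** raises an error in this sketch, which mirrors the note that a value is mandatory for that model.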
Binary file not shown.


+32
@@ -0,0 +1,32 @@
Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio Version 17
VisualStudioVersion = 17.8.34330.188
MinimumVisualStudioVersion = 10.0.40219.1
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "SqlDatabaseVectorSearch", "SqlDatabaseVectorSearch\SqlDatabaseVectorSearch.csproj", "{A30F41AA-3FC1-41BE-99B7-7637A6EADDDC}"
EndProject
Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "Solution Items", "Solution Items", "{0D00EFA8-60BD-47AF-BE33-9D219B8AC7F6}"
ProjectSection(SolutionItems) = preProject
.editorconfig = .editorconfig
Directory.Build.props = Directory.Build.props
README.md = README.md
EndProjectSection
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
Debug|Any CPU = Debug|Any CPU
Release|Any CPU = Release|Any CPU
EndGlobalSection
GlobalSection(ProjectConfigurationPlatforms) = postSolution
{A30F41AA-3FC1-41BE-99B7-7637A6EADDDC}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{A30F41AA-3FC1-41BE-99B7-7637A6EADDDC}.Debug|Any CPU.Build.0 = Debug|Any CPU
{A30F41AA-3FC1-41BE-99B7-7637A6EADDDC}.Release|Any CPU.ActiveCfg = Release|Any CPU
{A30F41AA-3FC1-41BE-99B7-7637A6EADDDC}.Release|Any CPU.Build.0 = Release|Any CPU
EndGlobalSection
GlobalSection(SolutionProperties) = preSolution
HideSolutionNode = FALSE
EndGlobalSection
GlobalSection(ExtensibilityGlobals) = postSolution
SolutionGuid = {F8D9A242-E395-4B2D-BF14-0C15B70E9D10}
EndGlobalSection
EndGlobal
-8
@@ -1,8 +0,0 @@
<Solution>
<Folder Name="/Solution Items/">
<File Path=".editorconfig" />
<File Path="Directory.Build.props" />
<File Path="README.md" />
</Folder>
<Project Path="SqlDatabaseVectorSearch/SqlDatabaseVectorSearch.csproj" />
</Solution>
@@ -1,35 +0,0 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<base href="/" />
<ResourcePreloader />
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-QWTKZyjpPEjISv5WaRU9OFeRpok6YctnYmDr5pNlyT2bRjXh0JMhjY6hW+ALEwIH" crossorigin="anonymous">
<link href="https://cdn.jsdelivr.net/npm/bootstrap-icons@1.11.3/font/bootstrap-icons.css" rel="stylesheet" />
<link href="_content/Blazor.Bootstrap/blazor.bootstrap.css" rel="stylesheet" />
<script src="https://kit.fontawesome.com/f7a7b34f96.js" crossorigin="anonymous"></script>
<link rel="stylesheet" href="@Assets["css/app.css"]" />
<link rel="stylesheet" href="@Assets["SqlDatabaseVectorSearch.styles.css"]" />
<ImportMap />
<link rel="icon" type="image/png" href="favicon.png" />
<HeadOutlet @rendermode="InteractiveServer" />
</head>
<body>
<Routes @rendermode="InteractiveServer" />
<ReconnectModal />
<script src="_framework/blazor.web.js"></script>
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/js/bootstrap.bundle.min.js" integrity="sha384-YvpcrYf0tY3lHB60NNkmXc5s9fDVZLESaAA55NDzOxhy9GkcIdslK1eN7N6jIeHz" crossorigin="anonymous"></script>
<!-- Add chart.js reference if chart components are used in your application. -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/Chart.js/4.0.1/chart.umd.js" integrity="sha512-gQhCDsnnnUfaRzD8k1L5llCCV6O9HN09zClIzzeJ8OJ9MpGmIlCxm+pdCkqTwqJ4JcjbojFr79rl2F1mzcoLMQ==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
<!-- Add chartjs-plugin-datalabels.min.js reference if chart components with data label feature is used in your application. -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/chartjs-plugin-datalabels/2.2.0/chartjs-plugin-datalabels.min.js" integrity="sha512-JPcRR8yFa8mmCsfrw4TNte1ZvF1e3+1SdGMslZvmrzDYxS69J7J49vkFL8u6u8PlPJK+H3voElBtUCzaXj+6ig==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
<!-- Add sortable.js reference if SortableList component is used in your application. -->
<script src="https://cdn.jsdelivr.net/npm/sortablejs@latest/Sortable.min.js"></script>
<script src="_content/Blazor.Bootstrap/blazor.bootstrap.js" asp-append-version="true"></script>
<script src="js/functions.js" asp-append-version="true"></script>
</body>
</html>
@@ -1,59 +0,0 @@
@inherits LayoutComponentBase
<Toasts class="p-3" AutoHide="true" Placement="ToastsPlacement.TopRight" />
<BlazorBootstrapLayout StickyHeader="true">
<HeaderSection>
<a href="/swagger" target="_blank" class="text-decoration-none" title="OpenAPI documentation">
<Icon Name="IconName.FileTypeJson" Class="ps-3 ps-lg-2" Size="IconSize.x2" Color="IconColor.Muted"></Icon>
</a>
<a href="https://github.com/marcominerva/SqlDatabaseVectorSearch" target="_blank" class="text-decoration-none" title="View on GitHub">
<Icon Name="IconName.Github" Class="ps-4 ps-lg-4" Size="IconSize.x2" Color="IconColor.Muted"></Icon>
</a>
</HeaderSection>
<SidebarSection>
<Sidebar2 Href="/"
IconName="IconName.Search"
Title="SQL Vector Search"
DataProvider="Sidebar2DataProvider"
WidthUnit="Unit.Px" />
</SidebarSection>
<ContentSection>
@Body
</ContentSection>
</BlazorBootstrapLayout>
@code {
private IEnumerable<NavItem> navItems = default!;
private Task<Sidebar2DataProviderResult> Sidebar2DataProvider(Sidebar2DataProviderRequest request)
{
if (navItems is null)
{
navItems = GetNavItems();
}
var result = request.ApplyTo(navItems);
return Task.FromResult(result);
}
private IEnumerable<NavItem> GetNavItems()
{
navItems = [
new() { Id = "1", Href = "/", IconName = IconName.HouseDoorFill, Text = "Home", Match = NavLinkMatch.All},
new() { Id = "2", Href= "/documents", IconName = IconName.FileText, Text = "Documents" },
new() { Id = "3", Href = "/ask", IconName = IconName.ChatDots, Text = "Ask"}
];
return navItems;
}
}
<div id="blazor-error-ui" data-nosnippet>
An unhandled error has occurred.
<a href="." class="reload">Reload</a>
<span class="dismiss">🗙</span>
</div>
@@ -1,20 +0,0 @@
#blazor-error-ui {
color-scheme: light only;
background: lightyellow;
bottom: 0;
box-shadow: 0 -1px 2px rgba(0, 0, 0, 0.2);
box-sizing: border-box;
display: none;
left: 0;
padding: 0.6rem 1.25rem 0.7rem 1.25rem;
position: fixed;
width: 100%;
z-index: 1000;
}
#blazor-error-ui .dismiss {
cursor: pointer;
position: absolute;
right: 0.75rem;
top: 0.5rem;
}
@@ -1,31 +0,0 @@
<script type="module" src="@Assets["Components/Layout/ReconnectModal.razor.js"]"></script>
<dialog id="components-reconnect-modal" data-nosnippet>
<div class="components-reconnect-container">
<div class="components-rejoining-animation" aria-hidden="true">
<div></div>
<div></div>
</div>
<p class="components-reconnect-first-attempt-visible">
Rejoining the server...
</p>
<p class="components-reconnect-repeated-attempt-visible">
Rejoin failed... Trying again in <span id="components-seconds-to-next-attempt"></span> seconds.
</p>
<p class="components-reconnect-failed-visible">
Failed to rejoin.<br />Please retry or reload the page.
</p>
<button id="components-reconnect-button" class="components-reconnect-failed-visible">
Retry
</button>
<p class="components-pause-visible">
The session has been paused by the server.
</p>
<button id="components-resume-button" class="components-pause-visible">
Resume
</button>
<p class="components-resume-failed-visible">
Failed to resume the session.<br />Please reload the page.
</p>
</div>
</dialog>
@@ -1,157 +0,0 @@
.components-reconnect-first-attempt-visible,
.components-reconnect-repeated-attempt-visible,
.components-reconnect-failed-visible,
.components-pause-visible,
.components-resume-failed-visible,
.components-rejoining-animation {
display: none;
}
#components-reconnect-modal.components-reconnect-show .components-reconnect-first-attempt-visible,
#components-reconnect-modal.components-reconnect-show .components-rejoining-animation,
#components-reconnect-modal.components-reconnect-paused .components-pause-visible,
#components-reconnect-modal.components-reconnect-resume-failed .components-resume-failed-visible,
#components-reconnect-modal.components-reconnect-retrying,
#components-reconnect-modal.components-reconnect-retrying .components-reconnect-repeated-attempt-visible,
#components-reconnect-modal.components-reconnect-retrying .components-rejoining-animation,
#components-reconnect-modal.components-reconnect-failed,
#components-reconnect-modal.components-reconnect-failed .components-reconnect-failed-visible {
display: block;
}
#components-reconnect-modal {
background-color: white;
width: 20rem;
margin: 20vh auto;
padding: 2rem;
border: 0;
border-radius: 0.5rem;
box-shadow: 0 3px 6px 2px rgba(0, 0, 0, 0.3);
opacity: 0;
transition: display 0.5s allow-discrete, overlay 0.5s allow-discrete;
animation: components-reconnect-modal-fadeOutOpacity 0.5s both;
&[open]
{
animation: components-reconnect-modal-slideUp 1.5s cubic-bezier(.05, .89, .25, 1.02) 0.3s, components-reconnect-modal-fadeInOpacity 0.5s ease-in-out 0.3s;
animation-fill-mode: both;
}
}
#components-reconnect-modal::backdrop {
background-color: rgba(0, 0, 0, 0.4);
animation: components-reconnect-modal-fadeInOpacity 0.5s ease-in-out;
opacity: 1;
}
@keyframes components-reconnect-modal-slideUp {
0% {
transform: translateY(30px) scale(0.95);
}
100% {
transform: translateY(0);
}
}
@keyframes components-reconnect-modal-fadeInOpacity {
0% {
opacity: 0;
}
100% {
opacity: 1;
}
}
@keyframes components-reconnect-modal-fadeOutOpacity {
0% {
opacity: 1;
}
100% {
opacity: 0;
}
}
.components-reconnect-container {
display: flex;
flex-direction: column;
align-items: center;
gap: 1rem;
}
#components-reconnect-modal p {
margin: 0;
text-align: center;
}
#components-reconnect-modal button {
border: 0;
background-color: #6b9ed2;
color: white;
padding: 4px 24px;
border-radius: 4px;
}
#components-reconnect-modal button:hover {
background-color: #3b6ea2;
}
#components-reconnect-modal button:active {
background-color: #6b9ed2;
}
.components-rejoining-animation {
position: relative;
width: 80px;
height: 80px;
}
.components-rejoining-animation div {
position: absolute;
border: 3px solid #0087ff;
opacity: 1;
border-radius: 50%;
animation: components-rejoining-animation 1.5s cubic-bezier(0, 0.2, 0.8, 1) infinite;
}
.components-rejoining-animation div:nth-child(2) {
animation-delay: -0.5s;
}
@keyframes components-rejoining-animation {
0% {
top: 40px;
left: 40px;
width: 0;
height: 0;
opacity: 0;
}
4.9% {
top: 40px;
left: 40px;
width: 0;
height: 0;
opacity: 0;
}
5% {
top: 40px;
left: 40px;
width: 0;
height: 0;
opacity: 1;
}
100% {
top: 0px;
left: 0px;
width: 80px;
height: 80px;
opacity: 0;
}
}
@@ -1,63 +0,0 @@
// Set up event handlers
const reconnectModal = document.getElementById("components-reconnect-modal");
reconnectModal.addEventListener("components-reconnect-state-changed", handleReconnectStateChanged);
const retryButton = document.getElementById("components-reconnect-button");
retryButton.addEventListener("click", retry);
const resumeButton = document.getElementById("components-resume-button");
resumeButton.addEventListener("click", resume);
function handleReconnectStateChanged(event) {
if (event.detail.state === "show") {
reconnectModal.showModal();
} else if (event.detail.state === "hide") {
reconnectModal.close();
} else if (event.detail.state === "failed") {
document.addEventListener("visibilitychange", retryWhenDocumentBecomesVisible);
} else if (event.detail.state === "rejected") {
location.reload();
}
}
async function retry() {
document.removeEventListener("visibilitychange", retryWhenDocumentBecomesVisible);
try {
// Reconnect will asynchronously return:
// - true to mean success
// - false to mean we reached the server, but it rejected the connection (e.g., unknown circuit ID)
// - exception to mean we didn't reach the server (this can be sync or async)
const successful = await Blazor.reconnect();
if (!successful) {
// We have been able to reach the server, but the circuit is no longer available.
// Try to resume the circuit first; reload the page only if that also fails.
const resumeSuccessful = await Blazor.resumeCircuit();
if (!resumeSuccessful) {
location.reload();
} else {
reconnectModal.close();
}
}
} catch (err) {
// We got an exception, server is currently unavailable
document.addEventListener("visibilitychange", retryWhenDocumentBecomesVisible);
}
}
async function resume() {
try {
const successful = await Blazor.resumeCircuit();
if (!successful) {
location.reload();
}
} catch {
location.reload();
}
}
async function retryWhenDocumentBecomesVisible() {
if (document.visibilityState === "visible") {
await retry();
}
}
@@ -1,348 +0,0 @@
@page "/ask"
@using System.Text.RegularExpressions
@inject IServiceProvider ServiceProvider
@inject IJSRuntime JSRuntime
<PageTitle>Chat with your data</PageTitle>
<div class="card mx-auto mt-2">
<div class="card-body">
@foreach (var message in messages)
{
if (message.Role == "user")
{
<div class="d-flex align-items-baseline text-end justify-content-end">
<div class="pe-2">
<div>
<div class="card card-text d-inline-block p-2 px-3 m-1">
<Markdown style="overflow-y:auto;">@message.Text</Markdown>
</div>
</div>
</div>
<div class="position-relative avatar">
<Image src="/images/user.png" class="img-fluid rounded-circle" alt="" />
</div>
</div>
}
else if (message.Role == "assistant")
{
<div class="d-flex align-items-baseline">
<div class="position-relative avatar">
<Image src="/images/assistant.png" class="img-fluid rounded-circle" alt="" />
</div>
<div class="pe-2">
<div>
@if (message.Text is null)
{
<div class="card card-text d-inline-block p-3 px-3 m-1">
<div class="progress-chat" role="progressbar" aria-label="I'm thinking" aria-valuenow="0" aria-valuemin="0" aria-valuemax="100">
<div class="progress-bar-chat">
<div class="progress-bar-indeterminate"></div>
</div>
</div>
</div>
}
else
{
<div class="card card-text d-inline-block p-2 px-3 m-1">
<div class="message-content">
<div class="streaming-content">
<div class="streaming-text @(message.Status == MessageStatus.Streaming ? "streaming-text-with-spinner" : "")">
<Markdown style="overflow-y:auto;">@message.Text</Markdown>
</div>
@if (message.Status == MessageStatus.Streaming)
{
<div class="streaming-spinner-bottom-left">
<Spinner Size="SpinnerSize.Small" Color="SpinnerColor.Primary" />
</div>
}
</div>
</div>
@if (message.Status == MessageStatus.Completed)
{
<div class="d-flex justify-content-between">
<div class="text-start bg-transparent mt-3">
<Tooltip Title="@message.TokenUsage" IsHtml="true" Color="TooltipColor.Primary" Placement="TooltipPlacement.Bottom">
<Icon Class="d-flex text-body-secondary" Name="IconName.InfoCircle"></Icon>
</Tooltip>
</div>
<div class="text-end bg-transparent">
<Tooltip Title="@toolTipText" Color="TooltipColor.Dark" Placement="TooltipPlacement.Bottom">
<Button Type="ButtonType.Button" Outline="false" @onclick="@(async () => await CopyToClipboardAsync(message.Text))">
@if (showCopyConfirmation)
{
<Icon Name="IconName.Check" Class="text-success" />
}
else
{
<Icon Name="IconName.Clipboard" />
}
</Button>
</Tooltip>
</div>
</div>
@if (message.Citations?.Any() == true)
{
<div class="mt-3 d-flex flex-wrap">
@foreach (var citation in message.Citations)
{
<div class="border rounded p-2 me-2 mb-2 citation-box small">
<div>
<strong>@citation.FileName</strong> @if (citation.PageNumber.GetValueOrDefault() > 0)
{
<span class="ms-2">page @citation.PageNumber</span>
}
</div>
<div class="text-secondary small mt-1">@citation.Quote</div>
</div>
}
</div>
}
}
</div>
}
</div>
</div>
</div>
}
}
<div @ref="chat"></div>
</div>
<div class="card-footer bg-white w-100 bottom-0 m-0 p-1">
<div class="input-group">
<span class="input-group-text bg-transparent border-0">
<Tooltip Title="Messages aren't stored in any way on either the client or the server." Color="TooltipColor.Primary" Placement="TooltipPlacement.Bottom">
<Icon Class="d-flex text-body-secondary" Name="IconName.InfoCircle"></Icon>
</Tooltip>
</span>
<input @ref="askInput" type="text" @bind="@question" @bind:event="oninput" placeholder="Ask me anything..." class="form-control border-0" maxlength="2000" @onkeydown="HandleKeyDown" />
<div class="input-group-text bg-transparent border-0">
<Button Type="ButtonType.Submit" @ref="askButton" Color="ButtonColor.Primary" Disabled="@(isAsking || string.IsNullOrWhiteSpace(question))" @onclick="AskQuestion">
<Icon Name="IconName.Send" />
</Button>
<Button Type="ButtonType.Reset" @ref="resetButton" Class="ms-2" Color="ButtonColor.Secondary" Disabled="@isAsking" @onclick="Reset">
<Icon CustomIconName="bi bi-x-lg" />
</Button>
</div>
</div>
</div>
</div>
@code
{
private Button askButton = default!;
private Button resetButton = default!;
private ElementReference askInput = default!;
private ElementReference chat = default!;
private IList<Message> messages = [];
private string? question;
private Guid conversationId = Guid.NewGuid();
private bool isAsking = false;
private bool showCopyConfirmation = false;
private string toolTipText = "Copy to Clipboard";
protected override async Task OnAfterRenderAsync(bool firstRender)
{
if (!firstRender)
{
return;
}
await JSRuntime.InvokeVoidAsync("setFocus", askInput);
}
private async Task HandleKeyDown(KeyboardEventArgs e)
{
if (isAsking)
{
return;
}
if (e.Key == "Enter" && !string.IsNullOrWhiteSpace(question))
{
await AskQuestion();
}
else if (e.Key == "ArrowUp" && messages.Count >= 2)
{
question = messages[^2].Text;
}
}
private async Task AskQuestion()
{
isAsking = true;
var userQuestion = new Question(conversationId, question!);
var userMessage = new Message { Text = userQuestion.Text, Role = "user", Status = MessageStatus.Completed };
messages.Add(userMessage);
var assistantMessage = new Message { Role = "assistant", Status = MessageStatus.New };
messages.Add(assistantMessage);
question = null;
await Task.Yield();
await EnsureMessageIsVisibleAsync();
try
{
await using var scope = ServiceProvider.CreateAsyncScope();
var vectorSearchService = scope.ServiceProvider.GetRequiredService<VectorSearchService>();
var response = vectorSearchService.AskStreamingAsync(userQuestion);
await foreach (var delta in response)
{
if (delta.StreamState == StreamState.Start)
{
userMessage.Text = delta.ReformulatedQuestion;
assistantMessage.TokenUsage = FormatTokenUsage(delta.TokenUsage);
assistantMessage.Status = MessageStatus.Streaming;
}
else if (delta.StreamState == StreamState.Append)
{
// Adds tokens to the assistant message as they are received.
assistantMessage.Text += delta.Answer;
}
else if (delta.StreamState == StreamState.End)
{
// Get citations from the response. Materialize the list once so the
// projection is not re-evaluated on every render.
assistantMessage.Citations = delta.Citations?.Select(c => new Citation
{
DocumentId = c.DocumentId,
ChunkId = c.ChunkId,
FileName = c.FileName,
Quote = c.Quote,
PageNumber = c.PageNumber,
IndexOnPage = c.IndexOnPage
}).ToList();
assistantMessage.Status = MessageStatus.Completed;
assistantMessage.TokenUsage += FormatTokenUsage(delta.TokenUsage);
}
await Task.Yield();
StateHasChanged();
await EnsureMessageIsVisibleAsync();
}
}
catch (Exception ex)
{
assistantMessage.Text = $"There was an error while processing the question: {ex.Message}";
assistantMessage.Status = MessageStatus.Completed;
}
finally
{
await EnsureMessageIsVisibleAsync();
isAsking = false;
}
}
private void Reset()
{
question = null;
conversationId = Guid.NewGuid();
messages.Clear();
}
private async Task CopyToClipboardAsync(string? text)
{
if (text is null)
{
return;
}
await JSRuntime.InvokeVoidAsync("navigator.clipboard.writeText", text);
showCopyConfirmation = true;
toolTipText = "Copied!";
StateHasChanged();
await Task.Delay(3000); // Shows the checkmark for 3 seconds
toolTipText = "Copy to Clipboard";
showCopyConfirmation = false;
StateHasChanged();
}
private static string FormatTokenUsage(TokenUsageResponse? tokenUsageResponse)
{
if (tokenUsageResponse is null)
{
return string.Empty;
}
var reformulation = tokenUsageResponse.Reformulation is not null
? $"<p><strong>Reformulation:</strong><br />{FormatTokenUsageDetails(tokenUsageResponse.Reformulation)}</p>"
: string.Empty;
var embeddingTokenCount = tokenUsageResponse.EmbeddingTokenCount.HasValue
? $"<p><strong>Embedding Token Count:</strong> {tokenUsageResponse.EmbeddingTokenCount}</p>"
: string.Empty;
var question = tokenUsageResponse.Question is not null
? $"<p><strong>Question:</strong><br />{FormatTokenUsageDetails(tokenUsageResponse.Question)}</p>"
: string.Empty;
return $"{reformulation}{embeddingTokenCount}{question}";
static string FormatTokenUsageDetails(TokenUsage? tokenUsage)
{
if (tokenUsage is null)
{
return string.Empty;
}
return $"Prompt tokens: {tokenUsage.PromptTokens}<br />" +
$"Completion tokens: {tokenUsage.CompletionTokens}<br />" +
$"Total tokens: {tokenUsage.TotalTokens}";
}
}
private async Task EnsureMessageIsVisibleAsync()
{
await JSRuntime.InvokeVoidAsync("scrollTo", chat);
}
public enum MessageStatus
{
New,
Streaming,
Completed
}
public class Message
{
public string? Text { get; set; }
public required string Role { get; set; }
public MessageStatus Status { get; set; } = MessageStatus.New;
public string? TokenUsage { get; set; }
// List of citations extracted from the answer.
public IEnumerable<Citation>? Citations { get; set; }
}
public class Citation
{
public Guid DocumentId { get; set; }
public Guid ChunkId { get; set; }
public string FileName { get; set; } = null!;
public string Quote { get; set; } = null!;
public int? PageNumber { get; set; }
public int IndexOnPage { get; set; }
}
}
@@ -1,124 +0,0 @@
.tooltip-inner {
text-align: left;
}
.avatar {
width: 50px;
height: 50px;
border-radius: 50%;
border: 2px solid #ddd;
padding: 2px;
flex: none;
}
input:focus {
outline: 0px !important;
box-shadow: none !important;
}
input[type="checkbox"],
input[type="checkbox"] + label {
cursor: pointer;
}
.card-body {
overflow: auto;
height: 560px;
}
@media (min-width: 768px) {
.card-body {
height: 650px;
}
}
@media (min-width: 2560px) {
.card-body {
height: 1020px;
}
}
.card-text {
border: 2px solid #ddd;
border-radius: 8px;
}
.progress-chat {
width: 200px;
height: 4px;
}
.progress-bar-chat {
height: 4px;
background-color: rgba(5, 114, 206, 0.2);
width: 100%;
overflow: hidden;
}
.progress-bar-indeterminate {
width: 100%;
height: 100%;
background-color: rgb(5, 114, 206);
animation: indeterminate-animation 1s infinite linear;
transform-origin: 0% 50%;
}
@keyframes indeterminate-animation {
0% {
transform: translateX(0) scaleX(0);
}
40% {
transform: translateX(0) scaleX(0.4);
}
100% {
transform: translateX(100%) scaleX(0.5);
}
}
.message-content {
position: relative;
}
.streaming-content {
position: relative;
min-height: 1.5em;
}
.streaming-text {
/* Intentionally empty: the padding for the spinner is applied by .streaming-text-with-spinner while streaming */
}
.streaming-text-with-spinner {
padding-bottom: 28px; /* Space for spinner (16px height + 8px margin + 4px extra) */
}
.streaming-spinner-bottom-left {
position: absolute;
bottom: 2px;
left: 0px;
z-index: 10;
}
.btn-clipboard {
position: relative;
z-index: 2;
line-height: 1;
color: var(--bs-body-color);
background-color: var(--bd-pre-bg);
border: 0;
border-radius: .25rem;
margin-right: -.4em;
}
.btn-clipboard:hover {
color: var(--bs-link-hover-color);
}
.btn-clipboard:focus {
z-index: 3;
}
@@ -1,278 +0,0 @@
@page "/documents"
@using MimeMapping
@inject IServiceProvider ServiceProvider
@inject IJSRuntime JSRuntime
<ConfirmDialog @ref="dialog" />
<PageTitle>Documents</PageTitle>
<h4 class="mb-4">
<Icon Name="IconName.Upload" class="me-2" />
Upload new document
</h4>
<EditForm Model="Model" Enhance OnValidSubmit="UploadFile">
<DataAnnotationsValidator />
<div class="row">
<div class="col-md-5 col-sm-4 col-5">
<div class="input-group">
<span class="input-group-text">
<Tooltip Title="PDF, DOCX, TXT and MD files are supported" Color="TooltipColor.Primary" Placement="TooltipPlacement.Bottom">
<Icon Class="d-flex text-body-secondary" Name="IconName.InfoCircle"></Icon>
</Tooltip>
</span>
<InputFile class="form-control" OnChange="@((e) => Model.File = e.File)" accept=".pdf,.docx,.txt,.md" id="fileInput" />
</div>
</div>
<div class="col-md-5 col-sm-5 col-5">
<div class="input-group">
<span class="input-group-text">
<Tooltip Title="The unique identifier (GUID) of the document. If not provided, a new one will be generated. If you specify an existing Document ID, the corresponding document will be overwritten." Color="TooltipColor.Primary" Placement="TooltipPlacement.Bottom">
<Icon Class="d-flex text-body-secondary me-2" Name="IconName.InfoCircle"></Icon>
</Tooltip>
Document ID
</span>
<TextInput Placeholder="Enter a valid GUID or leave empty for auto-generation" @bind-Value="@Model.DocumentId" />
</div>
<ValidationMessage For="@(() => Model.DocumentId)" />
</div>
<div class="col-md-2 col-sm-3 col-2">
<div class="d-grid gap-2">
<Button @ref="uploadButton" Type="ButtonType.Submit" Color="ButtonColor.Primary" To="#" Disabled="@(Model.File is null)" Class="w-100 py-2 fw-semibold shadow-sm"><Icon Name="IconName.Upload" /><span class="d-none d-lg-inline ps-3">Upload</span></Button>
</div>
</div>
</div>
</EditForm>
@if (isLoading && documents.Count == 0)
{
<div class="text-center">
<Spinner Type="SpinnerType.Dots" Class="me-3 mt-4" Color="SpinnerColor.Primary" />
</div>
}
else
{
<h4 class="mt-4 mb-4">
<Icon Name="IconName.Files" class="me-2" />
Available documents
</h4>
<div class="table-responsive">
<table class="table table-hover align-middle mb-0 border rounded overflow-hidden">
<thead class="table-light sticky-top">
<tr>
<th style="width:40px;"></th>
<th class="text-secondary">ID</th>
<th class="text-secondary">Name</th>
<th class="text-secondary">Content type</th>
<th class="text-secondary text-center">Chunks</th>
<th class="text-secondary">Created</th>
</tr>
</thead>
<tbody>
@foreach (var document in documents)
{
<tr class="@((document.IsSelected ? "table-primary" : null))">
<td>
<div class="d-flex justify-content-center align-items-center">
<CheckboxInput @bind-Value="document.IsSelected" />
</div>
</td>
<td class="text-break small">@document.Id</td>
<td class="fw-medium">@document.Name</td>
<td>
<span class="badge content-type-badge px-2 py-1 rounded-pill small">
@document.ContentType
</span>
</td>
<td class="text-center">@document.ChunkCount</td>
<td class="small text-secondary">@document.LocalCreationDateString</td>
</tr>
}
</tbody>
</table>
</div>
<div class="my-4"></div>
<div class="row">
<div class="col-md-2 col-sm-3 col-2">
<div class="d-grid gap-2">
<Button @ref="deleteButton" Color="ButtonColor.Danger" Disabled="@(!documents.Any(d => d.IsSelected))" @onclick="DeleteSelectedDocuments" Class="w-100 py-2 fw-semibold shadow-sm">
<Icon Name="IconName.Trash" /><span class="d-none d-lg-inline ps-3">Delete</span>
</Button>
</div>
</div>
</div>
}
@code {
private ConfirmDialog dialog = default!;
private Button uploadButton = default!;
private Button deleteButton = default!;
private bool isLoading = true;
private IList<SelectableDocument> documents = [];
private UploadDocument Model { get; set; } = new();
[Inject]
protected ToastService ToastService { get; set; } = default!;
protected override async Task OnAfterRenderAsync(bool firstRender)
{
if (!firstRender)
{
return;
}
await using var scope = ServiceProvider.CreateAsyncScope();
await LoadDocumentsAsync(scope.ServiceProvider);
StateHasChanged();
}
private async Task LoadDocumentsAsync(IServiceProvider services)
{
isLoading = true;
try
{
var documentService = services.GetRequiredService<DocumentService>();
var dbDocuments = await documentService.GetAsync();
documents.Clear();
foreach (var dbDocument in dbDocuments)
{
documents.Add(new SelectableDocument(dbDocument.Id, dbDocument.Name, dbDocument.CreationDate, dbDocument.ChunkCount)
{
LocalCreationDateString = await GetLocalDateTimeStringAsync(dbDocument.CreationDate)
});
}
}
finally
{
isLoading = false;
}
}
private async Task UploadFile()
{
if (Model.File is null)
{
return;
}
uploadButton.ShowLoading();
var fileName = Model.File.Name;
try
{
await using var inputStream = Model.File.OpenReadStream(20 * 1024 * 1024); // 20 MB
await using var stream = await inputStream.GetMemoryStreamAsync();
await using var scope = ServiceProvider.CreateAsyncScope();
var vectorSearchService = scope.ServiceProvider.GetRequiredService<VectorSearchService>();
var documentId = string.IsNullOrWhiteSpace(Model.DocumentId) ? null : (Guid?)Guid.Parse(Model.DocumentId);
await vectorSearchService.ImportAsync(stream, fileName, MimeUtility.GetMimeMapping(fileName), documentId);
ToastService.Notify(await CreateToastMessageAsync(ToastType.Success, "Upload document", $"The document {fileName} has been successfully uploaded and indexed."));
Model = new UploadDocument();
await JSRuntime.InvokeVoidAsync("resetFileInput", "fileInput");
await LoadDocumentsAsync(scope.ServiceProvider);
}
catch (Exception ex)
{
ToastService.Notify(await CreateToastMessageAsync(ToastType.Danger, "Upload error", $"There was an error while uploading the document {fileName}: {ex.Message}"));
}
finally
{
uploadButton.HideLoading();
}
}
private async Task DeleteSelectedDocuments()
{
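// Snapshot the selection before showing the dialog, so that later changes cannot alter what is deleted.
var selectedDocumentIds = documents.Where(d => d.IsSelected).Select(d => d.Id).ToList();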
var options = new ConfirmDialogOptions
{
YesButtonText = "Yes",
YesButtonColor = ButtonColor.Danger,
NoButtonText = "No",
NoButtonColor = ButtonColor.Secondary
};
var confirmation = await dialog.ShowAsync(
title: "Delete the selected documents?",
message1: "This will delete the documents and all the corresponding embeddings. The operation cannot be undone.",
message2: "Do you want to proceed?",
confirmDialogOptions: options);
if (!confirmation)
{
return;
}
try
{
deleteButton.ShowLoading();
await using var scope = ServiceProvider.CreateAsyncScope();
var documentService = scope.ServiceProvider.GetRequiredService<DocumentService>();
await documentService.DeleteAsync(selectedDocumentIds);
await LoadDocumentsAsync(scope.ServiceProvider);
ToastService.Notify(await CreateToastMessageAsync(ToastType.Info, "Delete documents", "The selected documents have been successfully deleted."));
}
catch (Exception ex)
{
ToastService.Notify(await CreateToastMessageAsync(ToastType.Danger, "Delete error", $"There was an error while deleting the documents: {ex.Message}"));
}
finally
{
deleteButton.HideLoading();
}
}
private async Task<ToastMessage> CreateToastMessageAsync(ToastType toastType, string title, string message)
{
var toastMessage = new ToastMessage
{
Type = toastType,
Title = title,
HelpText = await GetLocalDateTimeStringAsync(DateTimeOffset.UtcNow),
Message = message
};
return toastMessage;
}
private async Task<string> GetLocalDateTimeStringAsync(DateTimeOffset dateTime)
{
return await JSRuntime.InvokeAsync<string>("getLocalTime", dateTime);
}
private record class SelectableDocument(Guid Id, string Name, DateTimeOffset CreationDate, int ChunkCount) : Document(Id, Name, CreationDate, ChunkCount)
{
public bool IsSelected { get; set; }
public string ContentType => MimeUtility.GetMimeMapping(Name);
public string LocalCreationDateString { get; set; } = string.Empty;
}
public class UploadDocument
{
public IBrowserFile? File { get; set; }
[RegularExpression(@"^(\{|\()?[0-9a-fA-F]{8}(-?)[0-9a-fA-F]{4}(-?)[0-9a-fA-F]{4}(-?)[0-9a-fA-F]{4}(-?)[0-9a-fA-F]{12}(\}|\))?$", ErrorMessage = "Invalid GUID format.")]
public string? DocumentId { get; set; }
}
}
@@ -1,39 +0,0 @@
@page "/Error"
@using System.Diagnostics
@rendermode @(new InteractiveServerRenderMode(prerender: false))
<div class="d-flex align-items-center justify-content-center">
<div class="text-center">
@if (Code == 404)
{
<PageTitle>Page Not Found</PageTitle>
<h1 class="display-1 fw-bold">404</h1>
<p class="fs-3"><span class="text-danger">Oops!</span> Page Not Found.</p>
<p class="lead">
The page you're looking for does not exist.
</p>
}
else if (Code > 0)
{
<PageTitle>Unexpected Error</PageTitle>
<h1 class="display-1 fw-bold">500</h1>
<p class="fs-3"><span class="text-danger">Oops!</span> Unexpected error.</p>
<p class="lead">
An unexpected error occurred while loading the page. Please wait a minute and try again.
</p>
}
<a title="Back to Home" href="/" class="btn btn-primary"><i class="bi bi-house-door-fill"></i> Back to Home</a>
</div>
</div>
@code {
[CascadingParameter]
private HttpContext? HttpContext { get; set; }
[Parameter]
[SupplyParameterFromQuery(Name = "code")]
public int Code { get; set; }
}
@@ -1,37 +0,0 @@
@page "/"
@rendermode @(new InteractiveServerRenderMode(prerender: false))
<PageTitle>SQL Database Vector Search</PageTitle>
<h1>SQL Database Vector Search</h1>
<p class="mt-3 p-3 rounded bg-light text-dark shadow-sm">
A Blazor Web App and Minimal API for Retrieval Augmented Generation (RAG) and vector search using the native VECTOR type in <img src="/images/sqldatabase.svg" style="height:1.5em;vertical-align:middle;" /> Azure SQL Database with <img src="/images/openai.svg" style="height:1.5em;vertical-align:middle;" /> Azure OpenAI.
</p>
<p>
This application allows you to:
</p>
<ul>
<li>Load documents (PDF, DOCX, TXT, MD)</li>
<li>Generate embeddings and save them as vectors in Azure SQL Database</li>
<li>Perform semantic search and RAG using Azure OpenAI</li>
<li>Interact via a Blazor Web App or programmatically via Minimal API</li>
</ul>
<p>
Embeddings and chat completion are powered by <a href="https://github.com/microsoft/semantic-kernel" target="_blank">Semantic Kernel</a>. Vectors are managed with <a href="https://github.com/efcore/EfCore.SqlServer.VectorSearch" target="_blank">EFCore.SqlServer.VectorSearch</a>.
</p>
<h3>Supported Features</h3>
<ul>
<li><strong>Conversation History with Question Reformulation</strong>: Each follow-up question is reformulated taking the previous messages of the conversation into account.</li>
<li><strong>Information about Token Usage</strong>: Access detailed information about token usage for transparency and management.</li>
<li><strong>Response Streaming</strong>: Receive real-time streaming of responses for a seamless and efficient user experience.</li>
<li><strong>Citations</strong>: Get citations for the sources used to justify each answer, allowing you to verify and understand the origin of the content.</li>
</ul>
<p class="mt-3 p-3 rounded bg-light text-dark shadow-sm">
Try <a href="/documents">uploading a document</a> or <a href="/ask">ask a question</a> to get started!
</p>
<p class="mt-4">
<em>For API usage and more details, see the <a href="https://github.com/marcominerva/SqlDatabaseVectorSearch#how-to-use" target="_blank">README</a>.</em>
</p>
@@ -1,6 +0,0 @@
<Router AppAssembly="typeof(Program).Assembly">
<Found Context="routeData">
<RouteView RouteData="routeData" DefaultLayout="typeof(Layout.MainLayout)" />
<FocusOnNavigate RouteData="routeData" Selector="h1" />
</Found>
</Router>
@@ -1,16 +0,0 @@
@using System.ComponentModel.DataAnnotations
@using System.Net.Http
@using System.Net.Http.Json
@using Microsoft.AspNetCore.Components.Forms
@using Microsoft.AspNetCore.Components.Routing
@using Microsoft.AspNetCore.Components.Web
@using static Microsoft.AspNetCore.Components.Web.RenderMode
@using Microsoft.AspNetCore.Components.Web.Virtualization
@using Microsoft.JSInterop
@using SqlDatabaseVectorSearch
@using SqlDatabaseVectorSearch.Components
@using SqlDatabaseVectorSearch.Components.Layout
@using SqlDatabaseVectorSearch.Extensions
@using SqlDatabaseVectorSearch.Models
@using SqlDatabaseVectorSearch.Services
@using BlazorBootstrap
@@ -1,32 +0,0 @@
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using SqlDatabaseVectorSearch.TextChunkers;
namespace SqlDatabaseVectorSearch.ContentDecoders;
public class DocxContentDecoder(IServiceProvider serviceProvider) : IContentDecoder
{
public Task<IEnumerable<Chunk>> DecodeAsync(Stream stream, string contentType, CancellationToken cancellationToken = default)
{
var textChunker = serviceProvider.GetRequiredKeyedService<ITextChunker>(contentType);
// Open a Word document for read-only access.
using var document = WordprocessingDocument.Open(stream, false);
var body = document.MainDocumentPart?.Document?.Body;
var content = new StringBuilder();
foreach (var p in body?.Descendants<Paragraph>() ?? [])
{
content.AppendLine(p.InnerText);
}
var paragraphs = textChunker.Split(content.ToString().Trim());
// Pages do not exist in the OpenXML format until they are rendered by a word processor.
// See https://stackoverflow.com/questions/43700252/how-to-get-page-numbers-based-on-openxmlelement for more details.
// Therefore, we will not assign a page number.
return Task.FromResult(paragraphs.Select((text, index) => new Chunk(null, index, text)).ToList().AsEnumerable());
}
}
@@ -1,8 +0,0 @@
namespace SqlDatabaseVectorSearch.ContentDecoders;
public interface IContentDecoder
{
Task<IEnumerable<Chunk>> DecodeAsync(Stream stream, string contentType, CancellationToken cancellationToken = default);
}
public record class Chunk(int? PageNumber, int IndexOnPage, string Content);
@@ -1,33 +0,0 @@
using SqlDatabaseVectorSearch.TextChunkers;
using UglyToad.PdfPig;
using UglyToad.PdfPig.Content;
using UglyToad.PdfPig.DocumentLayoutAnalysis.PageSegmenter;
using UglyToad.PdfPig.DocumentLayoutAnalysis.WordExtractor;
namespace SqlDatabaseVectorSearch.ContentDecoders;
public class PdfContentDecoder(IServiceProvider serviceProvider) : IContentDecoder
{
public Task<IEnumerable<Chunk>> DecodeAsync(Stream stream, string contentType, CancellationToken cancellationToken = default)
{
var textChunker = serviceProvider.GetRequiredKeyedService<ITextChunker>(contentType);
// Read the content of the PDF document.
using var pdfDocument = PdfDocument.Open(stream);
var paragraphs = pdfDocument.GetPages().SelectMany(page => GetPageParagraphs(page, textChunker)).ToList();
return Task.FromResult(paragraphs.AsEnumerable());
}
private static IEnumerable<Chunk> GetPageParagraphs(Page pdfPage, ITextChunker textChunker)
{
var letters = pdfPage.Letters;
var words = NearestNeighbourWordExtractor.Instance.GetWords(letters);
var textBlocks = DocstrumBoundingBoxes.Instance.GetBlocks(words);
var pageText = string.Join($"{Environment.NewLine}{Environment.NewLine}", textBlocks.Select(t => t.Text.ReplaceLineEndings(" ")));
var paragraphs = textChunker.Split(pageText.Trim());
return paragraphs.Where(p => !string.IsNullOrWhiteSpace(p)).Select((text, index) => new Chunk(pdfPage.Number, index, text));
}
}
@@ -1,17 +0,0 @@
using SqlDatabaseVectorSearch.TextChunkers;
namespace SqlDatabaseVectorSearch.ContentDecoders;
public class TextContentDecoder(IServiceProvider serviceProvider) : IContentDecoder
{
public async Task<IEnumerable<Chunk>> DecodeAsync(Stream stream, string contentType, CancellationToken cancellationToken = default)
{
var textChunker = serviceProvider.GetRequiredKeyedService<ITextChunker>(contentType);
using var readStream = new StreamReader(stream);
var content = await readStream.ReadToEndAsync(cancellationToken);
var paragraphs = textChunker.Split(content.Trim());
return paragraphs.Select((text, index) => new Chunk(null, index, text)).ToList();
}
}
@@ -1,51 +0,0 @@
using EntityFramework.Exceptions.SqlServer;
using Microsoft.EntityFrameworkCore;
using SqlDatabaseVectorSearch.Data.Entities;
namespace SqlDatabaseVectorSearch.Data;
public class ApplicationDbContext(DbContextOptions<ApplicationDbContext> options) : DbContext(options)
{
public virtual DbSet<Document> Documents { get; set; }
public virtual DbSet<DocumentChunk> DocumentChunks { get; set; }
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
base.OnConfiguring(optionsBuilder);
optionsBuilder.UseExceptionProcessor();
//optionsBuilder.EnableSensitiveDataLogging();
}
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
modelBuilder.Entity<Document>(entity =>
{
entity.ToTable("Documents");
entity.HasKey(e => e.Id);
entity.Property(e => e.Id).ValueGeneratedOnAdd();
entity.Property(e => e.Name)
.IsRequired()
.HasMaxLength(255);
});
modelBuilder.Entity<DocumentChunk>(entity =>
{
entity.ToTable("DocumentChunks");
entity.HasKey(e => e.Id);
entity.Property(e => e.Id).ValueGeneratedOnAdd();
entity.Property(e => e.Content).IsRequired();
entity.Property(e => e.Embedding)
.HasColumnType("vector(1536)")
.IsRequired();
entity.HasOne(d => d.Document).WithMany(p => p.Chunks)
.HasForeignKey(d => d.DocumentId)
.OnDelete(DeleteBehavior.Cascade)
.HasConstraintName("FK_DocumentChunks_Documents");
});
}
}
@@ -1,12 +0,0 @@
namespace SqlDatabaseVectorSearch.Data.Entities;
public class Document
{
public Guid Id { get; set; }
public required string Name { get; set; }
public DateTimeOffset CreationDate { get; set; }
public virtual ICollection<DocumentChunk> Chunks { get; set; } = [];
}
@@ -1,22 +0,0 @@
using Microsoft.Data.SqlTypes;
namespace SqlDatabaseVectorSearch.Data.Entities;
public class DocumentChunk
{
public Guid Id { get; set; }
public Guid DocumentId { get; set; }
public int Index { get; set; }
public int? PageNumber { get; set; }
public int IndexOnPage { get; set; }
public required string Content { get; set; }
public required SqlVector<float> Embedding { get; set; }
public virtual Document Document { get; set; } = null!;
}
@@ -1,99 +0,0 @@
// <auto-generated />
using System;
using Microsoft.Data.SqlTypes;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Infrastructure;
using Microsoft.EntityFrameworkCore.Metadata;
using Microsoft.EntityFrameworkCore.Migrations;
using Microsoft.EntityFrameworkCore.Storage.ValueConversion;
using SqlDatabaseVectorSearch.Data;
#nullable disable
namespace SqlDatabaseVectorSearch.Migrations
{
[DbContext(typeof(ApplicationDbContext))]
[Migration("00000000000000_Initial")]
partial class Initial
{
/// <inheritdoc />
protected override void BuildTargetModel(ModelBuilder modelBuilder)
{
#pragma warning disable 612, 618
modelBuilder
.HasAnnotation("ProductVersion", "10.0.0-rc.1.25451.107")
.HasAnnotation("Relational:MaxIdentifierLength", 128);
SqlServerModelBuilderExtensions.UseIdentityColumns(modelBuilder);
modelBuilder.Entity("SqlDatabaseVectorSearch.Data.Entities.Document", b =>
{
b.Property<Guid>("Id")
.ValueGeneratedOnAdd()
.HasColumnType("uniqueidentifier");
b.Property<DateTimeOffset>("CreationDate")
.HasColumnType("datetimeoffset");
b.Property<string>("Name")
.IsRequired()
.HasMaxLength(255)
.HasColumnType("nvarchar(255)");
b.HasKey("Id");
b.ToTable("Documents", (string)null);
});
modelBuilder.Entity("SqlDatabaseVectorSearch.Data.Entities.DocumentChunk", b =>
{
b.Property<Guid>("Id")
.ValueGeneratedOnAdd()
.HasColumnType("uniqueidentifier");
b.Property<string>("Content")
.IsRequired()
.HasColumnType("nvarchar(max)");
b.Property<Guid>("DocumentId")
.HasColumnType("uniqueidentifier");
b.Property<SqlVector<float>>("Embedding")
.HasColumnType("vector(1536)");
b.Property<int>("Index")
.HasColumnType("int");
b.Property<int>("IndexOnPage")
.HasColumnType("int");
b.Property<int?>("PageNumber")
.HasColumnType("int");
b.HasKey("Id");
b.HasIndex("DocumentId");
b.ToTable("DocumentChunks", (string)null);
});
modelBuilder.Entity("SqlDatabaseVectorSearch.Data.Entities.DocumentChunk", b =>
{
b.HasOne("SqlDatabaseVectorSearch.Data.Entities.Document", "Document")
.WithMany("Chunks")
.HasForeignKey("DocumentId")
.OnDelete(DeleteBehavior.Cascade)
.IsRequired()
.HasConstraintName("FK_DocumentChunks_Documents");
b.Navigation("Document");
});
modelBuilder.Entity("SqlDatabaseVectorSearch.Data.Entities.Document", b =>
{
b.Navigation("Chunks");
});
#pragma warning restore 612, 618
}
}
}
@@ -1,67 +0,0 @@
using System;
using Microsoft.Data.SqlTypes;
using Microsoft.EntityFrameworkCore.Migrations;
#nullable disable
namespace SqlDatabaseVectorSearch.Migrations
{
/// <inheritdoc />
public partial class Initial : Migration
{
/// <inheritdoc />
protected override void Up(MigrationBuilder migrationBuilder)
{
migrationBuilder.CreateTable(
name: "Documents",
columns: table => new
{
Id = table.Column<Guid>(type: "uniqueidentifier", nullable: false),
Name = table.Column<string>(type: "nvarchar(255)", maxLength: 255, nullable: false),
CreationDate = table.Column<DateTimeOffset>(type: "datetimeoffset", nullable: false)
},
constraints: table =>
{
table.PrimaryKey("PK_Documents", x => x.Id);
});
migrationBuilder.CreateTable(
name: "DocumentChunks",
columns: table => new
{
Id = table.Column<Guid>(type: "uniqueidentifier", nullable: false),
DocumentId = table.Column<Guid>(type: "uniqueidentifier", nullable: false),
Index = table.Column<int>(type: "int", nullable: false),
PageNumber = table.Column<int>(type: "int", nullable: true),
IndexOnPage = table.Column<int>(type: "int", nullable: false),
Content = table.Column<string>(type: "nvarchar(max)", nullable: false),
Embedding = table.Column<SqlVector<float>>(type: "vector(1536)", nullable: false)
},
constraints: table =>
{
table.PrimaryKey("PK_DocumentChunks", x => x.Id);
table.ForeignKey(
name: "FK_DocumentChunks_Documents",
column: x => x.DocumentId,
principalTable: "Documents",
principalColumn: "Id",
onDelete: ReferentialAction.Cascade);
});
migrationBuilder.CreateIndex(
name: "IX_DocumentChunks_DocumentId",
table: "DocumentChunks",
column: "DocumentId");
}
/// <inheritdoc />
protected override void Down(MigrationBuilder migrationBuilder)
{
migrationBuilder.DropTable(
name: "DocumentChunks");
migrationBuilder.DropTable(
name: "Documents");
}
}
}
@@ -1,96 +0,0 @@
// <auto-generated />
using System;
using Microsoft.Data.SqlTypes;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Infrastructure;
using Microsoft.EntityFrameworkCore.Metadata;
using Microsoft.EntityFrameworkCore.Storage.ValueConversion;
using SqlDatabaseVectorSearch.Data;
#nullable disable
namespace SqlDatabaseVectorSearch.Migrations
{
[DbContext(typeof(ApplicationDbContext))]
partial class ApplicationDbContextModelSnapshot : ModelSnapshot
{
protected override void BuildModel(ModelBuilder modelBuilder)
{
#pragma warning disable 612, 618
modelBuilder
.HasAnnotation("ProductVersion", "10.0.0")
.HasAnnotation("Relational:MaxIdentifierLength", 128);
SqlServerModelBuilderExtensions.UseIdentityColumns(modelBuilder);
modelBuilder.Entity("SqlDatabaseVectorSearch.Data.Entities.Document", b =>
{
b.Property<Guid>("Id")
.ValueGeneratedOnAdd()
.HasColumnType("uniqueidentifier");
b.Property<DateTimeOffset>("CreationDate")
.HasColumnType("datetimeoffset");
b.Property<string>("Name")
.IsRequired()
.HasMaxLength(255)
.HasColumnType("nvarchar(255)");
b.HasKey("Id");
b.ToTable("Documents", (string)null);
});
modelBuilder.Entity("SqlDatabaseVectorSearch.Data.Entities.DocumentChunk", b =>
{
b.Property<Guid>("Id")
.ValueGeneratedOnAdd()
.HasColumnType("uniqueidentifier");
b.Property<string>("Content")
.IsRequired()
.HasColumnType("nvarchar(max)");
b.Property<Guid>("DocumentId")
.HasColumnType("uniqueidentifier");
b.Property<SqlVector<float>>("Embedding")
.HasColumnType("vector(1536)");
b.Property<int>("Index")
.HasColumnType("int");
b.Property<int>("IndexOnPage")
.HasColumnType("int");
b.Property<int?>("PageNumber")
.HasColumnType("int");
b.HasKey("Id");
b.HasIndex("DocumentId");
b.ToTable("DocumentChunks", (string)null);
});
modelBuilder.Entity("SqlDatabaseVectorSearch.Data.Entities.DocumentChunk", b =>
{
b.HasOne("SqlDatabaseVectorSearch.Data.Entities.Document", "Document")
.WithMany("Chunks")
.HasForeignKey("DocumentId")
.OnDelete(DeleteBehavior.Cascade)
.IsRequired()
.HasConstraintName("FK_DocumentChunks_Documents");
b.Navigation("Document");
});
modelBuilder.Entity("SqlDatabaseVectorSearch.Data.Entities.Document", b =>
{
b.Navigation("Chunks");
});
#pragma warning restore 612, 618
}
}
}
@@ -1,44 +0,0 @@
using System.ComponentModel;
using MinimalHelpers.FluentValidation;
using SqlDatabaseVectorSearch.Models;
using SqlDatabaseVectorSearch.Services;
namespace SqlDatabaseVectorSearch.Endpoints;
public class AskEndpoints : IEndpointRouteHandlerBuilder
{
public static void MapEndpoints(IEndpointRouteBuilder endpoints)
{
endpoints.MapPost("/api/ask", async (Question question, VectorSearchService vectorSearchService, CancellationToken cancellationToken,
[Description("If true, the question will be reformulated taking into account the context of the chat identified by the given ConversationId.")] bool reformulate = true) =>
{
var response = await vectorSearchService.AskQuestionAsync(question, reformulate, cancellationToken);
return TypedResults.Ok(response);
})
.WithValidation<Question>()
.WithSummary("Asks a question")
.WithDescription("The question will be reformulated taking into account the context of the chat identified by the given ConversationId.")
.WithTags("Ask");
endpoints.MapPost("/api/ask-streaming", (Question question, VectorSearchService vectorSearchService, CancellationToken cancellationToken,
[Description("If true, the question will be reformulated taking into account the context of the chat identified by the given ConversationId.")] bool reformulate = true) =>
{
async IAsyncEnumerable<Response> Stream()
{
// Requests a streaming response.
var responseStream = vectorSearchService.AskStreamingAsync(question, reformulate, cancellationToken);
await foreach (var delta in responseStream)
{
yield return delta;
}
}
return Stream();
})
.WithValidation<Question>()
.WithSummary("Asks a question and gets the response as streaming")
.WithDescription("The question will be reformulated taking into account the context of the chat identified by the given ConversationId.")
.WithTags("Ask");
}
}
@@ -1,67 +0,0 @@
using System.ComponentModel;
using Microsoft.AspNetCore.Http.HttpResults;
using MimeMapping;
using SqlDatabaseVectorSearch.Models;
using SqlDatabaseVectorSearch.Services;
namespace SqlDatabaseVectorSearch.Endpoints;
public class DocumentEndpoints : IEndpointRouteHandlerBuilder
{
public static void MapEndpoints(IEndpointRouteBuilder endpoints)
{
var documentsApiGroup = endpoints.MapGroup("/api/documents").WithTags("Documents");
documentsApiGroup.MapGet(string.Empty, async (DocumentService documentService, CancellationToken cancellationToken) =>
{
var documents = await documentService.GetAsync(cancellationToken);
return TypedResults.Ok(documents);
})
.WithSummary("Gets the list of documents");
documentsApiGroup.MapPost(string.Empty, async (IFormFile file, VectorSearchService vectorSearchService, CancellationToken cancellationToken,
[Description("The unique identifier of the document. If not provided, a new one will be generated. If you specify an existing documentId, the corresponding document will be overwritten.")] Guid? documentId = null) =>
{
using var stream = file.OpenReadStream();
// Note: file.ContentType is not fully reliable (for example, for Markdown files), so the MIME type is derived from the file name instead.
var response = await vectorSearchService.ImportAsync(stream, file.FileName, MimeUtility.GetMimeMapping(file.FileName), documentId, cancellationToken);
return TypedResults.Ok(response);
})
.DisableAntiforgery()
.ProducesProblem(StatusCodes.Status400BadRequest)
.WithSummary("Uploads a document")
.WithDescription("Uploads a document to SQL Database and saves its embedding using the native VECTOR type. The document will be indexed and used to answer questions. Currently, PDF, DOCX, TXT and MD files are supported.");
documentsApiGroup.MapGet("{documentId:guid}/chunks", async (Guid documentId, DocumentService documentService, CancellationToken cancellationToken) =>
{
var documents = await documentService.GetChunksAsync(documentId, cancellationToken);
return TypedResults.Ok(documents);
})
.WithSummary("Gets the list of chunks of a given document")
.WithDescription("The list does not contain embedding. Use '/api/documents/{documentId}/chunks/{documentChunkId}' to get the embedding for a given chunk.");
documentsApiGroup.MapGet("{documentId:guid}/chunks/{documentChunkId:guid}", async Task<Results<Ok<DocumentChunk>, NotFound>> (Guid documentId, Guid documentChunkId, DocumentService documentService, CancellationToken cancellationToken) =>
{
var chunk = await documentService.GetChunkEmbeddingAsync(documentId, documentChunkId, cancellationToken);
if (chunk is null)
{
return TypedResults.NotFound();
}
return TypedResults.Ok(chunk);
})
.ProducesProblem(StatusCodes.Status404NotFound)
.WithSummary("Gets the details of a given chunk, including its embedding");
documentsApiGroup.MapDelete("{documentId:guid}", async (Guid documentId, DocumentService documentService, CancellationToken cancellationToken) =>
{
await documentService.DeleteAsync(documentId, cancellationToken);
return TypedResults.NoContent();
})
.WithSummary("Deletes a document")
.WithDescription("This endpoint deletes the document and all its chunks.");
}
}
@@ -1,40 +0,0 @@
using System.Text.RegularExpressions;
using Microsoft.Net.Http.Headers;
namespace SqlDatabaseVectorSearch.Extensions;
public static partial class RequestExtensions
{
[GeneratedRegex("(android|bb\\d+|meego).+mobile|avantgo|bada\\/|blackberry|blazer|compal|elaine|fennec|hiptop|iemobile|ip(hone|od)|iris|kindle|lge |maemo|midp|mmp|mobile.+firefox|netfront|opera m(ob|in)i|palm( os)?|phone|p(ixi|re)\\/|plucker|pocket|psp|series(4|6)0|symbian|treo|up\\.(browser|link)|vodafone|wap|windows ce|xda|xiino", RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Compiled)]
private static partial Regex MobileBrowserRegex { get; }
[GeneratedRegex("1207|6310|6590|3gso|4thp|50[1-6]i|770s|802s|a wa|abac|ac(er|oo|s\\-)|ai(ko|rn)|al(av|ca|co)|amoi|an(ex|ny|yw)|aptu|ar(ch|go)|as(te|us)|attw|au(di|\\-m|r |s )|avan|be(ck|ll|nq)|bi(lb|rd)|bl(ac|az)|br(e|v)w|bumb|bw\\-(n|u)|c55\\/|capi|ccwa|cdm\\-|cell|chtm|cldc|cmd\\-|co(mp|nd)|craw|da(it|ll|ng)|dbte|dc\\-s|devi|dica|dmob|do(c|p)o|ds(12|\\-d)|el(49|ai)|em(l2|ul)|er(ic|k0)|esl8|ez([4-7]0|os|wa|ze)|fetc|fly(\\-|_)|g1 u|g560|gene|gf\\-5|g\\-mo|go(\\.w|od)|gr(ad|un)|haie|hcit|hd\\-(m|p|t)|hei\\-|hi(pt|ta)|hp( i|ip)|hs\\-c|ht(c(\\-| |_|a|g|p|s|t)|tp)|hu(aw|tc)|i\\-(20|go|ma)|i230|iac( |\\-|\\/)|ibro|idea|ig01|ikom|im1k|inno|ipaq|iris|ja(t|v)a|jbro|jemu|jigs|kddi|keji|kgt( |\\/)|klon|kpt |kwc\\-|kyo(c|k)|le(no|xi)|lg( g|\\/(k|l|u)|50|54|\\-[a-w])|libw|lynx|m1\\-w|m3ga|m50\\/|ma(te|ui|xo)|mc(01|21|ca)|m\\-cr|me(rc|ri)|mi(o8|oa|ts)|mmef|mo(01|02|bi|de|do|t(\\-| |o|v)|zz)|mt(50|p1|v )|mwbp|mywa|n10[0-2]|n20[2-3]|n30(0|2)|n50(0|2|5)|n7(0(0|1)|10)|ne((c|m)\\-|on|tf|wf|wg|wt)|nok(6|i)|nzph|o2im|op(ti|wv)|oran|owg1|p800|pan(a|d|t)|pdxg|pg(13|\\-([1-8]|c))|phil|pire|pl(ay|uc)|pn\\-2|po(ck|rt|se)|prox|psio|pt\\-g|qa\\-a|qc(07|12|21|32|60|\\-[2-7]|i\\-)|qtek|r380|r600|raks|rim9|ro(ve|zo)|s55\\/|sa(ge|ma|mm|ms|ny|va)|sc(01|h\\-|oo|p\\-)|sdk\\/|se(c(\\-|0|1)|47|mc|nd|ri)|sgh\\-|shar|sie(\\-|m)|sk\\-0|sl(45|id)|sm(al|ar|b3|it|t5)|so(ft|ny)|sp(01|h\\-|v\\-|v )|sy(01|mb)|t2(18|50)|t6(00|10|18)|ta(gt|lk)|tcl\\-|tdg\\-|tel(i|m)|tim\\-|t\\-mo|to(pl|sh)|ts(70|m\\-|m3|m5)|tx\\-9|up(\\.b|g1|si)|utst|v400|v750|veri|vi(rg|te)|vk(40|5[0-3]|\\-v)|vm40|voda|vulc|vx(52|53|60|61|70|80|81|83|85|98)|w3c(\\-| )|webc|whit|wi(g |nc|nw)|wmlb|wonu|x700|yas\\-|your|zeto|zte\\-", RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Compiled)]
private static partial Regex MobileBrowserVersionRegex { get; }
[GeneratedRegex(@"^/(?<culture>[a-z]{2})(/|$)", RegexOptions.IgnoreCase | RegexOptions.Compiled)]
private static partial Regex RouteCultureRegex { get; }
public static bool IsMobileRequest(this HttpContext httpContext)
=> httpContext.Request.IsMobile();
public static bool IsMobile(this HttpRequest request)
{
var userAgent = request.Headers[HeaderNames.UserAgent].ToString();
var isMobileBrowser = false;
if (userAgent?.Length > 4 && (MobileBrowserRegex.IsMatch(userAgent) || MobileBrowserVersionRegex.IsMatch(userAgent.AsSpan(0, 4))))
{
isMobileBrowser = true;
}
return isMobileBrowser;
}
public static bool IsApiRequest(this HttpContext httpContext)
=> httpContext.Request.Path.StartsWithSegments("/api");
public static bool IsSwaggerRequest(this HttpContext httpContext)
=> httpContext.Request.Path.StartsWithSegments("/swagger");
public static bool IsWebRequest(this HttpContext httpContext)
=> !httpContext.IsApiRequest() && !httpContext.IsSwaggerRequest();
}
@@ -1,16 +0,0 @@
namespace SqlDatabaseVectorSearch.Extensions;
public static class StreamExtensions
{
public static async Task<MemoryStream> GetMemoryStreamAsync(this Stream stream)
{
// Use a BufferedStream to read the file in chunks
using var bufferedStream = new BufferedStream(stream);
var ms = new MemoryStream();
await bufferedStream.CopyToAsync(ms);
ms.Position = 0;
return ms;
}
}
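A note on the helper above: `BufferedStream` disposes the stream it wraps when it is itself disposed, so the source stream is closed once `GetMemoryStreamAsync` returns. The following is a minimal sketch of a variant that leaves the source open and accepts a cancellation token. The names `StreamCopySketch` and `CopyToMemoryStreamAsync` are hypothetical and not part of the repository.

```csharp
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class StreamCopySketch
{
    // Hypothetical variant: copies the source into a MemoryStream without
    // wrapping (and therefore without disposing) the source stream.
    public static async Task<MemoryStream> CopyToMemoryStreamAsync(this Stream stream, CancellationToken cancellationToken = default)
    {
        var ms = new MemoryStream();
        await stream.CopyToAsync(ms, cancellationToken);

        // Rewind so the caller can read the copy from the beginning.
        ms.Position = 0;
        return ms;
    }
}
```

Whether the caller or the helper should own the source stream is a design choice; this sketch assumes the caller keeps ownership.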
@@ -1,3 +0,0 @@
namespace SqlDatabaseVectorSearch.Models;
public record class ChatResponse(string? Text, TokenUsage? TokenUsage = null);
@@ -1,16 +0,0 @@
namespace SqlDatabaseVectorSearch.Models;
public class Citation
{
public Guid DocumentId { get; set; }
public Guid ChunkId { get; set; }
public string FileName { get; set; } = null!;
public string Quote { get; set; } = null!;
public int? PageNumber { get; set; }
public int IndexOnPage { get; set; }
}
@@ -1,3 +1,14 @@
+using System.Text.Json;
 namespace SqlDatabaseVectorSearch.Models;
-public record class DocumentChunk(Guid Id, int Index, string Content, int? PageNumber, int IndexOnPage, float[]? Embedding = null);
+public record class DocumentChunk(Guid Id, int Index, string Content, float[]? Embedding)
+{
+    public DocumentChunk(Guid Id, int Index, string Content) : this(Id, Index, Content, (float[]?)null)
+    {
+    }
+    public DocumentChunk(Guid Id, int Index, string Content, string Embedding) : this(Id, Index, Content, JsonSerializer.Deserialize<float[]?>(Embedding))
+    {
+    }
+}
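The string-based `DocumentChunk` constructor above relies on `JsonSerializer.Deserialize<float[]?>` to turn a textual embedding into a float array. Here is a minimal sketch of that step in isolation, assuming the embedding arrives as a JSON array string. The class name `EmbeddingParsingSketch` is hypothetical.

```csharp
using System.Text.Json;

public static class EmbeddingParsingSketch
{
    // Parses a JSON array such as "[0.5, -1.25, 2]" into a float array.
    // Returns null when the input is the JSON literal "null".
    public static float[]? Parse(string json)
        => JsonSerializer.Deserialize<float[]?>(json);
}
```

For example, `EmbeddingParsingSketch.Parse("[0.5, -1.25, 2]")` yields an array of three floats, while malformed input raises a `JsonException`.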
@@ -1,3 +0,0 @@
namespace SqlDatabaseVectorSearch.Models;
public record class ImportDocumentResponse(Guid DocumentId, int EmbeddingTokenCount);
+1 -8
@@ -1,10 +1,3 @@
 namespace SqlDatabaseVectorSearch.Models;
-// Question and Answer can be null when using response streaming.
-public record class Response(string? OriginalQuestion, string? ReformulatedQuestion, string? Answer, StreamState? StreamState = null, TokenUsageResponse? TokenUsage = null, IEnumerable<Citation>? Citations = null)
-{
-    public Response(string? token, StreamState streamState, TokenUsageResponse? tokenUsageResponse = null, IEnumerable<Citation>? citations = null)
-        : this(null, null, token, streamState, tokenUsageResponse, citations)
-    {
-    }
-}
+public record class Response(string Question, string Answer);
@@ -1,8 +0,0 @@
namespace SqlDatabaseVectorSearch.Models;
public enum StreamState
{
Start,
Append,
End
}
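The `StreamState` values suggest that a streamed answer is delivered as a sequence of partial responses. Below is a minimal sketch of how a consumer might reassemble them, assuming each item carries a text fragment in `Answer` and that `End` marks the final item. The `Delta` record and the `StreamAssemblerSketch` class are hypothetical simplifications, and the enum is redeclared only so the example is self-contained.

```csharp
using System.Collections.Generic;
using System.Text;

public enum StreamState { Start, Append, End }

// Hypothetical, simplified stand-in for the streamed Response record.
public record Delta(string? Answer, StreamState State);

public static class StreamAssemblerSketch
{
    // Concatenates the fragments of a streamed answer until an End item is seen.
    public static string Assemble(IEnumerable<Delta> deltas)
    {
        var builder = new StringBuilder();

        foreach (var delta in deltas)
        {
            if (delta.State == StreamState.Start)
            {
                // A new answer begins: drop any previously collected text.
                builder.Clear();
            }

            // Appending a null string is a no-op, so empty items are harmless.
            builder.Append(delta.Answer);

            if (delta.State == StreamState.End)
            {
                break;
            }
        }

        return builder.ToString();
    }
}
```

How the service actually distributes text, token usage and citations across the three states is not shown in this diff, so treat the state handling here as an assumption.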
@@ -1,6 +0,0 @@
namespace SqlDatabaseVectorSearch.Models;
public record class TokenUsage(int PromptTokens, int CompletionTokens)
{
public int TotalTokens => PromptTokens + CompletionTokens;
}
@@ -1,9 +0,0 @@
namespace SqlDatabaseVectorSearch.Models;
public record class TokenUsageResponse(TokenUsage? Reformulation, int? EmbeddingTokenCount, TokenUsage? Question)
{
public TokenUsageResponse(TokenUsage? question)
: this(null, null, question)
{
}
}
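`TokenUsageResponse` groups up to three separate counts: reformulation, embedding and question. Below is a minimal sketch of how a caller might combine them into one figure, treating missing parts as zero. The `GrandTotal` helper and the `TokenUsageSketch` class are hypothetical, and the records are redeclared in simplified form only to keep the example self-contained.

```csharp
public record TokenUsage(int PromptTokens, int CompletionTokens)
{
    public int TotalTokens => PromptTokens + CompletionTokens;
}

public record TokenUsageResponse(TokenUsage? Reformulation, int? EmbeddingTokenCount, TokenUsage? Question);

public static class TokenUsageSketch
{
    // Hypothetical helper: sums every token count that is present.
    public static int GrandTotal(TokenUsageResponse usage)
        => (usage.Reformulation?.TotalTokens ?? 0)
           + (usage.EmbeddingTokenCount ?? 0)
           + (usage.Question?.TotalTokens ?? 0);
}
```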
@@ -0,0 +1,3 @@
namespace SqlDatabaseVectorSearch.Models;
public record class UploadDocumentResponse(Guid DocumentId);
+75 -92
@@ -1,15 +1,10 @@
using System.Net.Mime; using System.ComponentModel;
using System.Text.Json.Serialization; using Microsoft.AspNetCore.Http.HttpResults;
using FluentValidation; using Microsoft.Data.SqlClient;
using Microsoft.EntityFrameworkCore;
using Microsoft.SemanticKernel; using Microsoft.SemanticKernel;
using SqlDatabaseVectorSearch.Components; using SqlDatabaseVectorSearch.Models;
using SqlDatabaseVectorSearch.ContentDecoders;
using SqlDatabaseVectorSearch.Data;
using SqlDatabaseVectorSearch.Extensions;
using SqlDatabaseVectorSearch.Services; using SqlDatabaseVectorSearch.Services;
using SqlDatabaseVectorSearch.Settings; using SqlDatabaseVectorSearch.Settings;
using SqlDatabaseVectorSearch.TextChunkers;
using TinyHelpers.AspNetCore.Extensions; using TinyHelpers.AspNetCore.Extensions;
using TinyHelpers.AspNetCore.OpenApi; using TinyHelpers.AspNetCore.OpenApi;
@@ -17,24 +12,15 @@ var builder = WebApplication.CreateBuilder(args);
builder.Configuration.AddJsonFile("appsettings.local.json", optional: true, reloadOnChange: true); builder.Configuration.AddJsonFile("appsettings.local.json", optional: true, reloadOnChange: true);
// Add services to the container. // Add services to the container.
var aiSettings = builder.Services.ConfigureAndGet<AzureOpenAISettings>(builder.Configuration, "AzureOpenAI")!; var aiSettings = builder.Configuration.GetSection<AzureOpenAISettings>("AzureOpenAI")!;
var appSettings = builder.Services.ConfigureAndGet<AppSettings>(builder.Configuration, nameof(AppSettings))!; var appSettings = builder.Services.ConfigureAndGet<AppSettings>(builder.Configuration, nameof(AppSettings))!;
builder.Services.AddRazorComponents()
.AddInteractiveServerComponents();
builder.Services.AddBlazorBootstrap();
builder.Services.ConfigureHttpJsonOptions(options =>
{
options.SerializerOptions.Converters.Add(new JsonStringEnumConverter());
});
builder.Services.AddSingleton(TimeProvider.System); builder.Services.AddSingleton(TimeProvider.System);
builder.Services.AddSqlServer<ApplicationDbContext>(builder.Configuration.GetConnectionString("SqlConnection"), optionsAction: options => builder.Services.AddScoped(_ =>
{ {
options.UseQueryTrackingBehavior(QueryTrackingBehavior.NoTracking); var sqlConnection = new SqlConnection(builder.Configuration.GetConnectionString("SqlConnection"));
return sqlConnection;
}); });
builder.Services.AddHybridCache(options => builder.Services.AddHybridCache(options =>
@@ -45,103 +31,100 @@ builder.Services.AddHybridCache(options =>
}; };
}); });
builder.Services.ConfigureHttpClientDefaults(configure =>
{
configure.AddStandardResilienceHandler(options =>
{
options.AttemptTimeout.Timeout = TimeSpan.FromSeconds(15);
options.TotalRequestTimeout.Timeout = TimeSpan.FromMinutes(2);
});
});
// Semantic Kernel is used to generate embeddings and to reformulate questions taking into account all the previous interactions, // Semantic Kernel is used to generate embeddings and to reformulate questions taking into account all the previous interactions,
// so that embeddings themselves can be generated more accurately. // so that embeddings themselves can be generated more accurately.
builder.Services.AddKernel() builder.Services.AddKernel()
.AddAzureOpenAIEmbeddingGenerator(aiSettings.Embedding.Deployment, aiSettings.Embedding.Endpoint, aiSettings.Embedding.ApiKey, modelId: aiSettings.Embedding.ModelId, dimensions: aiSettings.Embedding.Dimensions) .AddAzureOpenAITextEmbeddingGeneration(aiSettings.Embedding.Deployment, aiSettings.Embedding.Endpoint, aiSettings.Embedding.ApiKey, dimensions: aiSettings.Embedding.Dimensions)
.AddAzureOpenAIChatCompletion(aiSettings.ChatCompletion.Deployment, aiSettings.ChatCompletion.Endpoint, aiSettings.ChatCompletion.ApiKey, modelId: aiSettings.ChatCompletion.ModelId); .AddAzureOpenAIChatCompletion(aiSettings.ChatCompletion.Deployment, aiSettings.ChatCompletion.Endpoint, aiSettings.ChatCompletion.ApiKey);
builder.Services.AddKeyedSingleton<IContentDecoder, PdfContentDecoder>(MediaTypeNames.Application.Pdf);
builder.Services.AddKeyedSingleton<IContentDecoder, DocxContentDecoder>("application/vnd.openxmlformats-officedocument.wordprocessingml.document");
builder.Services.AddKeyedSingleton<IContentDecoder, TextContentDecoder>(MediaTypeNames.Text.Plain);
builder.Services.AddKeyedSingleton<IContentDecoder, TextContentDecoder>(MediaTypeNames.Text.Markdown);
builder.Services.AddKeyedSingleton<ITextChunker, DefaultTextChunker>(KeyedService.AnyKey);
builder.Services.AddKeyedSingleton<ITextChunker, MarkdownTextChunker>(MediaTypeNames.Text.Markdown);
builder.Services.AddSingleton<TokenizerService>();
builder.Services.AddSingleton<ChatService>(); builder.Services.AddSingleton<ChatService>();
builder.Services.AddScoped<DocumentService>();
builder.Services.AddScoped<VectorSearchService>(); builder.Services.AddScoped<VectorSearchService>();
builder.Services.AddOpenApi(options => builder.Services.AddOpenApi(options =>
{ {
options.RemoveServerList(); options.AddDefaultResponse();
options.AddDefaultProblemDetailsResponse();
}); });
ValidatorOptions.Global.LanguageManager.Enabled = false;
builder.Services.AddValidatorsFromAssemblyContaining<Program>();
builder.Services.AddDefaultProblemDetails(); builder.Services.AddDefaultProblemDetails();
builder.Services.AddDefaultExceptionHandler(); builder.Services.AddDefaultExceptionHandler();
var app = builder.Build(); var app = builder.Build();
await ConfigureDatabaseAsync(app.Services);
// Configure the HTTP request pipeline. // Configure the HTTP request pipeline.
app.UseHttpsRedirection(); app.UseHttpsRedirection();
app.UseWhen(context => context.IsWebRequest(), builder => app.UseExceptionHandler();
{ app.UseStatusCodePages();
if (!app.Environment.IsDevelopment())
{
builder.UseExceptionHandler("/error", createScopeForErrors: true);
// The default HSTS value is 30 days. if (app.Environment.IsDevelopment())
builder.UseHsts(); {
app.MapOpenApi();
app.UseSwaggerUI(options =>
{
options.RoutePrefix = string.Empty;
options.SwaggerEndpoint("/openapi/v1.json", builder.Environment.ApplicationName);
});
}
var documentsApiGroup = app.MapGroup("/api/documents").WithTags("Documents");
documentsApiGroup.MapGet(string.Empty, async (VectorSearchService vectorSearchService) =>
{
var documents = await vectorSearchService.GetDocumentsAsync();
return TypedResults.Ok(documents);
})
.WithSummary("Gets the list of documents");
documentsApiGroup.MapGet("{documentId:guid}/chunks", async (Guid documentId, VectorSearchService vectorSearchService) =>
{
var documents = await vectorSearchService.GetDocumentChunksAsync(documentId);
return TypedResults.Ok(documents);
})
.WithSummary("Gets the list of chunks of a given document")
.WithDescription("The list does not contain embedding. Use '/api/documents/{documentId}/chunks/{documentChunkId}' to get the embedding for a given chunk.");
documentsApiGroup.MapGet("{documentId:guid}/chunks/{documentChunkId:guid}", async Task<Results<Ok<DocumentChunk>, NotFound>> (Guid documentId, Guid documentChunkId, VectorSearchService vectorSearchService) =>
{
var chunk = await vectorSearchService.GetDocumentChunkEmbeddingAsync(documentId, documentChunkId);
if (chunk is null)
{
return TypedResults.NotFound();
} }
builder.UseStatusCodePagesWithRedirects("/error?code={0}"); return TypedResults.Ok(chunk);
}); })
.ProducesProblem(StatusCodes.Status404NotFound)
.WithSummary("Gets the details of a given chunk, including its embedding");
app.UseWhen(context => context.IsApiRequest(), builder => documentsApiGroup.MapPost(string.Empty, async (IFormFile file, VectorSearchService vectorSearchService,
[Description("The unique identifier of the document. If not provided, a new one will be generated. If you specify an existing documentId, the corresponding document will be overwritten.")] Guid? documentId = null) =>
{ {
app.UseExceptionHandler(new ExceptionHandlerOptions using var stream = file.OpenReadStream();
{ documentId = await vectorSearchService.ImportAsync(stream, file.FileName, documentId);
StatusCodeSelector = exception => exception switch
{
NotSupportedException => StatusCodes.Status501NotImplemented,
_ => StatusCodes.Status500InternalServerError
}
});
builder.UseStatusCodePages(); return TypedResults.Ok(new UploadDocumentResponse(documentId.Value));
}); })
.DisableAntiforgery()
.ProducesProblem(StatusCodes.Status400BadRequest)
.WithSummary("Uploads a document")
.WithDescription("Uploads a document to SQL Database and saves its embedding using the new native Vector type. The document will be indexed and used to answer questions. Currently, only PDF files are supported.");
app.MapOpenApi(); documentsApiGroup.MapDelete("{documentId:guid}", async (Guid documentId, VectorSearchService vectorSearchService) =>
app.UseSwaggerUI(options =>
{ {
options.SwaggerEndpoint("/openapi/v1.json", builder.Environment.ApplicationName); await vectorSearchService.DeleteDocumentAsync(documentId);
}); return TypedResults.NoContent();
})
.WithSummary("Deletes a document")
.WithDescription("This endpoint deletes the document and all its chunks.");
app.UseRouting(); app.MapPost("/api/ask", async (Question question, VectorSearchService vectorSearchService,
app.UseRequestLocalization(); [Description("If true, the question will be reformulated taking into account the context of the chat identified by the given ConversationId.")] bool reformulate = true) =>
{
app.UseAntiforgery(); var response = await vectorSearchService.AskQuestionAsync(question, reformulate);
return TypedResults.Ok(response);
app.MapStaticAssets(); })
app.MapRazorComponents<App>() .WithSummary("Asks a question")
.AddInteractiveServerRenderMode(); .WithDescription("The question will be reformulated taking into account the context of the chat identified by the given ConversationId.")
.WithTags("Ask");
app.MapEndpoints();
app.Run(); app.Run();
static async Task ConfigureDatabaseAsync(IServiceProvider serviceProvider)
{
await using var scope = serviceProvider.CreateAsyncScope();
var dbContext = scope.ServiceProvider.GetRequiredService<ApplicationDbContext>();
await dbContext.Database.MigrateAsync();
}
@@ -5,7 +5,8 @@
 "commandName": "Project",
 "dotnetRunMessages": true,
 "launchBrowser": true,
-"applicationUrl": "https://localhost:7025;http://localhost:5178",
+"launchUrl": "",
+"applicationUrl": "https://localhost:7024;http://localhost:5178",
 "environmentVariables": {
 "ASPNETCORE_ENVIRONMENT": "Development"
 }
+48 -200
@@ -1,247 +1,95 @@
using System.Runtime.CompilerServices; using System.Text;
using System.Text;
using Microsoft.Extensions.Caching.Hybrid; using Microsoft.Extensions.Caching.Hybrid;
using Microsoft.Extensions.Options;
using Microsoft.SemanticKernel.ChatCompletion; using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.AzureOpenAI;
using OpenAI.Chat;
using SqlDatabaseVectorSearch.Models;
using SqlDatabaseVectorSearch.Settings;
using Entities = SqlDatabaseVectorSearch.Data.Entities;
namespace SqlDatabaseVectorSearch.Services; namespace SqlDatabaseVectorSearch.Services;
public class ChatService(IChatCompletionService chatCompletionService, TokenizerService tokenizerService, HybridCache cache, IOptions<AppSettings> appSettingsOptions, ILogger<ChatService> logger) public class ChatService(IChatCompletionService chatCompletionService, HybridCache cache)
{ {
private readonly AppSettings appSettings = appSettingsOptions.Value; public async Task<string> CreateQuestionAsync(Guid conversationId, string question)
private static readonly string systemPromptForReformulation = """
You are a helpful assistant that reformulates questions to perform embeddings search.
Your task is to reformulate the question taking into account the context of the chat.
The reformulated question must always explicitly contain the subject of the question.
You MUST reformulate the question in the SAME language as the user's question. For example, if the user asks a question in English, the reformulated question MUST be in English. If the user asks in Italian, the reformulated question MUST be in Italian.
If asking a clarifying question to the user would help, ask the question.
Never add "in this chat", "in the context of this chat", "in the context of our conversation", "search for" or something like that in your answer.
""";
private static readonly string systemPromptForAnswering = """
You can use only the information provided in this chat to answer questions. If you don't know the answer, reply suggesting to refine the question.
For example, if the user asks "What is the capital of Italy?" and in this chat there isn't information about Italy, you should reply something like:
- This information isn't available in the given context.
- I'm sorry, I don't know the answer to that question.
- I don't have that information.
- I don't know.
- Given the context, I can't answer that question.
- I'm sorry, I don't have enough information to answer that question.
Never answer questions that are not related to this chat.
LANGUAGE RULE: You MUST ALWAYS answer in the SAME language as the user's question. For example, if the user asks a question in English, the answer MUST be in English. If the user asks in Italian, the answer MUST be in Italian. This rule applies NO MATTER what language the documents are written in. The language of your response must match the language of the question, NOT the language of the documents.
FORMATTING REQUIREMENT: Your answer MUST ALWAYS end with a period followed by a space before the citations block.
If your answer doesn't naturally end with a period, you MUST add one followed by a space.
After the answer, you need to include citations following the XML format below ONLY IF you know the answer and are providing information from the context. If you do NOT know the answer, DO NOT include the citations section at all.
<citation document-id="document_id" chunk-id="chunk_id" filename="string" page-number="page_number" index-on-page="index_on_page">exact quote here</citation>
<citation document-id="document_id" chunk-id="chunk_id" filename="string" page-number="page_number" index-on-page="index_on_page">exact quote here</citation>
The entire list of XML citations MUST be enclosed between 【 and 】 (U+3010 and U+3011) and must exactly match the above format.
The quote in each <citation> MUST be MAXIMUM 5 words, taken word-for-word from the search result.
IMPORTANT CITATION RULES:
1. NEVER put citations inside your answer text.
2. ALWAYS provide your complete answer FIRST.
3. ONLY AFTER completing your answer, add ALL citations in a block at the very end.
4. The citations block MUST be the last thing in your response, with absolutely nothing (no text, no spaces, no newlines, no punctuation, no comments) after it.
5. NEVER reference citations by number or mention them in your answer text.
6. The citations MUST ALWAYS follow the XML format exactly as shown below. Any other format is NOT ACCEPTED.
7. If you add anything after the citations block, your answer will be considered invalid.
8. If you do NOT know the answer, DO NOT include the citations block at all.
9. ALWAYS check that your answer ends with a period followed by a space before adding citations.
---
Example of a correct answer:
The capital of Italy is Rome.
<citation document-id="123" chunk-id="456" filename="italy.pdf" page-number="1" index-on-page="1">capital of Italy is Rome</citation>
Example of a correct answer when you do NOT know the answer:
I'm sorry, I don't know the answer to that question.
Example of an incorrect answer (NOT ACCEPTED):
The capital of Italy is Rome
<citation document-id="123" chunk-id="456" filename="italy.pdf" page-number="1" index-on-page="1">capital of Italy is Rome</citation>
Thank you for your question.
Another incorrect example (NOT ACCEPTED):
The capital of Italy is Rome.
<citation document-id="123" chunk-id="456" filename="italy.pdf" page-number="1" index-on-page="1">capital of Italy is Rome</citation>
[1] italy.pdf, page 1
---
Only the correct format is accepted. If you do not follow the XML format exactly, or if you add anything after the citations block, your answer will be considered invalid.
If you do NOT know the answer, DO NOT include the citations block at all.
Remember to ALWAYS end your answer with a period followed by a space before adding citations.
""";
public async Task<ChatResponse> CreateReformulateQuestionAsync(Guid conversationId, string question, CancellationToken cancellationToken = default)
{ {
var chat = await GetChatHistoryAsync(conversationId, cancellationToken); var chat = await GetChatHistoryAsync(conversationId);
var settings = new AzureOpenAIPromptExecutionSettings
{
ChatSystemPrompt = systemPromptForReformulation
};
var embeddingQuestion = $""" var embeddingQuestion = $"""
Reformulate the following question: Reformulate the following question taking into account the context of the chat to perform embeddings search:
--- ---
{question} {question}
---
You must reformulate the question in the same language of the user's question.
Never add "in this chat", "in the context of this chat", "in the context of our conversation", "search for" or something like that in your answer.
"""; """;
chat.AddUserMessage(embeddingQuestion); chat.AddUserMessage(embeddingQuestion);
var reformulatedQuestion = await chatCompletionService.GetChatMessageContentAsync(chat, settings, cancellationToken: cancellationToken); var reformulatedQuestion = await chatCompletionService.GetChatMessageContentAsync(chat)!;
chat.AddAssistantMessage(reformulatedQuestion.Content!); chat.AddAssistantMessage(reformulatedQuestion.Content!);
await UpdateCacheAsync(conversationId, chat, cancellationToken); await UpdateCacheAsync(conversationId, chat);
var tokenUsage = GetTokenUsage(reformulatedQuestion); return reformulatedQuestion.Content!;
logger.LogDebug("Reformulation: {TokenUsage}", tokenUsage);
return new(reformulatedQuestion.Content!, tokenUsage);
} }
public async Task<ChatResponse> AskQuestionAsync(Guid conversationId, IEnumerable<Entities.DocumentChunk> chunks, string question, CancellationToken cancellationToken = default) public async Task<string> AskQuestionAsync(Guid conversationId, IEnumerable<string> chunks, string question)
{ {
var (chat, settings) = CreateChatAsync(chunks, question); var chat = new ChatHistory("""
You can use only the information provided in this chat to answer questions. If you don't know the answer, reply suggesting to refine the question.
For example, if the user asks "What is the capital of France?" and in this chat there isn't information about France, you should reply something like "This information isn't available in the given context".
Never answer to questions that are not related to this chat.
You must answer in the same language of the user's question.
""");
var answer = await chatCompletionService.GetChatMessageContentAsync(chat, settings, cancellationToken: cancellationToken); var prompt = new StringBuilder("""
// Add question and answer to the chat history.
await SetChatHistoryAsync(conversationId, question, answer.Content!, cancellationToken);
var tokenUsage = GetTokenUsage(answer);
logger.LogDebug("Ask question: {TokenUsage}", tokenUsage);
return new(answer.Content!, tokenUsage);
}
public async IAsyncEnumerable<ChatResponse> AskStreamingAsync(Guid conversationId, IEnumerable<Entities.DocumentChunk> chunks, string question, [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
var (chat, settings) = CreateChatAsync(chunks, question);
var answer = new StringBuilder();
await foreach (var token in chatCompletionService.GetStreamingChatMessageContentsAsync(chat, settings, cancellationToken: cancellationToken))
{
if (!string.IsNullOrEmpty(token.Content))
{
yield return new(token.Content);
answer.Append(token.Content);
}
else if (token.Content is null)
{
// Token usage is returned in the last message, when the Content is null.
var tokenUsage = GetTokenUsage(token);
if (tokenUsage is not null)
{
logger.LogDebug("Ask streaming: {TokenUsage}", tokenUsage);
yield return new(null, tokenUsage);
}
}
}
// Add question and answer to the chat history.
await SetChatHistoryAsync(conversationId, question, answer.ToString(), cancellationToken).ConfigureAwait(false);
}
private static TokenUsage? GetTokenUsage(Microsoft.SemanticKernel.ChatMessageContent message) =>
message.InnerContent is ChatCompletion content && content.Usage is not null
? new(content.Usage.InputTokenCount, content.Usage.OutputTokenCount) : null;
private static TokenUsage? GetTokenUsage(Microsoft.SemanticKernel.StreamingChatMessageContent message) =>
message.InnerContent is StreamingChatCompletionUpdate content && content.Usage is not null
? new(content.Usage.InputTokenCount, content.Usage.OutputTokenCount) : null;
private (ChatHistory Chat, AzureOpenAIPromptExecutionSettings Settings) CreateChatAsync(IEnumerable<Entities.DocumentChunk> chunks, string question)
{
var settings = new AzureOpenAIPromptExecutionSettings
{
MaxTokens = appSettings.MaxOutputTokens,
ChatSystemPrompt = systemPromptForAnswering
};
var prompt = new StringBuilder($"""
Answer the following question:
---
{question}
=====
Using the following information: Using the following information:
"""); """);
var availableTokens = appSettings.MaxInputTokens // TODO: Ensure that chunks are not too long, according to the model max token.
- tokenizerService.CountChatCompletionTokens(systemPromptForAnswering) // System prompt. foreach (var text in chunks)
- tokenizerService.CountChatCompletionTokens(prompt.ToString()) // Initial user prompt.
- appSettings.MaxOutputTokens; // To ensure there is enough space for the answer.
foreach (var chunk in chunks)
{ {
var text = $"--- {chunk.Document.Name} (Document ID: {chunk.Document.Id} | Chunk ID: {chunk.Id} | Page Number: {chunk.PageNumber} | Index on Page: {chunk.IndexOnPage}) {Environment.NewLine}{chunk.Content}{Environment.NewLine}"; prompt.AppendLine("---");
var tokenCount = tokenizerService.CountChatCompletionTokens(text);
if (tokenCount > availableTokens)
{
// There isn't enough space to add the current chunk.
break;
}
prompt.Append(text); prompt.Append(text);
availableTokens -= tokenCount;
if (availableTokens <= 0)
{
// There isn't enough space to add more chunks.
break;
}
} }
var chat = new ChatHistory(); prompt.AppendLine($"""
=====
Answer the following question:
---
{question}
""");
chat.AddUserMessage(prompt.ToString()); chat.AddUserMessage(prompt.ToString());
return (chat, settings); var answer = await chatCompletionService.GetChatMessageContentAsync(chat)!;
// Add question and answer to the chat history.
await SetChatHistoryAsync(conversationId, question, answer.Content!);
return answer.Content!;
} }
private async Task UpdateCacheAsync(Guid conversationId, ChatHistory chat, CancellationToken cancellationToken) private async Task UpdateCacheAsync(Guid conversationId, ChatHistory chat)
{ => await cache.SetAsync(conversationId.ToString(), chat);
if (chat.Count > appSettings.MessageLimit)
{
chat.RemoveRange(0, chat.Count - appSettings.MessageLimit);
}
await cache.SetAsync(conversationId.ToString(), chat, cancellationToken: cancellationToken); private async Task<ChatHistory> GetChatHistoryAsync(Guid conversationId)
}
private async Task<ChatHistory> GetChatHistoryAsync(Guid conversationId, CancellationToken cancellationToken)
{ {
var chat = await cache.GetOrCreateAsync(conversationId.ToString(), (cancellationToken) => var historyCache = await cache.GetOrCreateAsync(conversationId.ToString(),
(cancellationToken) =>
{ {
return ValueTask.FromResult<ChatHistory>([]); return ValueTask.FromResult<ChatHistory>([]);
}, cancellationToken: cancellationToken); });
var chat = new ChatHistory(historyCache);
return chat; return chat;
} }
private async Task SetChatHistoryAsync(Guid conversationId, string question, string answer, CancellationToken cancellationToken) private async Task SetChatHistoryAsync(Guid conversationId, string question, string answer)
{ {
var chat = await GetChatHistoryAsync(conversationId, cancellationToken); var history = await GetChatHistoryAsync(conversationId);
chat.AddUserMessage(question); history.AddUserMessage(question);
chat.AddAssistantMessage(answer); history.AddAssistantMessage(answer);
await UpdateCacheAsync(conversationId, chat, cancellationToken); await UpdateCacheAsync(conversationId, history);
} }
} }
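Review note: the chunk loop in `CreateChatAsync` (left-hand side of this diff) stops appending chunks once the remaining input-token budget is used up. The snippet below is a minimal sketch of that budgeting step in isolation, not the project's code. `ChunkBudget`, `TakeWithinBudget` and the `countTokens` delegate are hypothetical names; the delegate stands in for `TokenizerService.CountChatCompletionTokens`.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkBudget
{
    // Returns the leading chunks that fit in the available token budget.
    // countTokens is an assumed stand-in for a real tokenizer call.
    public static List<string> TakeWithinBudget(
        IEnumerable<string> chunks, int availableTokens, Func<string, int> countTokens)
    {
        var selected = new List<string>();

        foreach (var chunk in chunks)
        {
            var tokenCount = countTokens(chunk);
            if (tokenCount > availableTokens)
            {
                // As in the diff: stop at the first chunk that does not fit.
                break;
            }

            selected.Add(chunk);
            availableTokens -= tokenCount;
        }

        return selected;
    }
}
```

Because the loop breaks at the first chunk that does not fit, a later, smaller chunk is never considered. That keeps the chunks in relevance order, at the cost of possibly leaving part of the budget unused.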
@@ -1,42 +0,0 @@
using System.Data;
using Microsoft.EntityFrameworkCore;
using SqlDatabaseVectorSearch.Data;
using SqlDatabaseVectorSearch.Models;
namespace SqlDatabaseVectorSearch.Services;
public class DocumentService(ApplicationDbContext dbContext)
{
public async Task<IEnumerable<Document>> GetAsync(CancellationToken cancellationToken = default)
{
var documents = await dbContext.Documents.OrderBy(d => d.Name)
.Select(d => new Document(d.Id, d.Name, d.CreationDate, d.Chunks.Count))
.ToListAsync(cancellationToken);
return documents;
}
public async Task<IEnumerable<DocumentChunk>> GetChunksAsync(Guid documentId, CancellationToken cancellationToken = default)
{
var documentChunks = await dbContext.DocumentChunks.Where(c => c.DocumentId == documentId).OrderBy(c => c.Index)
.Select(c => new DocumentChunk(c.Id, c.Index, c.Content, c.PageNumber, c.IndexOnPage, null))
.ToListAsync(cancellationToken);
return documentChunks;
}
public async Task<DocumentChunk?> GetChunkEmbeddingAsync(Guid documentId, Guid documentChunkId, CancellationToken cancellationToken = default)
{
var documentChunk = await dbContext.DocumentChunks.Where(c => c.Id == documentChunkId && c.DocumentId == documentId)
.Select(c => new DocumentChunk(c.Id, c.Index, c.Content, c.PageNumber, c.IndexOnPage, c.Embedding.Memory.ToArray()))
.FirstOrDefaultAsync(cancellationToken);
return documentChunk;
}
public Task DeleteAsync(Guid documentId, CancellationToken cancellationToken = default)
=> dbContext.Documents.Where(d => d.Id == documentId).ExecuteDeleteAsync(cancellationToken);
public Task DeleteAsync(IEnumerable<Guid> documentIds, CancellationToken cancellationToken = default)
=> dbContext.Documents.Where(d => documentIds.Contains(d.Id)).ExecuteDeleteAsync(cancellationToken);
}
@@ -1,18 +0,0 @@
using Microsoft.Extensions.Options;
using Microsoft.ML.Tokenizers;
using SqlDatabaseVectorSearch.Settings;
namespace SqlDatabaseVectorSearch.Services;
public class TokenizerService(IOptions<AzureOpenAISettings> settingsOptions)
{
private readonly TiktokenTokenizer chatCompletiontokenizer = TiktokenTokenizer.CreateForModel(settingsOptions.Value.ChatCompletion.ModelId);
private readonly TiktokenTokenizer embeddingTokenizer = TiktokenTokenizer.CreateForModel(settingsOptions.Value.Embedding.ModelId);
public int CountChatCompletionTokens(string input)
=> chatCompletiontokenizer.CountTokens(input);
public int CountEmbeddingTokens(string input)
=> embeddingTokenizer.CountTokens(input);
}
@@ -1,200 +1,131 @@
using System.Data; using System.Data;
using System.Runtime.CompilerServices; using System.Data.Common;
using System.Text; using System.Text;
using System.Text.RegularExpressions; using System.Text.Json;
using Microsoft.Data.SqlTypes; using Dapper;
using Microsoft.EntityFrameworkCore; using Microsoft.Data.SqlClient;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.Options; using Microsoft.Extensions.Options;
using SqlDatabaseVectorSearch.ContentDecoders; using Microsoft.SemanticKernel.Embeddings;
using SqlDatabaseVectorSearch.Data; using Microsoft.SemanticKernel.Text;
using SqlDatabaseVectorSearch.Models; using SqlDatabaseVectorSearch.Models;
using SqlDatabaseVectorSearch.Settings; using SqlDatabaseVectorSearch.Settings;
using ChatResponse = SqlDatabaseVectorSearch.Models.ChatResponse; using UglyToad.PdfPig;
using Entities = SqlDatabaseVectorSearch.Data.Entities; using UglyToad.PdfPig.DocumentLayoutAnalysis.TextExtractor;
namespace SqlDatabaseVectorSearch.Services; namespace SqlDatabaseVectorSearch.Services;
public partial class VectorSearchService(IServiceProvider serviceProvider, ApplicationDbContext dbContext, DocumentService documentService, IEmbeddingGenerator<string, Embedding<float>> embeddingGenerator, TokenizerService tokenizerService, ChatService chatService, TimeProvider timeProvider, IOptions<AppSettings> appSettingsOptions, ILogger<VectorSearchService> logger) public class VectorSearchService(SqlConnection sqlConnection, ITextEmbeddingGenerationService textEmbeddingGenerationService, ChatService chatService, TimeProvider timeProvider, IOptions<AppSettings> appSettingsOptions)
{ {
private readonly AppSettings appSettings = appSettingsOptions.Value; private readonly AppSettings appSettings = appSettingsOptions.Value;
public async Task<ImportDocumentResponse> ImportAsync(Stream stream, string name, string contentType, Guid? documentId, CancellationToken cancellationToken = default) public async Task<Guid> ImportAsync(Stream stream, string name, Guid? documentId)
{ {
// Extract the contents of the file. // Extract the contents of the file (currently, only PDF files are supported).
var decoder = serviceProvider.GetKeyedService<IContentDecoder>(contentType) ?? throw new NotSupportedException($"Content type '{contentType}' is not supported."); var content = await GetContentAsync(stream);
var chunks = await decoder.DecodeAsync(stream, contentType, cancellationToken);
var chunkContents = chunks.Select(p => p.Content).ToList();
// We get the token count of the whole document because it is the total number of token used by embedding (it may be necessary, for example, for cost analysis). await sqlConnection.OpenAsync();
var tokenCount = tokenizerService.CountEmbeddingTokens(string.Join(" ", chunkContents)); await using var transaction = await sqlConnection.BeginTransactionAsync();
var strategy = dbContext.Database.CreateExecutionStrategy(); if (documentId.HasValue)
var document = await strategy.ExecuteAsync(async (cancellationToken) =>
{ {
await dbContext.Database.BeginTransactionAsync(cancellationToken); // If the user is importing a document that already exists, delete the previous one.
await DeleteDocumentAsync(documentId.Value, transaction);
if (documentId.HasValue)
{
// If the user is importing a document that already exists, delete the previous one.
await documentService.DeleteAsync(documentId.Value, cancellationToken);
}
var document = new Entities.Document { Id = documentId.GetValueOrDefault(), Name = name, CreationDate = timeProvider.GetUtcNow() };
dbContext.Documents.Add(document);
// Process paragraphs in batches.
var embeddings = new List<Embedding<float>>();
foreach (var batch in chunkContents.Chunk(appSettings.EmbeddingBatchSize))
{
logger.LogDebug("Processing batch of {Count} chunks for embedding generation...", batch.Length);
// Generate embeddings for this batch.
var batchEmbeddings = await embeddingGenerator.GenerateAsync(batch, cancellationToken: cancellationToken);
embeddings.AddRange(batchEmbeddings);
}
// Save the document chunks and the corresponding embedding in the database.
foreach (var (index, embedding) in embeddings.Index())
{
var chunk = chunks.ElementAt(index);
logger.LogDebug("Storing a chunk of {TokenCount} tokens.", tokenizerService.CountEmbeddingTokens(chunk.Content));
var documentChunk = new Entities.DocumentChunk
{
Document = document,
Index = index,
PageNumber = chunk.PageNumber,
IndexOnPage = chunk.IndexOnPage,
Content = chunk.Content,
Embedding = new SqlVector<float>(embedding.Vector)
};
dbContext.DocumentChunks.Add(documentChunk);
}
await dbContext.SaveChangesAsync(cancellationToken);
await dbContext.Database.CommitTransactionAsync(cancellationToken);
return document;
}, cancellationToken);
return new(document.Id, tokenCount);
}
public async Task<Response> AskQuestionAsync(Question question, bool reformulate = true, CancellationToken cancellationToken = default)
{
// If the user doesn't want to reformulate the question, CreateContextAsync returns the original one.
var (reformulatedQuestion, embeddingTokenCount, chunks) = await CreateContextAsync(question, reformulate, cancellationToken);
var (fullAnswer, tokenUsage) = await chatService.AskQuestionAsync(question.ConversationId, chunks, reformulatedQuestion.Text!, cancellationToken);
// Extract citations from the answer.
var (answer, citations) = ExtractCitations(fullAnswer);
return new(question.Text, reformulatedQuestion.Text!, answer, StreamState.End, new(reformulatedQuestion.TokenUsage, embeddingTokenCount, tokenUsage), citations);
}
public async IAsyncEnumerable<Response> AskStreamingAsync(Question question, bool reformulate = true, [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
// If the user doesn't want to reformulate the question, CreateContextAsync returns the original one.
var (reformulatedQuestion, embeddingTokenCount, chunks) = await CreateContextAsync(question, reformulate, cancellationToken);
var answerStream = chatService.AskStreamingAsync(question.ConversationId, chunks, reformulatedQuestion.Text!, cancellationToken: cancellationToken);
// The first message contains the question and the corresponding token usage (if reformulated).
yield return new(question.Text, reformulatedQuestion.Text!, null, StreamState.Start, new(reformulatedQuestion.TokenUsage, embeddingTokenCount, null));
TokenUsageResponse? tokenUsageResponse = null;
var fullAnswer = new StringBuilder();
var citationsStarted = false;
// Returns each token as a partial response.
await foreach (var (token, tokenUsage) in answerStream)
{
if (token is not null) // token can be null when the stream ends.
{
fullAnswer.Append(token);
if (token.Contains('【'))
{
// Citations start when we encounter a token containing a 【 character.
// We need to track it because we don't want to return the citations in the actual response.
citationsStarted = true;
}
if (!citationsStarted)
{
yield return new(token, StreamState.Append);
}
}
else
{
// Token usage is expected in the last message, when token is null.
tokenUsageResponse ??= tokenUsage is not null ? new(tokenUsage) : null;
}
} }
// Extract citations at the end of streaming. documentId = await sqlConnection.ExecuteScalarAsync<Guid>($"""
var (_, citations) = ExtractCitations(fullAnswer.ToString()); INSERT INTO Documents (Id, [Name], CreationDate)
yield return new(null, StreamState.End, tokenUsageResponse, citations); OUTPUT INSERTED.Id
VALUES (@Id, @Name, @CreationDate);
""", new { Id = documentId.GetValueOrDefault(Guid.NewGuid()), Name = name, CreationDate = timeProvider.GetUtcNow() },
transaction);
// Split the content into chunks and generate the embeddings for each one.
var paragraphs = TextChunker.SplitPlainTextParagraphs(TextChunker.SplitPlainTextLines(content, appSettings.MaxTokensPerLine), appSettings.MaxTokensPerParagraph, appSettings.OverlapTokens);
var embeddings = await textEmbeddingGenerationService.GenerateEmbeddingsAsync(paragraphs);
// Save the document chunks and the corresponding embedding in the database.
foreach (var (index, paragraph) in paragraphs.Index())
{
await sqlConnection.ExecuteAsync($"""
INSERT INTO DocumentChunks (DocumentId, [Index], Content, Embedding)
VALUES (@DocumentId, @Index, @Content, CAST(@Embedding AS VECTOR({embeddings[index].Length})));
""", new { DocumentId = documentId, Index = index, Content = paragraph, Embedding = JsonSerializer.Serialize(embeddings[index]) },
transaction);
}
await transaction.CommitAsync();
return documentId.Value;
} }
private async Task<(ChatResponse ReformulatedQuestion, int EmbeddingTokenCount, IEnumerable<Entities.DocumentChunk> Chunks)> CreateContextAsync(Question question, bool reformulate, CancellationToken cancellationToken) public async Task<IEnumerable<Document>> GetDocumentsAsync()
{ {
// Reformulate the question taking into account the context of the chat to perform keyword search and embeddings. var documents = await sqlConnection.QueryAsync<Document>("""
var reformulatedQuestion = reformulate ? await chatService.CreateReformulateQuestionAsync(question.ConversationId, question.Text, cancellationToken) : new(question.Text); SELECT Id, [Name], CreationDate, ChunkCount = (SELECT COUNT(*) FROM DocumentChunks WHERE DocumentId = Documents.Id)
FROM Documents
ORDER BY [Name];
""");
var embeddingTokenCount = tokenizerService.CountEmbeddingTokens(reformulatedQuestion.Text!); return documents;
logger.LogDebug("Embedding Token Count: {EmbeddingTokenCount}", embeddingTokenCount); }
public async Task<IEnumerable<DocumentChunk>> GetDocumentChunksAsync(Guid documentId)
{
var documentChunks = await sqlConnection.QueryAsync<DocumentChunk>("""
SELECT Id, [Index], Content
FROM DocumentChunks
WHERE DocumentId = @DocumentId
ORDER BY [Index];
""", new { documentId });
return documentChunks;
}
public async Task<DocumentChunk?> GetDocumentChunkEmbeddingAsync(Guid documentId, Guid documentChunkId)
{
var documentChunk = await sqlConnection.QueryFirstOrDefaultAsync<DocumentChunk>("""
SELECT Id, [Index], Content, CAST(Embedding AS NVARCHAR(MAX)) AS Embedding
FROM DocumentChunks
WHERE Id = @DocumentChunkId AND DocumentId = @DocumentId;
""", new { documentId, documentChunkId });
return documentChunk;
}
public Task DeleteDocumentAsync(Guid documentId, DbTransaction? transaction = null)
=> sqlConnection.ExecuteAsync("DELETE FROM Documents WHERE Id = @DocumentId", new { DocumentId = documentId }, transaction);
public async Task<Response> AskQuestionAsync(Question question, bool reformulate = true)
{
// Reformulate the following question taking into account the context of the chat to perform keyword search and embeddings:
var reformulatedQuestion = reformulate ? await chatService.CreateQuestionAsync(question.ConversationId, question.Text) : question.Text;
// Perform Vector Search on SQL Database. // Perform Vector Search on SQL Database.
var questionEmbedding = await embeddingGenerator.GenerateVectorAsync(reformulatedQuestion.Text!, cancellationToken: cancellationToken); var questionEmbedding = await textEmbeddingGenerationService.GenerateEmbeddingAsync(reformulatedQuestion);
var embeddingVector = new SqlVector<float>(questionEmbedding);
var chunks = await dbContext.DocumentChunks.Include(c => c.Document) var chunks = await sqlConnection.QueryAsync<string>($"""
.OrderBy(c => EF.Functions.VectorDistance("cosine", c.Embedding, embeddingVector)) SELECT TOP (@MaxRelevantChunks) Content
.Take(appSettings.MaxRelevantChunks) FROM DocumentChunks
.ToListAsync(cancellationToken); ORDER BY VECTOR_DISTANCE('cosine', Embedding, CAST(@QuestionEmbedding AS VECTOR({questionEmbedding.Length})));
""", new { appSettings.MaxRelevantChunks, QuestionEmbedding = JsonSerializer.Serialize(questionEmbedding) });
return (reformulatedQuestion, embeddingTokenCount, chunks); var answer = await chatService.AskQuestionAsync(question.ConversationId, chunks, reformulatedQuestion);
return new Response(reformulatedQuestion, answer);
} }
private static (string, IEnumerable<Citation>) ExtractCitations(string? text) private static Task<string> GetContentAsync(Stream stream)
{ {
var citations = new List<Citation>(); var content = new StringBuilder();
if (string.IsNullOrEmpty(text)) // Read the content of the PDF document.
using var pdfDocument = PdfDocument.Open(stream);
foreach (var page in pdfDocument.GetPages().Where(x => x is not null))
{ {
return (text ?? string.Empty, citations); var pageContent = ContentOrderTextExtractor.GetText(page) ?? string.Empty;
content.AppendLine(pageContent);
} }
var matches = CitationRegEx.Matches(text); return Task.FromResult(content.ToString());
foreach (Match match in matches)
{
if (match.Success)
{
citations.Add(new Citation
{
DocumentId = Guid.Parse(match.Groups["documentId"].Value),
ChunkId = Guid.Parse(match.Groups["chunkId"].Value),
FileName = match.Groups["filename"].Value,
PageNumber = int.TryParse(match.Groups["pageNumber"].Value, out var pageNumber) && pageNumber > 0 ? pageNumber : null,
IndexOnPage = int.TryParse(match.Groups["indexOnPage"].Value, out var indexOnPage) ? indexOnPage : 0,
Quote = match.Groups["quote"].Value
});
}
}
// Remove all content between 【 and 】.
var cleanText = RemoveCitationsRegEx.Replace(text, string.Empty).TrimEnd();
return (cleanText, citations.OrderBy(c => c.FileName).ThenBy(c => c.PageNumber));
} }
[GeneratedRegex(@"<citation\s+document-id=(?:""|'|)(?<documentId>[^""']*)(?:""|'|)\s+chunk-id=(?:""|'|)(?<chunkId>[^""']*)(?:""|'|)\s+filename=(?:""|'|)(?<filename>[^""']*)(?:""|'|)\s+page-number=(?:""|'|)(?<pageNumber>[^""']*)(?:""|'|)\s+index-on-page=(?:""|'|)(?<indexOnPage>[^""']*)(?:""|'|)>\s*(?<quote>.*?)\s*</citation>", RegexOptions.Singleline)]
private static partial Regex CitationRegEx { get; }
[GeneratedRegex(@"【.*?】", RegexOptions.Singleline)]
private static partial Regex RemoveCitationsRegEx { get; }
} }
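Review note: `ExtractCitations` (left-hand side of this diff) reads the `<citation>` elements out of the model's answer and then strips everything between 【 and 】 from the text shown to the user. The snippet below is a simplified sketch of that two-step idea, assuming double-quoted attributes and capturing only `document-id` and the quote. `CitationParser` and its members are hypothetical names, and the regular expressions are deliberately less permissive than the ones in the diff.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CitationParser
{
    // Simplified pattern: captures document-id and the quoted text only.
    private static readonly Regex CitationRegex = new(
        @"<citation\s+document-id=""(?<documentId>[^""]*)""[^>]*>\s*(?<quote>.*?)\s*</citation>",
        RegexOptions.Singleline);

    // Matches the whole citations block delimited by U+3010 and U+3011.
    private static readonly Regex CitationsBlockRegex = new(@"【.*?】", RegexOptions.Singleline);

    public static (string Answer, List<(string DocumentId, string Quote)> Citations) Parse(string text)
    {
        var citations = new List<(string DocumentId, string Quote)>();

        foreach (Match match in CitationRegex.Matches(text))
        {
            citations.Add((match.Groups["documentId"].Value, match.Groups["quote"].Value));
        }

        // Remove the citations block so that only the answer text remains.
        var answer = CitationsBlockRegex.Replace(text, string.Empty).TrimEnd();
        return (answer, citations);
    }
}
```

Parsing the citations before removing the block matters here: once the 【...】 span is stripped, the `<citation>` elements are gone from the text as well.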
@@ -2,21 +2,13 @@
public class AppSettings public class AppSettings
{ {
public int EmbeddingBatchSize { get; init; } = 32;
public int MaxTokensPerLine { get; init; } = 300; public int MaxTokensPerLine { get; init; } = 300;
public int MaxTokensPerParagraph { get; init; } = 1000; public int MaxTokensPerParagraph { get; init; } = 1024;
public int OverlapTokens { get; init; } = 100; public int OverlapTokens { get; init; } = 100;
public int MaxRelevantChunks { get; init; } = 5; public int MaxRelevantChunks { get; init; } = 5;
public int MaxInputTokens { get; init; } = 16385;
public int MaxOutputTokens { get; init; } = 800;
public TimeSpan MessageExpiration { get; init; } public TimeSpan MessageExpiration { get; init; }
public int MessageLimit { get; set; } = 20;
} }
@@ -4,7 +4,7 @@ public class AzureOpenAISettings
{ {
public required ServiceSettings ChatCompletion { get; init; } public required ServiceSettings ChatCompletion { get; init; }
public required EmbeddingSettings Embedding { get; init; } public required EmbeddingServiceSettings Embedding { get; init; }
} }
public class ServiceSettings public class ServiceSettings
@@ -13,12 +13,10 @@ public class ServiceSettings
public required string Deployment { get; init; } public required string Deployment { get; init; }
public required string ModelId { get; init; }
public required string ApiKey { get; init; } public required string ApiKey { get; init; }
} }
public class EmbeddingSettings : ServiceSettings public class EmbeddingServiceSettings : ServiceSettings
{ {
public int? Dimensions { get; set; } public int? Dimensions { get; set; }
} }
@@ -1,35 +1,22 @@
 <Project Sdk="Microsoft.NET.Sdk.Web">
 <PropertyGroup>
-<TargetFramework>net10.0</TargetFramework>
+<TargetFramework>net9.0</TargetFramework>
 <ImplicitUsings>enable</ImplicitUsings>
 <Nullable>enable</Nullable>
-<NoWarn>$(NoWarn);SKEXP0010;SKEXP0050</NoWarn>
+<NoWarn>$(NoWarn);SKEXP0001;SKEXP0010;SKEXP0050;EXTEXP0018</NoWarn>
 </PropertyGroup>
 <ItemGroup>
-<PackageReference Include="Blazor.Bootstrap" Version="3.5.0" />
-<PackageReference Include="DocumentFormat.OpenXml" Version="3.5.1" />
-<PackageReference Include="EntityFrameworkCore.Exceptions.SqlServer" Version="10.0.1" />
-<PackageReference Include="FluentValidation.DependencyInjectionExtensions" Version="12.1.1" />
-<PackageReference Include="Microsoft.AspNetCore.OpenApi" Version="10.0.9" />
-<PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="10.0.9" />
-<PackageReference Include="Microsoft.EntityFrameworkCore.Tools" Version="10.0.9">
-<PrivateAssets>all</PrivateAssets>
-<IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
-</PackageReference>
-<PackageReference Include="Microsoft.Extensions.Caching.Hybrid" Version="10.7.0" />
-<PackageReference Include="Microsoft.Extensions.Http.Resilience" Version="10.7.0" />
-<PackageReference Include="Microsoft.ML.Tokenizers" Version="2.0.0" />
-<PackageReference Include="Microsoft.ML.Tokenizers.Data.Cl100kBase" Version="2.0.0" />
-<PackageReference Include="Microsoft.ML.Tokenizers.Data.O200kBase" Version="2.0.0" />
-<PackageReference Include="Microsoft.SemanticKernel" Version="1.77.0" />
-<PackageReference Include="MimeMapping" Version="4.0.0" />
-<PackageReference Include="MinimalHelpers.FluentValidation" Version="1.1.8" />
-<PackageReference Include="MinimalHelpers.Routing.Analyzers" Version="1.2.2" />
-<PackageReference Include="PdfPig" Version="0.1.14" />
-<PackageReference Include="Swashbuckle.AspNetCore.SwaggerUI" Version="10.2.1" />
-<PackageReference Include="TinyHelpers.AspNetCore" Version="4.2.12" />
-</ItemGroup>
+<PackageReference Include="Dapper" Version="2.1.35" />
+<PackageReference Include="Microsoft.AspNetCore.OpenApi" Version="9.0.0" />
+<PackageReference Include="Microsoft.Data.SqlClient" Version="5.2.2" />
+<PackageReference Include="Microsoft.Extensions.Caching.Hybrid" Version="9.0.0-preview.9.24556.5" />
+<PackageReference Include="Microsoft.SemanticKernel" Version="1.32.0" />
+<PackageReference Include="MinimalHelpers.OpenApi" Version="2.1.2" />
+<PackageReference Include="PdfPig" Version="0.1.9" />
+<PackageReference Include="Swashbuckle.AspNetCore.SwaggerUI" Version="7.2.0" />
+<PackageReference Include="TinyHelpers.AspNetCore" Version="4.0.6" />
+</ItemGroup>
 </Project>
@@ -1,19 +0,0 @@
using Microsoft.Extensions.Options;
using Microsoft.SemanticKernel.Text;
using SqlDatabaseVectorSearch.Services;
using SqlDatabaseVectorSearch.Settings;
namespace SqlDatabaseVectorSearch.TextChunkers;
public class DefaultTextChunker(TokenizerService tokenizerService, IOptions<AppSettings> appSettingsOptions) : ITextChunker
{
private readonly AppSettings appSettings = appSettingsOptions.Value;
public IList<string> Split(string text)
{
var lines = TextChunker.SplitPlainTextLines(text, appSettings.MaxTokensPerLine, tokenizerService.CountEmbeddingTokens);
var paragraphs = TextChunker.SplitPlainTextParagraphs(lines, appSettings.MaxTokensPerParagraph, appSettings.OverlapTokens, tokenCounter: tokenizerService.CountEmbeddingTokens);
return paragraphs;
}
}
@@ -1,6 +0,0 @@
namespace SqlDatabaseVectorSearch.TextChunkers;
public interface ITextChunker
{
IList<string> Split(string text);
}
@@ -1,19 +0,0 @@
using Microsoft.Extensions.Options;
using Microsoft.SemanticKernel.Text;
using SqlDatabaseVectorSearch.Services;
using SqlDatabaseVectorSearch.Settings;
namespace SqlDatabaseVectorSearch.TextChunkers;
public class MarkdownTextChunker(TokenizerService tokenizerService, IOptions<AppSettings> appSettingsOptions) : ITextChunker
{
private readonly AppSettings appSettings = appSettingsOptions.Value;
public IList<string> Split(string text)
{
var lines = TextChunker.SplitMarkDownLines(text, appSettings.MaxTokensPerLine, tokenizerService.CountEmbeddingTokens);
var paragraphs = TextChunker.SplitMarkdownParagraphs(lines, appSettings.MaxTokensPerParagraph, appSettings.OverlapTokens, tokenCounter: tokenizerService.CountEmbeddingTokens);
return paragraphs;
}
}
@@ -1,12 +0,0 @@
using FluentValidation;
using SqlDatabaseVectorSearch.Models;
namespace SqlDatabaseVectorSearch.Validations;
public class QuestionValidator : AbstractValidator<Question>
{
public QuestionValidator()
{
RuleFor(x => x.Text).NotEmpty().MaximumLength(4096).WithName("Question Text");
}
}
@@ -3,8 +3,7 @@
 "LogLevel": {
 "Default": "Information",
 "Microsoft.AspNetCore": "Warning",
-"Microsoft.AspNetCore.Watch.BrowserRefresh": "Warning",
-"SqlDatabaseVectorSearch": "Debug"
+"Microsoft.KernelMemory": "Debug"
 }
 }
 }
+4 -10
@@ -6,13 +6,11 @@
 "ChatCompletion": {
 "Endpoint": "",
 "Deployment": "",
-"ModelId": "", // gpt-4o, gpt-4, gpt-3.5, etc. Note that for gpt-4.1 and gpt-5 models, the ModelId must be set to gpt-4o.
 "ApiKey": ""
 },
 "Embedding": {
 "Endpoint": "",
 "Deployment": "",
-"ModelId": "", // text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002
 "ApiKey": "",
 // Set this value only if you're using a model that allows to specify the dimensions of the embeddings
 // (e.g. text-embedding-3-small or text-embedding-3-large). Currently, a maximum value of 1998 is supported.
@@ -20,15 +18,11 @@
 }
 },
 "AppSettings": {
-"EmbeddingBatchSize": 32,
-"MaxTokensPerLine": 300,
-"MaxTokensPerParagraph": 1000,
+"MaxTokenPerLine": 300,
+"MaxTokensPerParagraph": 1024,
 "OverlapTokens": 100,
-"MaxRelevantChunks": 50,
-"MaxInputTokens": 32768,
-"MaxOutputTokens": 800,
-"MessageExpiration": "00:05:00",
-"MessageLimit": 20
+"MaxRelevantChunks": 10,
+"MessageExpiration": "00:05:00"
 },
 "Logging": {
 "LogLevel": {
@@ -1,76 +0,0 @@
body, html {
margin: 0;
padding: 0;
height: 100%;
}
:root {
--bb-sidebar2-width: 270px;
--bb-sidebar2-collapsed-width: 50px;
--bb-sidebar2-background-color: rgba(234, 234, 234, 1);
--bb-sidebar2-top-row-background-color: rgba(0,0,0,0.08);
--bb-sidebar2-top-row-border-color: rgb(194,192,192);
--bb-sidebar2-title-text-color: rgb(0,0,0);
--bb-sidebar2-brand-icon-color: rgb(0,0,0);
--bb-sidebar2-brand-image-width: 24px;
--bb-sidebar2-brand-image-height: 24px;
--bb-sidebar2-title-badge-text-color: rgb(255,255,255);
--bb-sidebar2-title-badge-background-color: rgba(25,135,84,var(--bs-bg-opacity,1));
--bb-sidebar2-navbar-toggler-icon-color: rgb(0,0,0);
--bb-sidebar2-navbar-toggler-background-color: rgba(0,0,0,0.08);
--bb-sidebar2-content-border-color: rgb(194,192,192);
--bb-sidebar2-nav-item-text-color: rgba(0,0,0,0.9);
--bb-sidebar2-nav-item-text-active-color-rgb: 0,0,0;
--bb-sidebar2-nav-item-text-hover-color: rgba(var(--bb-sidebar-nav-item-text-active-color-rgb),0.9);
--bb-sidebar2-nav-item-text-active-color: rgba(var(--bb-sidebar-nav-item-text-active-color-rgb),0.9);
--bb-sidebar2-nav-item-background-hover-color: rgba(var(--bb-sidebar-nav-item-text-active-color-rgb),0.08);
--bb-sidebar2-nav-item-group-background-color: rgba(var(--bb-sidebar-nav-item-text-active-color-rgb),0.08);
}
.bb-sidebar2 nav .nav-item a:hover {
background-color: rgba(0,0,0,0.08) !important;
color: rgba(0,0,0,0.9) !important;
}
.bb-sidebar2 nav .nav-item a.active {
background-color: rgb(194,192,192) !important;
color: rgba(0,0,0,0.9) !important;
}
h1:focus {
outline: none;
}
.valid.modified:not([type=checkbox]) {
outline: 1px solid #26b050;
}
.invalid {
outline: 1px solid red;
}
.validation-message {
color: red;
}
.blazor-error-boundary {
background: url(data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iNTYiIGhlaWdodD0iNDkiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIgeG1sbnM6eGxpbms9Imh0dHA6Ly93d3cudzMub3JnLzE5OTkveGxpbmsiIG92ZXJmbG93PSJoaWRkZW4iPjxkZWZzPjxjbGlwUGF0aCBpZD0iY2xpcDAiPjxyZWN0IHg9IjIzNSIgeT0iNTEiIHdpZHRoPSI1NiIgaGVpZ2h0PSI0OSIvPjwvY2xpcFBhdGg+PC9kZWZzPjxnIGNsaXAtcGF0aD0idXJsKCNjbGlwMCkiIHRyYW5zZm9ybT0idHJhbnNsYXRlKC0yMzUgLTUxKSI+PHBhdGggZD0iTTI2My41MDYgNTFDMjY0LjcxNyA1MSAyNjUuODEzIDUxLjQ4MzcgMjY2LjYwNiA1Mi4yNjU4TDI2Ny4wNTIgNTIuNzk4NyAyNjcuNTM5IDUzLjYyODMgMjkwLjE4NSA5Mi4xODMxIDI5MC41NDUgOTIuNzk1IDI5MC42NTYgOTIuOTk2QzI5MC44NzcgOTMuNTEzIDI5MSA5NC4wODE1IDI5MSA5NC42NzgyIDI5MSA5Ny4wNjUxIDI4OS4wMzggOTkgMjg2LjYxNyA5OUwyNDAuMzgzIDk5QzIzNy45NjMgOTkgMjM2IDk3LjA2NTEgMjM2IDk0LjY3ODIgMjM2IDk0LjM3OTkgMjM2LjAzMSA5NC4wODg2IDIzNi4wODkgOTMuODA3MkwyMzYuMzM4IDkzLjAxNjIgMjM2Ljg1OCA5Mi4xMzE0IDI1OS40NzMgNTMuNjI5NCAyNTkuOTYxIDUyLjc5ODUgMjYwLjQwNyA1Mi4yNjU4QzI2MS4yIDUxLjQ4MzcgMjYyLjI5NiA1MSAyNjMuNTA2IDUxWk0yNjMuNTg2IDY2LjAxODNDMjYwLjczNyA2Ni4wMTgzIDI1OS4zMTMgNjcuMTI0NSAyNTkuMzEzIDY5LjMzNyAyNTkuMzEzIDY5LjYxMDIgMjU5LjMzMiA2OS44NjA4IDI1OS4zNzEgNzAuMDg4N0wyNjEuNzk1IDg0LjAxNjEgMjY1LjM4IDg0LjAxNjEgMjY3LjgyMSA2OS43NDc1QzI2Ny44NiA2OS43MzA5IDI2Ny44NzkgNjkuNTg3NyAyNjcuODc5IDY5LjMxNzkgMjY3Ljg3OSA2Ny4xMTgyIDI2Ni40NDggNjYuMDE4MyAyNjMuNTg2IDY2LjAxODNaTTI2My41NzYgODYuMDU0N0MyNjEuMDQ5IDg2LjA1NDcgMjU5Ljc4NiA4Ny4zMDA1IDI1OS43ODYgODkuNzkyMSAyNTkuNzg2IDkyLjI4MzcgMjYxLjA0OSA5My41Mjk1IDI2My41NzYgOTMuNTI5NSAyNjYuMTE2IDkzLjUyOTUgMjY3LjM4NyA5Mi4yODM3IDI2Ny4zODcgODkuNzkyMSAyNjcuMzg3IDg3LjMwMDUgMjY2LjExNiA4Ni4wNTQ3IDI2My41NzYgODYuMDU0N1oiIGZpbGw9IiNGRkU1MDAiIGZpbGwtcnVsZT0iZXZlbm9kZCIvPjwvZz48L3N2Zz4=) no-repeat 1rem/1.8rem, #b32121;
padding: 1rem 1rem 1rem 3.7rem;
color: white;
}
.blazor-error-boundary::after {
content: "An error has occurred."
}
.content-type-badge {
background-color: #e5e7eb !important;
color: #495057 !important;
border: 1px solid #d1d5db !important;
}
.citation-box {
width: fit-content;
max-width: 100%;
background-color: #f8f9fa;
}
Binary file not shown.

Size: 1.8 KiB

Binary file not shown.

Size: 2.0 KiB

@@ -1 +0,0 @@
<svg id="uuid-adbdae8e-5a41-46d1-8c18-aa73cdbfee32" xmlns="http://www.w3.org/2000/svg" width="18" height="18" viewBox="0 0 18 18"><defs><radialGradient id="uuid-2a7407aa-b787-48dd-a96a-0d81ab6e93bb" cx="-67.981" cy="793.199" r=".45" gradientTransform="translate(-17939.03 20368.029) rotate(45) scale(25.091 -34.149)" gradientUnits="userSpaceOnUse"><stop offset="0" stop-color="#83b9f9" /><stop offset="1" stop-color="#0078d4" /></radialGradient></defs><path d="m0,2.7v12.6c0,1.491,1.209,2.7,2.7,2.7h12.6c1.491,0,2.7-1.209,2.7-2.7V2.7c0-1.491-1.209-2.7-2.7-2.7H2.7C1.209,0,0,1.209,0,2.7ZM10.8,0v3.6c0,3.976,3.224,7.2,7.2,7.2h-3.6c-3.976,0-7.199,3.222-7.2,7.198v-3.598c0-3.976-3.224-7.2-7.2-7.2h3.6c3.976,0,7.2-3.224,7.2-7.2Z" fill="url(#uuid-2a7407aa-b787-48dd-a96a-0d81ab6e93bb)" stroke-width="0" /></svg>

Size: 805 B

@@ -1 +0,0 @@
<svg id="a96792b7-ce28-4ca3-9767-4e065ef4820f" xmlns="http://www.w3.org/2000/svg" width="18" height="18" viewBox="0 0 18 18"><defs><linearGradient id="ef16bf9d-a8b6-4181-b6cd-66fc5203f956" x1="2.59" y1="10.16" x2="15.41" y2="10.16" gradientUnits="userSpaceOnUse"><stop offset="0" stop-color="#005ba1" /><stop offset="0.07" stop-color="#0060a9" /><stop offset="0.36" stop-color="#0071c8" /><stop offset="0.52" stop-color="#0078d4" /><stop offset="0.64" stop-color="#0074cd" /><stop offset="0.82" stop-color="#006abb" /><stop offset="1" stop-color="#005ba1" /></linearGradient><radialGradient id="bf3846c3-4d74-4743-ab9a-f334c248bd92" cx="9.36" cy="10.57" r="7.07" gradientUnits="userSpaceOnUse"><stop offset="0" stop-color="#f2f2f2" /><stop offset="0.58" stop-color="#eee" /><stop offset="1" stop-color="#e6e6e6" /></radialGradient></defs><title>Icon-databases-130</title><path d="M9,5.14c-3.54,0-6.41-1-6.41-2.32V15.18c0,1.27,2.82,2.3,6.32,2.32H9c3.54,0,6.41-1,6.41-2.32V2.82C15.41,4.11,12.54,5.14,9,5.14Z" fill="url(#ef16bf9d-a8b6-4181-b6cd-66fc5203f956)" /><path d="M15.41,2.82c0,1.29-2.87,2.32-6.41,2.32s-6.41-1-6.41-2.32S5.46.5,9,.5s6.41,1,6.41,2.32" fill="#e8e8e8" /><path d="M13.92,2.63c0,.82-2.21,1.48-4.92,1.48S4.08,3.45,4.08,2.63,6.29,1.16,9,1.16s4.92.66,4.92,1.47" fill="#50e6ff" /><path d="M9,3a11.55,11.55,0,0,0-3.89.57A11.42,11.42,0,0,0,9,4.11a11.15,11.15,0,0,0,3.89-.58A11.84,11.84,0,0,0,9,3Z" fill="#198ab3" /><path 
d="M12.9,11.4V8H12v4.13h2.46V11.4ZM5.76,9.73a1.83,1.83,0,0,1-.51-.31.44.44,0,0,1-.12-.32.34.34,0,0,1,.15-.3.68.68,0,0,1,.42-.12,1.62,1.62,0,0,1,1,.29V8.11a2.58,2.58,0,0,0-1-.16,1.64,1.64,0,0,0-1.09.34,1.08,1.08,0,0,0-.42.89c0,.51.32.91,1,1.21a2.88,2.88,0,0,1,.62.36.42.42,0,0,1,.15.32.38.38,0,0,1-.16.31.81.81,0,0,1-.45.11,1.66,1.66,0,0,1-1.09-.42V12a2.17,2.17,0,0,0,1.07.24,1.88,1.88,0,0,0,1.18-.33A1.08,1.08,0,0,0,6.84,11a1.05,1.05,0,0,0-.25-.7A2.42,2.42,0,0,0,5.76,9.73ZM11,11.32a2.34,2.34,0,0,0,.33-1.26A2.32,2.32,0,0,0,11,9a1.81,1.81,0,0,0-.7-.75,2,2,0,0,0-1-.26,2.11,2.11,0,0,0-1.08.27A1.86,1.86,0,0,0,7.49,9a2.46,2.46,0,0,0-.26,1.14,2.26,2.26,0,0,0,.24,1,1.76,1.76,0,0,0,.69.74,2.06,2.06,0,0,0,1,.3l.86,1h1.21L10,12.08A1.79,1.79,0,0,0,11,11.32ZM10,11.07a.94.94,0,0,1-.76.35.92.92,0,0,1-.76-.36,1.52,1.52,0,0,1-.29-1,1.53,1.53,0,0,1,.29-1,1,1,0,0,1,.78-.37.87.87,0,0,1,.75.37,1.62,1.62,0,0,1,.27,1A1.46,1.46,0,0,1,10,11.07Z" fill="url(#bf3846c3-4d74-4743-ab9a-f334c248bd92)" /></svg>

Size: 2.4 KiB

Binary file not shown.

Size: 1010 B

@@ -1,19 +0,0 @@
window.setFocus = (element) => {
if (element) {
element.focus();
}
};
window.scrollTo = (element) => {
if (element) {
element.scrollIntoView();
}
}
window.resetFileInput = (elementId) => {
document.getElementById(elementId).value = '';
};
function getLocalTime(utcDateTime) {
return new Date(utcDateTime).toLocaleString();
}
Binary file not shown.
Binary file not shown.

Size: 70 KiB

Binary file not shown.

Size: 119 KiB