3 Commits

Author SHA1 Message Date
Marco Minerva 404cd7565a Switch to SqlVector<float> for embeddings
Updated the application to use SQL Server's native vector data type (`SqlVector<float>`) for embeddings, replacing the previous `float[]` or `string` representations.

- Updated `.editorconfig` with new code style preferences and diagnostic rule severities.
- Modified `DocumentChunk.cs` to use `SqlVector<float>` for the `Embedding` property.
- Updated migrations and `ApplicationDbContextModelSnapshot` to reflect the new `SqlVector<float>` type.
- Replaced `AddAzureSql` with `AddSqlServer` in `Program.cs` and removed `UseVectorSearch`.
- Adjusted `DocumentService` and `VectorSearchService` to handle `SqlVector<float>` and updated vector search logic.
- Removed the `EFCore.SqlServer.VectorSearch` package and upgraded EF Core to `10.0.0-rc.1`.
- Made minor adjustments to OpenAPI configuration and dependency management.
2025-09-10 16:45:16 +02:00
Marco Minerva cdf8356e11 Enhance citation handling and document chunk structure
- Updated `Ask.razor` to change `PageNumber` to a nullable integer and added `IndexOnPage` to the `Citation` class. Adjusted regex for citation parsing.
- Introduced `PageNumber` and `IndexOnPage` properties in `DocumentChunk.cs`, marking `Content` as required.
- Modified migration files to reflect changes in `DocumentChunk` and `Document` entities.
- Updated citation format in `ChatService.cs` to include `index-on-page` and adjusted document chunk text formatting.
- Changed embedding generation method in `VectorSearchService.cs` and updated document chunk creation to include new properties.
2025-06-06 11:26:27 +02:00
Marco Minerva fa81f01c27 Refactor content decoders and restructure data layer
Updated `DocxContentDecoder`, `PdfContentDecoder`, and `TextContentDecoder` to return `Task<IEnumerable<Chunk>>` instead of `Task<string>`, introducing a new `Chunk` record for structured output.

Restructured the `ApplicationDbContext`, `Document`, and `DocumentChunk` classes by moving them to the `SqlDatabaseVectorSearch.Data` namespace for better organization.

Updated database migration files to align with the new entity structure and modified references in `Program.cs`, `DocumentService.cs`, and `VectorSearchService.cs` to use the new namespace.
2025-05-27 17:10:17 +02:00