Add Markdown support and refactor text chunking

Updated README.md to include Markdown file support.
Introduced new endpoint for uploading Markdown documents with MIME type handling.
Removed TextChunkerService and created DefaultTextChunker and MarkdownTextChunker classes implementing ITextChunker.
Updated VectorSearchService to utilize the new chunking interface.
Added MimeMapping package reference in the project file.
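
For readers who don't have the changed files open, the refactoring described above (one chunking interface, a default and a Markdown implementation, and a choice driven by MIME type) can be sketched roughly as follows. This is a language-neutral illustration in Python, not the code from this commit: only the names `ITextChunker`, `DefaultTextChunker` and `MarkdownTextChunker` come from the commit message. The `split` method, the `max_chars` parameter, the heading-based splitting rule and the `chunker_for` helper are all assumptions made for the example.

```python
import re
from typing import Protocol


class TextChunker(Protocol):
    """Hypothetical Python counterpart of the ITextChunker interface."""

    def split(self, text: str, max_chars: int) -> list[str]: ...


def _pack(pieces, max_chars):
    """Greedily join pieces into chunks of at most max_chars characters.

    A single piece longer than max_chars is kept whole (assumption: this
    sketch does no further splitting).
    """
    chunks, current = [], ""
    for piece in (p.strip() for p in pieces):
        if not piece:
            continue
        # +2 accounts for the blank line used to join pieces.
        if current and len(current) + len(piece) + 2 > max_chars:
            chunks.append(current)
            current = ""
        current = f"{current}\n\n{piece}" if current else piece
    if current:
        chunks.append(current)
    return chunks


class DefaultTextChunker:
    """Plain-text strategy (assumed): split on blank lines, then pack."""

    def split(self, text, max_chars):
        return _pack(re.split(r"\n\s*\n", text), max_chars)


class MarkdownTextChunker:
    """Markdown strategy (assumed): split before each ATX heading, then pack."""

    def split(self, text, max_chars):
        return _pack(re.split(r"(?m)^(?=#{1,6} )", text), max_chars)


# Hypothetical lookup: MIME type -> chunker, with the default as a fallback.
_CHUNKERS = {"text/markdown": MarkdownTextChunker()}


def chunker_for(content_type: str) -> TextChunker:
    return _CHUNKERS.get(content_type, DefaultTextChunker())
```

A caller such as the search service would then depend only on the interface, for example `chunker_for("text/markdown").split(document_text, 1000)`, which is the kind of decoupling the move from a single service class to `ITextChunker` implementations suggests.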
Marco Minerva
2025-02-14 12:06:52 +01:00
parent e228d0bdbc
commit 5a507e972c
7 changed files with 56 additions and 20 deletions
+1 -1
@@ -1,7 +1,7 @@
# SQL Database Vector Search Sample
A repository that showcases the native VECTOR type in Azure SQL Database to perform embeddings and RAG with Azure OpenAI.
-The application is a Minimal API that exposes endpoints to load documents, generate embeddings and save them into the database as Vectors, and perform searches using Vector Search and RAG. Currently, PDF, DOCX and TXT files are supported. Vectors are saved and retrieved with Entity Framework Core using the [EFCore.SqlServer.VectorSearch](https://github.com/efcore/EfCore.SqlServer.VectorSearch) library. Embedding and Chat Completion are integrated with [Semantic Kernel](https://github.com/microsoft/semantic-kernel).
+The application is a Minimal API that exposes endpoints to load documents, generate embeddings and save them into the database as Vectors, and perform searches using Vector Search and RAG. Currently, PDF, DOCX, TXT and MD files are supported. Vectors are saved and retrieved with Entity Framework Core using the [EFCore.SqlServer.VectorSearch](https://github.com/efcore/EfCore.SqlServer.VectorSearch) library. Embedding and Chat Completion are integrated with [Semantic Kernel](https://github.com/microsoft/semantic-kernel).
> [!NOTE]
> If you prefer to use straight SQL, check out the [sql branch](https://github.com/marcominerva/SqlDatabaseVectorSearch/tree/sql).