
2 posts tagged with "AI"


Implementing RAG with Spring AI and Pinecone: A Practical Guide

· 4 min read
Kinser
Software Engineer

Introduction

Retrieval-Augmented Generation (RAG) has emerged as a powerful technique for building AI applications that combine information retrieval with generative language models. This guide demonstrates how to implement a RAG system using Spring AI with Pinecone as the vector database, specifically for creating a documentation chatbot.

What is RAG?

RAG combines two key components:

  1. Retrieval: Finds relevant information from a knowledge base using semantic search
  2. Generation: Uses a language model to generate contextual responses based on retrieved information
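To make the retrieval half concrete, here is a minimal plain-Java sketch of semantic search: rank stored chunk embeddings by cosine similarity to a query embedding and keep the top-k. In the real system this is what Pinecone does server-side; the class and method names here are illustrative, and the embeddings are assumed to have been computed already.

```java
import java.util.*;

// Minimal sketch of the retrieval step: rank chunk embeddings by cosine
// similarity to the query embedding. A vector database such as Pinecone
// performs this at scale; names here are illustrative only.
public class SemanticSearchSketch {

    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Returns the indices of the k chunk embeddings most similar to the query.
    static List<Integer> topK(double[] query, double[][] chunks, int k) {
        List<Integer> idx = new ArrayList<>();
        for (int i = 0; i < chunks.length; i++) idx.add(i);
        idx.sort((i, j) -> Double.compare(cosine(query, chunks[j]), cosine(query, chunks[i])));
        return idx.subList(0, Math.min(k, idx.size()));
    }
}
```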

System Architecture

[Documentation Website] → [Scraper] → [Chunking] → [Pinecone Vector DB]

[User Query] → [Spring AI] → [Semantic Search] → [LLM Generation] → [Response]

Prerequisites

  • Pinecone account (free tier available)
  • Spring Boot application (3.x recommended)
  • Basic understanding of vector databases

Implementation Steps

1. Setting Up Pinecone Integration

Gradle Dependency

implementation "org.springframework.ai:spring-ai-pinecone-store-spring-boot-starter"

Configuration (application.yml)

spring:
  ai:
    vectorstore:
      pinecone:
        apiKey: ${PINECONE_API_KEY}
        environment: ${PINECONE_ENV}
        index-name: ${PINECONE_INDEX}
        project-id: ${PINECONE_PROJECT_ID}

2. Document Processing Pipeline

Web Scraper Implementation

public class DocumentationScraper {

    private final Set<String> visitedUrls = new HashSet<>();
    private final String baseDomain;

    public DocumentationScraper(String baseUrl) {
        this.baseDomain = extractDomain(baseUrl);
    }

    public List<Document> scrape(String startUrl) {
        List<Document> documents = new ArrayList<>();
        scrapeRecursive(startUrl, documents);
        return documents;
    }

    // Includes URL normalization, same-domain checking, and content extraction
    // ... (full implementation as in original)
}

Document Chunking Service

@Service
public class DocumentationService {

    private final VectorStore vectorStore;
    private final TokenTextSplitter textSplitter;

    public DocumentationService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
        this.textSplitter = new TokenTextSplitter(
                2000, // Target chunk size in tokens, suited to technical documentation
                300,  // Minimum chunk size in characters
                100,  // Minimum chunk length to embed
                15,   // Maximum number of chunks per document
                true  // Keep separators to preserve document structure
        );
    }

    public List<Document> processDocument(String content, Map<String, Object> metadata) {
        Document originalDoc = new Document(content, metadata);
        List<Document> chunks = textSplitter.split(originalDoc);

        // Enhance metadata for better retrieval (Map.put cannot be chained,
        // since it returns the previous value rather than the map)
        for (int i = 0; i < chunks.size(); i++) {
            Map<String, Object> chunkMetadata = chunks.get(i).getMetadata();
            chunkMetadata.put("chunk_number", i);
            chunkMetadata.put("total_chunks", chunks.size());
        }
        return chunks;
    }
}

3. Knowledge Base Initialization

REST Endpoint for Loading Data

@RestController
@RequestMapping("/document")
@Tag(name = "AI Module API")
public class DocumentController {

    private final DocumentationService documentationService;

    public DocumentController(DocumentationService documentationService) {
        this.documentationService = documentationService;
    }

    @PostMapping("/load-data")
    public ResponseEntity<String> loadDocumentation() {
        documentationService.scrapeAndStoreDocumentation("https://docs.openwes.top");
        return ResponseEntity.ok("Documentation loaded successfully");
    }
}

4. Implementing RAG in Chat Completions

@Service
public class ChatService {

    private final ChatModel chatModel;
    private final VectorStore vectorStore;

    public ChatService(ChatModel chatModel, VectorStore vectorStore) {
        this.chatModel = chatModel;
        this.vectorStore = vectorStore;
    }

    public String generateResponse(String query) {
        SearchRequest searchRequest = SearchRequest.defaults()
                .withTopK(5)                    // Retrieve the top 5 relevant chunks
                .withSimilarityThreshold(0.7);

        return ChatClient.create(chatModel)
                .prompt()
                .advisors(new QuestionAnswerAdvisor(vectorStore, searchRequest))
                .user(query) // Pass the user's question into the prompt
                .call()
                .content();
    }
}

Best Practices

  1. Optimal Chunking:
     • Technical content: 1500-2500 tokens
     • Narrative content: 500-1000 tokens
     • Include overlap (100-200 tokens) for context preservation
  2. Enhanced Metadata:

     metadata.put("document_type", "API Reference");
     metadata.put("last_updated", "2024-03-01");
     metadata.put("relevance_score", 0.95);
  3. Hybrid Search (where your Spring AI version and vector store support combining keyword and semantic scoring):

     SearchRequest hybridRequest = SearchRequest.defaults()
             .withTopK(5)
             .withHybridSearch(true)
             .withKeywordWeight(0.3);
  4. Prompt Engineering:

     PromptTemplate template = new PromptTemplate("""
             Answer the question based on the following context:
             {context}

             Question: {question}

             If you don't know the answer, say "I don't know".
             """);
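The overlap recommendation above can be illustrated with a small plain-Java sliding-window chunker. This is a simplified stand-in for Spring AI's TokenTextSplitter: it splits on whitespace-separated words rather than real tokenizer tokens, and all names are illustrative.

```java
import java.util.*;

// Illustrative sliding-window chunker with overlap. Real splitters work on
// tokenizer tokens; this sketch uses whitespace-separated words instead.
public class OverlapChunker {

    static List<String> chunk(String text, int size, int overlap) {
        String[] words = text.split("\\s+");
        List<String> chunks = new ArrayList<>();
        int step = size - overlap; // advance by size minus overlap each window
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + size, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) break; // last window reached
        }
        return chunks;
    }
}
```

Each chunk repeats the tail of its predecessor, so a sentence cut at a window boundary still appears intact in one of the two neighboring chunks.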

Performance Optimization

  • Caching: Implement Redis caching for frequent queries
  • Async Processing: Use @Async for document ingestion
  • Batch Processing: Process documents in batches of 50-100
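The batch-processing point can be sketched in plain Java: partition the document list into fixed-size batches, then hand each batch to vectorStore.add(...) (or an @Async worker). The partition helper below is generic and illustrative, not part of Spring AI.

```java
import java.util.*;

// Sketch of batch ingestion: split a large document list into fixed-size
// batches so each vector-store write stays small. Illustrative helper only.
public class BatchPartitioner {

    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            // subList is a view; copy it if the source list will be mutated
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }
}
```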

Evaluation Metrics

| Metric | Target | Measurement Method |
| --- | --- | --- |
| Retrieval Precision | >85% | Human evaluation |
| Response Latency | <2s | Performance testing |
| User Satisfaction | >4/5 | Feedback surveys |
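Retrieval precision from the table is straightforward to compute once human raters have labeled results: it is the fraction of retrieved chunk ids that were judged relevant. A minimal sketch (illustrative names, plain Java):

```java
import java.util.*;

// Sketch of the retrieval-precision metric: |retrieved ∩ relevant| / |retrieved|.
// "relevant" is assumed to come from human evaluation of the retrieved chunks.
public class RetrievalPrecision {

    static double precision(Set<String> retrieved, Set<String> relevant) {
        if (retrieved.isEmpty()) return 0.0; // avoid division by zero
        long hits = retrieved.stream().filter(relevant::contains).count();
        return (double) hits / retrieved.size();
    }
}
```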

Conclusion

This implementation demonstrates how to build a production-ready RAG system using Spring AI and Pinecone. Key advantages include:

  1. Accurate, context-aware responses for documentation queries
  2. Scalable vector search capabilities
  3. Easy integration with existing Spring applications

Next Steps

  1. Implement a user feedback mechanism:

     @PostMapping("/feedback")
     public void logFeedback(@RequestBody FeedbackDTO feedback) {
         // Store feedback for continuous improvement
     }
  2. Add analytics dashboard for query patterns

  3. Implement automatic periodic document updates


Project Reference: The complete implementation is available on GitHub in the module-ai package. Contributions and feedback are welcome!

Building Intelligent Applications with Model Context Protocol (MCP) and Spring AI

· 3 min read
Kinser
Software Engineer

Building Intelligent Applications with Model Context Protocol (MCP) and Spring AI

1. Introduction to MCP Architecture

MCP (Model Context Protocol) standardizes interactions between AI applications and external data sources, enabling seamless integration of tools like databases, APIs, and search engines. Its client-server architecture comprises:

  • MCP Host: The AI application layer (e.g., the Claude chatbot) that users interact with.
  • MCP Client: Handles communication between the host and servers, formatting requests for external systems.
  • MCP Server: Middleware that connects to external resources (e.g., PostgreSQL, Google Drive) and executes operations.
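The host → client → server flow can be modeled with a toy example: the client keeps a registry of named servers and routes each tool call to the right one. This is only a conceptual sketch of the roles above; none of these names belong to the real MCP SDK.

```java
import java.util.*;

// Toy model of the MCP roles, to make the request flow concrete.
// All names are illustrative, not the real MCP SDK API.
public class McpFlowSketch {

    // Stand-in for an MCP server exposing callable tools.
    interface ToolServer {
        String callTool(String tool, Map<String, String> args);
    }

    // Stand-in for the MCP client: routes a host request to the named server.
    static class Client {
        private final Map<String, ToolServer> servers = new HashMap<>();

        void register(String name, ToolServer server) {
            servers.put(name, server);
        }

        String invoke(String serverName, String tool, Map<String, String> args) {
            ToolServer server = servers.get(serverName);
            if (server == null) throw new IllegalArgumentException("unknown server: " + serverName);
            return server.callTool(tool, args);
        }
    }
}
```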

2. Installing the MySQL MCP Server

Step 1: Install the server from this GitHub repository: https://github.com/designcomputer/mysql_mcp_server
Manual installation:

pip install mysql-mcp-server

Step 2: Because we will launch the server with the uv tool, install uv as well by
following this guide: https://docs.astral.sh/uv/getting-started/installation/#installation-methods

3. Project Setup with Spring AI

Step 1: Add Dependencies
Include Spring AI MCP libraries in build.gradle:

implementation 'org.springframework.ai:spring-ai-mcp-client-spring-boot-starter'
implementation 'org.springframework.ai:spring-ai-mcp-client-webflux-spring-boot-starter' // For SSE transport

Configure repositories for milestone builds.


4. Client Integration

Step 1: Configure Spring AI configuration in application.yml

spring:
  ai:
    mcp:
      client:
        enabled: true
        name: mysqlMCP # MCP server name
        version: 1.0.0
        type: SYNC
        request-timeout: 20s
        stdio:
          root-change-notification: true
          servers-configuration: classpath:mcp-servers-config.json # Same format as the Claude Desktop server config

Step 2: Add mcp-servers-config.json

{
  "mcpServers": {
    "mysql": {
      "command": "C:\\Users\\xxx\\.local\\bin\\uv.exe",
      "args": [
        "--directory",
        "C:\\Users\\xxx\\AppData\\Local\\Programs\\Python\\Python311\\Lib\\site-packages\\mysql_mcp_server",
        "run",
        "mysql_mcp_server"
      ],
      "env": {
        "MYSQL_HOST": "localhost",
        "MYSQL_PORT": "3306",
        "MYSQL_USER": "root",
        "MYSQL_PASSWORD": "root",
        "MYSQL_DATABASE": "test"
      }
    }
  }
}

Verify the paths to uv.exe and the mysql_mcp_server package, and double-check all of the MySQL connection settings.


5. Simple Example

The example will use MCP to interact with a MySQL database.

@SpringBootApplication(scanBasePackages = "org.openwes")
@EnableDiscoveryClient
public class AiApplication {

    public static void main(String[] args) {
        SpringApplication.run(AiApplication.class, args);
    }

    private String userInput = "show all tables";

    @Bean
    public CommandLineRunner predefinedQuestions(ChatClient.Builder chatClientBuilder, ToolCallbackProvider tools,
                                                 ConfigurableApplicationContext context) {
        return args -> {
            var chatClient = chatClientBuilder
                    .defaultTools(tools)
                    .build();

            System.out.println("\n>>> QUESTION: " + userInput);
            System.out.println("\n>>> ASSISTANT: " + chatClient.prompt(userInput).call().content());

            context.close();
        };
    }
}

We then see logs showing the natural-language question show all tables being turned into an execute_sql tool call:

received: 2025-03-27 09:21:19,799 - mysql_mcp_server - INFO - Listing tools...

>>> QUESTION: show all tables

received: 2025-03-27 09:21:20,602 - mysql_mcp_server - INFO - Calling tool: execute_sql with arguments: {'query': 'show all tables'}

>>> ASSISTANT: Here are all the table names returned after running the `SHOW TABLES` command on the MySQL server:

- a_api
- a_api_config
- a_api_key
- a_api_log
- d_domain_event
- e_container_task
- e_container_task_and_business_task_relation
- e_ems_location_config
- l_change_log
- l_change_log_lock
...

6. Benefits of Spring AI MCP

  • Declarative Tool Registration: Simplify integration using annotations instead of manual SDK configurations.
  • Unified Protocol: Eliminate data source fragmentation with standardized MCP communication.
  • Scalability: Add new tools (e.g., Meilisearch, Git) without disrupting existing workflows.

7. Conclusion

By combining Spring AI’s dependency management with MCP’s protocol standardization, developers can rapidly build enterprise-grade AI applications. For advanced use cases, explore hybrid architectures where MCP servers handle both real-time data and batch processing.


This article synthesizes the latest MCP advancements with Spring AI. For full code samples, refer to the linked sources.

The code is available on GitHub: GitHub - jingsewu/open-wes