Name: Embedding Store Manager
Version: 1.0.3
Language: Java 11
Build Tool: Gradle
License: Not Specified
Repository: GitHub - shing100/embeddingStoreManager
A Java library for efficiently caching and managing text embeddings using Elasticsearch as a persistent store. Designed to reduce API calls to embedding services by implementing a smart caching layer.
- ✅ Elasticsearch-based embedding cache with automatic index rotation
- ✅ REST API integration for embedding generation
- ✅ Configurable retention policies (default: 3 months)
- ✅ Text normalization and deduplication
- ✅ Bulk operations support
- ✅ Spring Boot integration ready
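The normalization and deduplication feature can be pictured as follows: normalize the text, hash it with SHA-256, and use the hash as the document ID so that equivalent inputs share one cache entry. This is a minimal sketch, not the library's code. The class name `TextKeySketch` and the normalization rules (trim, collapse whitespace) are assumptions; the library's own `normalize` and `HashGenerator` may behave differently.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, for illustration only.
class TextKeySketch {

    // Assumed normalization: trim and collapse whitespace runs to one space.
    static String normalize(String text) {
        return text.trim().replaceAll("\\s+", " ");
    }

    // SHA-256 of the UTF-8 bytes, rendered as 64 lowercase hex characters.
    static String sha256Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    // Cache ID: hash of the normalized text, so equivalent inputs collide on purpose.
    static String cacheId(String text) {
        return sha256Hex(normalize(text));
    }

    public static void main(String[] args) {
        // Both lines print the same ID because the inputs normalize to "hello world".
        System.out.println(cacheId("  hello   world "));
        System.out.println(cacheId("hello world"));
    }
}
```

Hashing the normalized form rather than the raw input is what makes deduplication work: differences in whitespace alone do not produce a second cache entry.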
embeddingStoreManager/
├── 📄 README.md                                  # Korean installation guide
├── 📄 CLAUDE.md                                  # AI assistant guidance
├── 📄 PROJECT_INDEX.md                           # This file
├── 🔧 build.gradle                               # Build configuration
├── 🔧 settings.gradle                            # Project settings
├── 🔧 jitpack.yml                                # JitPack configuration
├── 📦 gradle/                                    # Gradle wrapper
│   └── wrapper/
│       ├── gradle-wrapper.jar
│       └── gradle-wrapper.properties
├── 🔨 gradlew                                    # Unix build script
├── 🔨 gradlew.bat                                # Windows build script
└── 📂 src/
    ├── main/
    │   ├── java/com/kingname/embeddingstoremanager/
    │   │   ├── 🎯 Core Classes/
    │   │   │   ├── EmbeddingCacheManager.java            # Main orchestrator
    │   │   │   ├── EmbeddingCacheManagerConfig.java      # Configuration
    │   │   │   └── HashGenerator.java                    # SHA-256 hashing
    │   │   ├── 📦 Storage Layer/
    │   │   │   ├── EmbeddingCacheStore.java              # Storage interface
    │   │   │   ├── ESEmbeddingCacheStore.java            # Elasticsearch impl
    │   │   │   └── ElasticSearchClientBuilder.java       # ES client factory
    │   │   ├── 🔌 Generation Layer/
    │   │   │   ├── EmbeddingGenerator.java               # Generator interface
    │   │   │   └── RestEmbeddingGenerator.java           # REST API impl
    │   │   ├── ⚠️ Exception Handling/
    │   │   │   ├── EmbeddingCacheManagerException.java
    │   │   │   ├── ElasticSearchClientException.java
    │   │   │   ├── EmbeddingCacheStoreException.java
    │   │   │   ├── EmbeddingGeneratorException.java
    │   │   │   ├── HashGeneratorException.java
    │   │   │   └── RestEmbeddingGeneratorException.java
    │   │   └── 📊 Value Objects/
    │   │       ├── CachedEmbeddingDocument.java          # Base document
    │   │       ├── EsCachedEmbeddingDocument.java        # ES document
    │   │       ├── EmbeddingData.java                    # API data
    │   │       ├── EmbeddingResponse.java                # API response
    │   │       └── EmbeddingUsage.java                   # Usage metrics
    │   └── resources/
    │       ├── mappings.json                             # ES index mappings
    │       └── settings.json                             # ES index settings
    └── test/
        └── java/.../EmbeddingStoreManagerApplicationTests.java
Purpose: Main entry point for embedding cache operations
public class EmbeddingCacheManager {
    // Constructors
    EmbeddingCacheManager(EmbeddingCacheManagerConfig)
    EmbeddingCacheManager(EmbeddingCacheManagerConfig, EmbeddingCacheStore)
    EmbeddingCacheManager(EmbeddingCacheManagerConfig, EmbeddingGenerator)

    // Core Methods
    List<Double> getEmbedding(String text)               // Get with cache
    List<Double> getEmbeddingFromCache(String text)      // Cache only
    List<Double> generateEmbedding(String text)          // Generate new
    void storeEmbedding(String text, List<Double>)       // Store single
    void storeEmbeddings(List<CachedEmbeddingDocument>)  // Store bulk
    String normalize(String text)                        // Text normalization
}
Purpose: Configuration management with builder pattern
@Builder
public class EmbeddingCacheManagerConfig {
    List<String> elasticSearchCacheHosts   // ES cluster hosts
    Integer elasticSearchCachePort         // ES port
    String elasticSearchCacheAliasName     // Index alias
    Integer retentionMonth                 // Default: 3
    String modelName                       // Embedding model
    String embeddingApiUrl                 // API endpoint
    Integer maxLength                      // Default: 3000
}
public interface EmbeddingCacheStore {
    List<Double> getCachedEmbedding(String text)
    void storeEmbedding(String id, String text, List<Double> embedding)
    void storeEmbeddings(List<CachedEmbeddingDocument> documents)
}
public interface EmbeddingGenerator {
    List<Double> generateEmbedding(String text)
}
repositories {
    maven { url 'https://jitpack.io' }
    mavenCentral()
}
dependencies {
    implementation 'com.github.shing100:embeddingStoreManager:1.0.3'
}
@Configuration
public class EmbeddingConfig {
    @Bean
    public EmbeddingCacheManager embeddingCacheManager() {
        return new EmbeddingCacheManager(
            EmbeddingCacheManagerConfig.builder()
                .elasticSearchCacheHosts(List.of("localhost"))
                .elasticSearchCachePort(9200)
                .elasticSearchCacheAliasName("embeddings")
                .modelName("text-embedding-ada-002")
                .embeddingApiUrl("https://api.openai.com/v1/embeddings")
                .retentionMonth(3)   // Optional, default: 3
                .maxLength(3000)     // Optional, default: 3000
                .build()
        );
    }
}
@Service
public class EmbeddingService {
    @Autowired
    private EmbeddingCacheManager cacheManager;

    public List<Double> getTextEmbedding(String text) {
        // Automatically checks cache, generates if missing
        return cacheManager.getEmbedding(text);
    }

    public void preloadEmbeddings(List<String> texts) {
        List<CachedEmbeddingDocument> documents = texts.stream()
            .map(text -> CachedEmbeddingDocument.builder()
                .text(text)
                // Generate through the manager; this service defines no generateEmbedding of its own
                .embedding(cacheManager.generateEmbedding(text))
                .build())
            .collect(Collectors.toList());
        // Store all generated embeddings in one bulk call
        cacheManager.storeEmbeddings(documents);
    }
}
| Library | Version | Purpose |
|---|---|---|
| gson | 2.8.2 | JSON serialization |
| guava | 31.1-jre | Hashing utilities |
| jackson-databind | 2.14.2 | JSON processing |
| elasticsearch-java | 8.8.2 | ES client |
| elasticsearch-rest-client | 8.8.2 | REST client |
| httpclient | 4.5.13 | HTTP operations |
| commons-lang3 | 3.5 | String utilities |
| lombok | 1.18.10 | Code generation |
| Library | Version | Purpose |
|---|---|---|
| junit | 4.12 | Unit testing |
| assertj-core | 3.24.2 | Test assertions |
- Strategy Pattern
  - EmbeddingGenerator interface allows swapping generation strategies
  - EmbeddingCacheStore interface enables different storage backends
- Builder Pattern
  - EmbeddingCacheManagerConfig uses Lombok's @Builder
  - Provides flexible configuration initialization
- Repository Pattern
  - ESEmbeddingCacheStore abstracts Elasticsearch operations
  - Hides persistence complexity from business logic
- Dependency Injection
  - Constructor-based DI for all components
  - Spring Boot compatible design
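To illustrate the strategy pattern listed above, here is a sketch of a second storage backend that could stand in for Elasticsearch, for example in unit tests. The nested `EmbeddingCacheStore` is a local, trimmed copy of the library interface (the bulk method is omitted) so that the example compiles on its own. `InMemoryEmbeddingCacheStore` and `StrategySketch` are hypothetical names, and returning `null` on a miss is an assumption about the contract.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StrategySketch {

    // Local, trimmed copy of the store interface so the sketch is self-contained.
    interface EmbeddingCacheStore {
        List<Double> getCachedEmbedding(String text);
        void storeEmbedding(String id, String text, List<Double> embedding);
    }

    // Hypothetical backend: keeps embeddings in a HashMap instead of Elasticsearch.
    static class InMemoryEmbeddingCacheStore implements EmbeddingCacheStore {
        private final Map<String, List<Double>> byText = new HashMap<>();

        @Override
        public List<Double> getCachedEmbedding(String text) {
            return byText.get(text); // null on a miss (assumed convention)
        }

        @Override
        public void storeEmbedding(String id, String text, List<Double> embedding) {
            // The id is unused here because lookups are keyed by text.
            byText.put(text, List.copyOf(embedding));
        }
    }

    public static void main(String[] args) {
        EmbeddingCacheStore store = new InMemoryEmbeddingCacheStore();
        store.storeEmbedding("id-1", "hello", List.of(0.1, 0.2));
        System.out.println(store.getCachedEmbedding("hello"));   // the stored vector
        System.out.println(store.getCachedEmbedding("missing")); // null
    }
}
```

A store like this could be passed to the `EmbeddingCacheManager(EmbeddingCacheManagerConfig, EmbeddingCacheStore)` constructor, provided it implements the library's full interface, including `storeEmbeddings`.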
graph LR
A[Client Request] --> B[EmbeddingCacheManager]
B --> C{Cache Check}
C -->|Hit| D[Return Cached]
C -->|Miss| E[RestEmbeddingGenerator]
E --> F[External API]
F --> G[Store in Cache]
G --> H[ESEmbeddingCacheStore]
H --> I[Elasticsearch]
G --> D
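The flow in the diagram is a cache-aside lookup: check the cache, call the generator only on a miss, store the result, then return it. The sketch below uses a `HashMap` in place of Elasticsearch and a plain `Function` in place of `RestEmbeddingGenerator`. `CacheAsideSketch` and its `generatorCalls` counter are hypothetical and exist only to make the behaviour observable.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class CacheAsideSketch {
    private final Map<String, List<Double>> cache = new HashMap<>(); // stand-in for Elasticsearch
    private final Function<String, List<Double>> generator;          // stand-in for the REST generator
    int generatorCalls = 0;                                          // counts simulated API calls

    CacheAsideSketch(Function<String, List<Double>> generator) {
        this.generator = generator;
    }

    List<Double> getEmbedding(String text) {
        List<Double> cached = cache.get(text);
        if (cached != null) {
            return cached;                           // hit: no API call
        }
        generatorCalls++;
        List<Double> fresh = generator.apply(text);  // miss: call the generator
        cache.put(text, fresh);                      // store before returning
        return fresh;
    }

    public static void main(String[] args) {
        // Fake generator: a one-element vector holding the text length.
        CacheAsideSketch manager = new CacheAsideSketch(t -> List.of((double) t.length()));
        manager.getEmbedding("hello");
        manager.getEmbedding("hello");
        System.out.println(manager.generatorCalls); // prints 1
    }
}
```

The second call for the same text is served from the cache, which is how the library reduces calls to the embedding service.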
- Pattern: {alias}-yyyy-MM (monthly rotation)
- Retention: Configurable (default: 3 months)
- Alias Management: Automatic rollover
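The naming pattern and retention window above can be worked through with `java.time`. This is a sketch under stated assumptions: `IndexNamingSketch` is a hypothetical class, and it assumes the retention window counts the current month as the first of the retained months. The library's actual rollover logic may differ.

```java
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

class IndexNamingSketch {
    private static final DateTimeFormatter MONTH = DateTimeFormatter.ofPattern("yyyy-MM");

    // Builds "{alias}-yyyy-MM" for the given month.
    static String indexName(String alias, YearMonth month) {
        return alias + "-" + month.format(MONTH);
    }

    // Index names inside the retention window, newest first.
    // Assumption: the current month counts as the first retained month.
    static List<String> retainedIndices(String alias, YearMonth current, int retentionMonths) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < retentionMonths; i++) {
            names.add(indexName(alias, current.minusMonths(i)));
        }
        return names;
    }

    public static void main(String[] args) {
        // prints [embeddings-2024-01, embeddings-2023-12, embeddings-2023-11]
        System.out.println(retainedIndices("embeddings", YearMonth.of(2024, 1), 3));
    }
}
```

Under this reading, any index behind the alias whose name is not in the returned list would be a candidate for deletion.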
{
  "id": "sha256_hash_of_text",
  "text": "normalized_input_text",
  "embedding": [0.123, -0.456, ...],
  "model": "text-embedding-ada-002",
  "timestamp": "2024-01-15T10:30:00Z"
}
# Clean build
./gradlew clean build
# Run tests
./gradlew test
# Generate JAR
./gradlew jar
# Skip tests
./gradlew build -x test
# Publish to local Maven
./gradlew publishToMavenLocal
- Group: com.kingname
- Artifact: embeddingStoreManager
- Java Version: 11
- Gradle Version: 7.x
# jitpack.yml
jdk:
  - openjdk11
- Lines of Code: ~1,500
- Classes: 19
- Methods: 75
- Cyclomatic Complexity: 3.2 avg
- JavaDoc: 0% ⚠️
- Inline Comments: 5%
- README: Basic (Korean)
- ⚠️ No input validation on API URLs
- ⚠️ No authentication mechanism
- ⚠️ Hardcoded timeouts
- ✅ SHA-256 hashing for IDs
- Cache Lookup: O(1) with ES
- Embedding Generation: ~100-500ms (API dependent)
- Bulk Operations: Supported
- Connection Timeouts: 2s request, 3s socket
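Since the timeouts are reported as hardcoded, one possible direction is to pass them in as configuration. The sketch below uses the JDK 11 `java.net.http` client rather than the Apache `httpclient` the library depends on, so it is a swapped-in technique and not the library's code. `TimeoutConfigSketch` is a hypothetical class, and mapping the 2s and 3s values onto the JDK's connect and per-request timeouts is an assumption; the two clients do not define timeouts in exactly the same way.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

class TimeoutConfigSketch {
    final Duration connectTimeout; // time allowed to establish the connection
    final Duration requestTimeout; // time allowed to wait for the response

    TimeoutConfigSketch(Duration connectTimeout, Duration requestTimeout) {
        this.connectTimeout = connectTimeout;
        this.requestTimeout = requestTimeout;
    }

    // Client with a configurable connect timeout instead of a constant.
    HttpClient newClient() {
        return HttpClient.newBuilder()
                .connectTimeout(connectTimeout)
                .build();
    }

    // POST request with a configurable per-request timeout; nothing is sent here.
    HttpRequest newEmbeddingRequest(String apiUrl, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(apiUrl))
                .timeout(requestTimeout)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        TimeoutConfigSketch cfg =
                new TimeoutConfigSketch(Duration.ofSeconds(2), Duration.ofSeconds(3));
        // Placeholder URL; the request is built but never sent.
        HttpRequest request = cfg.newEmbeddingRequest("https://example.invalid/v1/embeddings", "{}");
        System.out.println(cfg.newClient().connectTimeout());
        System.out.println(request.timeout());
    }
}
```

In the library itself, these two values could plausibly become optional fields on `EmbeddingCacheManagerConfig`, with the current 2s and 3s as defaults.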
- No Logging Framework - Add SLF4J/Logback
- Resource Leaks - HTTP clients not properly closed
- No Tests - Missing unit and integration tests
- Add comprehensive JavaDoc documentation
- Implement connection pooling for HTTP clients
- Add circuit breaker for API calls
- Support for multiple embedding models
- Add health check endpoints
- Implement metrics collection
- Support for vector databases (Pinecone, Weaviate)
- Async/batch processing capabilities
- Compression for large embeddings
- Multi-tenancy support
- REST API wrapper for the library
- GitHub: shing100/embeddingStoreManager
- JitPack: com.github.shing100/embeddingStoreManager
- Issues: Report bugs via GitHub Issues
- Fork the repository
- Create a feature branch
- Add tests for new features
- Ensure all tests pass
- Submit a pull request
- Latest stable release on JitPack
- Build configuration updates
- Bug fixes
- See GitHub releases for complete history
Generated: 2024 | Last Updated: Check GitHub for latest changes