CapyDB Extended JSON (EmbJSON) is a set of special data types that make working with AI and embeddings simple. With EmbJSON, you can store text, images, and other media in your database and have them automatically embedded for semantic search—without setting up separate vector databases or pipelines.
See how easy it is to use EmbJSON in your applications:
from capydb import EmbText, EmbImage

# Create a document with embedded fields
document = {
    "title": "My First Document",
    # EmbText automatically embeds the text for semantic search
    "description": EmbText("This is a detailed description that will be embedded for semantic search"),
    # EmbImage embeds the image data (base64 encoded)
    "thumbnail": EmbImage(
        data="base64_encoded_image_data",
        mime_type="image/jpeg"
    )
}

# Store in CapyDB; embedding and indexing happen automatically
# (`collection` is assumed to be an existing CapyDB collection handle)
collection.insert_one(document)

# Later, search semantically across all embedded fields
results = collection.find({"$semanticSearch": "design principles"})
Traditional approaches require separate systems for storing data and embeddings. EmbJSON unifies them, eliminating the need to maintain vector databases alongside document stores. Write your data once, query it semantically.
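To make the "write once, query semantically" idea concrete, here is a minimal, self-contained sketch of a store that keeps each document next to its vector. It is not CapyDB's implementation: `ToyCollection`, `toy_embed`, and `cosine` are hypothetical names, and the bag-of-words counter is only a stand-in for a real embedding model.

```python
import math
from collections import Counter


def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyCollection:
    """Hypothetical in-memory store; each document lives beside its vector."""

    def __init__(self):
        self._rows = []  # list of (document, vector) pairs

    def insert_one(self, document, embed_field):
        # Embed at write time so the caller never manages vectors directly.
        self._rows.append((document, toy_embed(document[embed_field])))

    def search(self, query, top_k=1):
        # Rank stored documents by similarity to the embedded query.
        query_vec = toy_embed(query)
        ranked = sorted(
            self._rows,
            key=lambda row: cosine(query_vec, row[1]),
            reverse=True,
        )
        return [doc for doc, _ in ranked[:top_k]]


store = ToyCollection()
store.insert_one(
    {"title": "Design notes", "description": "core design principles for the api"},
    "description",
)
store.insert_one(
    {"title": "Lunch menu", "description": "soup and sandwiches on friday"},
    "description",
)
best = store.search("design principles")
print(best[0]["title"])
```

Because the vector is computed inside `insert_one`, there is no second system to keep in sync with the document store, which is the point the paragraph above makes.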
Pro Tip: EmbJSON types accept customization options such as chunk size and embedding model. Start simple and refine these settings as your needs evolve.
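As a rough illustration of what a chunk size setting controls, the sketch below splits text into overlapping chunks before embedding. It assumes word-based chunks for readability; real chunkers often count characters or tokens instead. `chunk_text` and its parameters are hypothetical and do not reflect CapyDB's actual options.

```python
def chunk_text(text, chunk_size, overlap=0):
    """Split text into chunks of at most `chunk_size` words.

    Consecutive chunks share `overlap` words so context is not cut
    abruptly at a chunk boundary.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


chunks = chunk_text("one two three four five six seven", chunk_size=3, overlap=1)
print(chunks)
```

Smaller chunks give more precise matches but more vectors to store, while larger chunks keep more context per vector. That trade-off is why chunk size is worth tuning once the defaults stop fitting your data.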