Embeddings

Generate vector representations of text. Use embeddings for semantic search, RAG pipelines, clustering, and classification.

Endpoint

POST https://backend.sovereigneg.com/v1/embeddings

Request

Parameter	Type	Required	Description
`model`	string	Yes	Embedding model ID
`input`	string/array	Yes	Text to embed (string or array of strings)

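from openai import OpenAI

# Point the OpenAI SDK at this API by overriding the base URL.
client = OpenAI(
    api_key="sk-...",
    base_url="https://backend.sovereigneg.com/v1"
)

# Embed a single string.
response = client.embeddings.create(
    model="Qwen/Qwen3-Embedding-0.6B",
    input="What is the meaning of life?"
)

# One result per input; a single string yields exactly one entry.
embedding = response.data[0].embedding
print(f"Dimension: {len(embedding)}")  # 1024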

Response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0091, 0.0152, ...]
    }
  ],
  "model": "Qwen/Qwen3-Embedding-0.6B",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
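
If you call the endpoint without the SDK, the payload above can be read with the standard library alone. The following is a minimal sketch: the helper name `extract_embeddings` is illustrative and not part of the API, and the sample values are made up to match the shape of the response shown above.

```python
import json


def extract_embeddings(payload: dict) -> list[list[float]]:
    """Return the embedding vectors, ordered by each item's `index` field."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Sample payload with the same shape as the response above (values are made up).
sample = json.loads("""
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 1, "embedding": [0.5, 0.25]},
    {"object": "embedding", "index": 0, "embedding": [0.0023, -0.0091]}
  ],
  "model": "Qwen/Qwen3-Embedding-0.6B",
  "usage": {"prompt_tokens": 8, "total_tokens": 8}
}
""")

vectors = extract_embeddings(sample)
print(len(vectors), vectors[0])
```

Sorting by `index` keeps each vector aligned with the position of its input text, regardless of the order in which items appear in `data`.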

Batch embeddings

Embed multiple texts in one call for efficiency:

from openai import OpenAI

client = OpenAI(
    api_key="sk-...",
    base_url="https://backend.sovereigneg.com/v1"
)

# Pass a list of strings to embed them all in a single request.
response = client.embeddings.create(
    model="Qwen/Qwen3-Embedding-0.6B",
    input=[
        "First document about AI in Egypt",
        "Second document about renewable energy",
        "Third document about ancient history"
    ]
)

# item.index is the position of the corresponding text in the input list.
for item in response.data:
    print(f"Text {item.index}: {len(item.embedding)} dimensions")
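
For a corpus too large to send comfortably in one request, one option is to split the inputs into fixed-size batches and concatenate the results. The sketch below makes two assumptions: the batch size of 64 is an arbitrary illustration, not a documented limit of this API, and `embed_fn` is a hypothetical callable that wraps `client.embeddings.create` and returns one vector per input, in input order. A stand-in function is used here so the example runs offline.

```python
from typing import Callable, Iterator, List, Sequence

Vector = List[float]


def chunked(texts: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `texts` with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


def embed_all(
    texts: Sequence[str],
    embed_fn: Callable[[Sequence[str]], List[Vector]],
    batch_size: int = 64,  # illustrative value, not a documented limit
) -> List[Vector]:
    """Embed `texts` batch by batch, preserving the input order."""
    vectors: List[Vector] = []
    for batch in chunked(texts, batch_size):
        vectors.extend(embed_fn(batch))
    return vectors


def fake_embed(batch: Sequence[str]) -> List[Vector]:
    """Offline stand-in for the API call: one 1-d vector per text."""
    return [[float(len(text))] for text in batch]


docs = [f"document {i}" for i in range(5)]
result = embed_all(docs, fake_embed, batch_size=2)
print(len(result))
```

With the real client, `embed_fn` could be a small wrapper that calls `client.embeddings.create(model=..., input=list(batch))` and returns `[item.embedding for item in response.data]`.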

Available models

The catalog of currently live embedding models is available from GET /v1/models?modality=embed (or filter the model list in the dashboard by Modality = embed). The set changes over time; the table below is a recent snapshot.

Model	Dimensions	Best for
`Qwen/Qwen3-Embedding-0.6B`	1024	English text, RAG, search
`sentence-transformers/all-MiniLM-L6-v2`	384	Lightweight, latency-sensitive
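
Because the table is only a snapshot, it can help to query the catalog programmatically. The sketch below builds the GET request described above and extracts model IDs from the reply. It assumes the endpoint accepts the same Bearer token as /v1/embeddings and returns an OpenAI-style list body (`{"data": [{"id": ...}]}`); the response shape is an assumption, not something this page documents, and the function names are illustrative.

```python
import json
import urllib.request

BASE_URL = "https://backend.sovereigneg.com/v1"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build the GET request for the embedding-model catalog."""
    return urllib.request.Request(
        f"{BASE_URL}/models?modality=embed",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(payload: dict) -> list[str]:
    """Extract model IDs, assuming an OpenAI-style {"data": [{"id": ...}]} body."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_embedding_model_ids(api_key: str) -> list[str]:
    """Call the catalog endpoint and return the live embedding model IDs."""
    with urllib.request.urlopen(build_models_request(api_key), timeout=10) as resp:
        return model_ids(json.load(resp))


# Offline check against a made-up payload in the assumed shape.
sample = {
    "object": "list",
    "data": [
        {"id": "Qwen/Qwen3-Embedding-0.6B"},
        {"id": "sentence-transformers/all-MiniLM-L6-v2"},
    ],
}
print(model_ids(sample))
```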

cURL quick-test

The fastest way to confirm an embedding model is live and your API key is wired correctly:

curl -sS https://backend.sovereigneg.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3-Embedding-0.6B",
    "input": "Cairo is the capital of Egypt.",
    "encoding_format": "float"
  }' | jq '{dim: (.data[0].embedding | length), tokens: .usage.prompt_tokens}'
# expect: { "dim": 1024, "tokens": <small int> }

Calling this endpoint with a non-embedding model ID returns HTTP 400 with error.code = "model_not_supported_on_endpoint". This is the catalog contract: chat models cannot be called against /v1/embeddings, and vice versa.
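
Client code can detect this case explicitly and surface a clearer message than a generic 400. The sketch below works on a raw status code and response body. It assumes the code sits in a top-level `error` object, as the `error.code` notation suggests; any other fields in the error body are not documented here, and the helper name is illustrative.

```python
import json

WRONG_ENDPOINT_CODE = "model_not_supported_on_endpoint"


def is_wrong_endpoint_error(status: int, body: str) -> bool:
    """Return True if the response signals a model/endpoint mismatch."""
    if status != 400:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    # Assumed shape: {"error": {"code": "..."}}
    error = payload.get("error") if isinstance(payload, dict) else None
    return isinstance(error, dict) and error.get("code") == WRONG_ENDPOINT_CODE


# Made-up error body in the assumed shape.
sample_body = '{"error": {"code": "model_not_supported_on_endpoint"}}'
print(is_wrong_endpoint_error(400, sample_body))
```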

Use cases

Semantic search: Find documents by meaning, not just keywords
RAG pipelines: Retrieve relevant context for chat completions
Classification: Cluster or categorize text by similarity
Deduplication: Find near-duplicate content in large datasets
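
Most of these use cases reduce to comparing vectors. As a minimal sketch of semantic search, the example below ranks documents by cosine similarity to a query vector. The 3-dimensional vectors are toy values standing in for real embeddings, and the function names are illustrative; cosine similarity is a common choice here, not one this API prescribes.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec: Sequence[float],
         doc_vecs: Sequence[Sequence[float]]) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, most similar first."""
    scored = [(i, cosine_similarity(query_vec, vec))
              for i, vec in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embeddings returned by the API.
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
query = [0.9, 0.1, 0.0]

ranking = rank(query, docs)
print([index for index, _ in ranking])
```

In a real pipeline, `docs` would hold the vectors from a batch embeddings call and `query` the vector for the user's search text, both produced by the same model so that the dimensions match.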