Embeddings

Generate vector representations of text. Use embeddings for semantic search, RAG pipelines, clustering, and classification.

Endpoint

POST https://sovereigneg.com/v1/embeddings

Request

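| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Embedding model ID |
| input | string or array | Yes | Text to embed (a string or an array of strings) |

import os
from openai import OpenAI

# Assumes the OpenAI Python SDK, pointed at this API, with the key in OPENAI_API_KEY.
client = OpenAI(base_url="https://sovereigneg.com/v1", api_key=os.environ["OPENAI_API_KEY"])

response = client.embeddings.create(
    model="Qwen/Qwen3-Embedding-0.6B",
    input="What is the meaning of life?"
)

embedding = response.data[0].embedding
print(f"Dimension: {len(embedding)}")  # 1024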

Response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0091, 0.0152, ...]
    }
  ],
  "model": "Qwen/Qwen3-Embedding-0.6B",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
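
If you call the endpoint without an SDK, the response body can be unpacked with the standard library alone. The following is a minimal sketch, assuming the body has the shape shown above; the helper name `vectors_from_response` and the shortened sample vector are illustrative and not part of the API.

```python
import json


def vectors_from_response(body: str) -> tuple[list[list[float]], int]:
    """Return (vectors ordered by index, prompt token count) from a raw JSON body."""
    payload = json.loads(body)
    # Sort by "index" so the vectors line up with the order of the inputs.
    items = sorted(payload["data"], key=lambda item: item["index"])
    vectors = [item["embedding"] for item in items]
    return vectors, payload["usage"]["prompt_tokens"]


# Shortened sample body (hypothetical values); a real vector has as many
# entries as the model's dimension.
sample = json.dumps({
    "object": "list",
    "data": [{"object": "embedding", "index": 0, "embedding": [0.0023, -0.0091, 0.0152]}],
    "model": "Qwen/Qwen3-Embedding-0.6B",
    "usage": {"prompt_tokens": 8, "total_tokens": 8},
})

vectors, tokens = vectors_from_response(sample)
print(len(vectors), len(vectors[0]), tokens)  # prints: 1 3 8
```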

Batch embeddings

Embed multiple texts in one call for efficiency:

response = client.embeddings.create(
    model="Qwen/Qwen3-Embedding-0.6B",
    input=[
        "First document about AI in Egypt",
        "Second document about renewable energy",
        "Third document about ancient history"
    ]
)

# item.index identifies which input text each embedding belongs to.
for item in response.data:
    print(f"Text {item.index}: {len(item.embedding)} dimensions")
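
For corpora too large to send in one request, one option is to split the texts into fixed-size batches and concatenate the results. The sketch below assumes nothing about server-side limits: the default `batch_size` of 64 is a hypothetical value, and `chunked`, `embed_all`, and `fake_embed_batch` are illustrative names. The stand-in embedder lets the example run without network access.

```python
from typing import Callable, Iterator, Sequence


def chunked(texts: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `texts`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


def embed_all(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], list[list[float]]],
    batch_size: int = 64,  # hypothetical default; the API's real limit is not stated here
) -> list[list[float]]:
    """Embed `texts` batch by batch, preserving input order."""
    vectors: list[list[float]] = []
    for batch in chunked(texts, batch_size):
        result = embed_batch(batch)
        if len(result) != len(batch):
            raise RuntimeError("embedder returned a different number of vectors than inputs")
        vectors.extend(result)
    return vectors


def fake_embed_batch(batch: Sequence[str]) -> list[list[float]]:
    """Stand-in embedder: one 1-dimensional vector per text (its length)."""
    return [[float(len(text))] for text in batch]


docs = ["a", "bb", "ccc"]
print(embed_all(docs, fake_embed_batch, batch_size=2))  # prints: [[1.0], [2.0], [3.0]]
```

With the SDK client from the examples above, `embed_batch` could wrap `client.embeddings.create(model=..., input=list(batch))` and return `[item.embedding for item in response.data]`.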

Available models

The catalog of currently-live embedding models is at GET /v1/models?modality=embed (or filter the model list in the dashboard by Modality = embed). The set evolves; the table below is a recent snapshot.

| Model | Dimensions | Best for |
| --- | --- | --- |
| Qwen/Qwen3-Embedding-0.6B | 1024 | English text, RAG, search |
| sentence-transformers/all-MiniLM-L6-v2 | 384 | Lightweight, latency-sensitive |
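
Because the table is only a snapshot, it may be worth reading the live catalog at startup. The sketch below builds the GET /v1/models?modality=embed request and extracts model IDs from a response body. Two things are assumptions rather than documented behavior: that the catalog accepts the same Bearer key as the embeddings endpoint, and that it returns an OpenAI-style list of the form {"data": [{"id": ...}]}. The helper names are illustrative.

```python
import json
import urllib.request

BASE_URL = "https://sovereigneg.com/v1"


def models_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the catalog request for embedding models."""
    return urllib.request.Request(
        f"{BASE_URL}/models?modality=embed",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        method="GET",
    )


def model_ids(body: str) -> list[str]:
    """Extract model IDs, assuming an OpenAI-style {"data": [{"id": ...}]} body."""
    return [entry["id"] for entry in json.loads(body)["data"]]


# Hypothetical response body, using the two models from the snapshot table.
sample_body = json.dumps({
    "data": [
        {"id": "Qwen/Qwen3-Embedding-0.6B"},
        {"id": "sentence-transformers/all-MiniLM-L6-v2"},
    ]
})
print(model_ids(sample_body))

# To query the live catalog (needs network access and a valid key):
# with urllib.request.urlopen(models_request("YOUR_KEY")) as resp:
#     print(model_ids(resp.read().decode("utf-8")))
```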

cURL quick-test

The fastest way to confirm an embedding model is live and your API key is wired correctly:

curl https://sovereigneg.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3-Embedding-0.6B",
    "input": "Cairo is the capital of Egypt.",
    "encoding_format": "float"
  }' | jq '{dim: (.data[0].embedding | length), tokens: .usage.prompt_tokens}'
# expect: { "dim": 1024, "tokens": <small int> }

Calling this endpoint with a non-embedding model ID returns HTTP 400 with error.code = "model_not_supported_on_endpoint". This is the catalog contract: chat models cannot be called against /v1/embeddings, and vice versa.
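
Client code can recognize this failure and report a clearer message than a generic 400. The following is a minimal sketch, assuming the error body is JSON with a top-level "error" object carrying the "code" field; the helper name and the "message" field in the sample are hypothetical.

```python
import json

WRONG_ENDPOINT = "model_not_supported_on_endpoint"


def is_wrong_endpoint_error(status: int, body: str) -> bool:
    """True if the response says the model cannot be used on this endpoint."""
    if status != 400:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False  # not a JSON error body, so not the error we are looking for
    error = payload.get("error") if isinstance(payload, dict) else None
    return isinstance(error, dict) and error.get("code") == WRONG_ENDPOINT


# Hypothetical error body; only error.code is taken from the documentation.
sample_error = json.dumps({"error": {"code": WRONG_ENDPOINT, "message": "example"}})

print(is_wrong_endpoint_error(400, sample_error))  # prints: True
print(is_wrong_endpoint_error(200, sample_error))  # prints: False
```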

Use cases

  • Semantic search: Find documents by meaning, not just keywords
  • RAG pipelines: Retrieve relevant context for chat completions
  • Classification: Cluster or categorize text by similarity
  • Deduplication: Find near-duplicate content in large datasets
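
Semantic search and deduplication both reduce to comparing vectors. The sketch below uses cosine similarity, a common choice that this page does not prescribe, on toy 3-dimensional vectors standing in for real embeddings. The function names and the 0.95 duplicate threshold are illustrative assumptions; a suitable threshold depends on the model and the data.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query: list[float], docs: list[list[float]], top_k: int = 3) -> list[tuple[int, float]]:
    """Return (document index, similarity) pairs, most similar first."""
    scored = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(docs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def near_duplicates(docs: list[list[float]], threshold: float = 0.95) -> list[tuple[int, int]]:
    """Return index pairs whose similarity is at or above the threshold."""
    pairs = []
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine_similarity(docs[i], docs[j]) >= threshold:
                pairs.append((i, j))
    return pairs


# Toy vectors standing in for embeddings returned by the API.
docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
query = [1.0, 0.0, 0.0]

print([i for i, _ in rank(query, docs, top_k=2)])  # prints: [0, 1]
print(near_duplicates(docs))                       # prints: [(0, 1)]
```

The pairwise loop in `near_duplicates` is quadratic in the number of documents, so for large datasets an approximate nearest-neighbor index is usually a better fit.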