Embedding
General Note
Rely on an external vector store for retrieval; OpenAI's hosted vector stores do not scale as a general-purpose retrieval backend.
Generate embeddings with the embeddings create endpoint,
but store and query them externally behind your own vector indices.
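
A minimal sketch of that split, assuming text-embedding-3-small and a plain numpy matrix standing in for a real vector index (FAISS, pgvector, Pinecone, etc.):

import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts, model="text-embedding-3-small"):
    # One API call can embed a whole batch of strings.
    res = client.embeddings.create(model=model, input=texts)
    return np.array([d.embedding for d in res.data], dtype=np.float32)

# "Index" the corpus once and keep the matrix in your own store.
docs = ["refund policy", "shipping times", "how to reset a password"]
doc_vecs = embed(docs)

# Query: embed the question, rank documents by cosine similarity.
query_vec = embed(["when do I get my money back?"])[0]
sims = doc_vecs @ query_vec / (
    np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
)
print(docs[int(np.argmax(sims))])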
Embeddings API
Create Embedding
from openai import OpenAI

client = OpenAI()

# Single string input (input takes text directly; the endpoint does not read uploaded files)
res = client.embeddings.create(
    model="text-embedding-ada-002",
    input="my name is Sam",
    encoding_format="float",
)

# Batch input: a list of strings, embedded in one call
res = client.embeddings.create(
    model="text-embedding-ada-002",
    input=["my name is Sam", "my dog is Rex"],
    encoding_format="float",
)
Response from the embeddings create call:
{
    "object": "list",
    "data": [
        {
            "object": "embedding",
            "embedding": [
                0.0023064255,
                -0.009327292,
                ... (1536 floats total for ada-002)
                -0.0028842222
            ],
            "index": 0
        }
    ],
    "model": "text-embedding-ada-002",
    "usage": {
        "prompt_tokens": 8,
        "total_tokens": 8
    }
}

# The Python SDK returns a response object, not a dict, so use attribute access:
data_vector = res.data[0].embedding
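
When input is a list, data holds one entry per string and each entry's index gives its position in the batch. A small illustration, assuming res came from the batch call above:

# e.g. input=["my name is Sam", "my dog is Rex"]
vectors = [item.embedding for item in sorted(res.data, key=lambda d: d.index)]
print(len(vectors), len(vectors[0]))  # 2 vectors, 1536 floats each for ada-002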
Available Models (2025)
- text-embedding-3-small
- text-embedding-3-large
- text-embedding-ada-002  # deprecated
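
The 3-series models also accept an optional dimensions parameter to shorten the returned vectors (a storage/quality trade-off); a quick sketch assuming text-embedding-3-large, whose default is 3072 dimensions:

res = client.embeddings.create(
    model="text-embedding-3-large",
    input="my name is Sam",
    dimensions=256,  # truncate from the default 3072 dimensions
)
print(len(res.data[0].embedding))  # 256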
Vector Stores
OpenAI also has its own vector stores.
- Generally no need to use them if you already run your own vector database.
- Mostly used with the Assistants API (file_search).
# Create a vector store
vector_store = client.vector_stores.create(
    name="Support FAQ",
)

# Upload a file into it and wait for processing to finish
client.vector_stores.files.upload_and_poll(
    vector_store_id=vector_store.id,
    file=open("customer_policies.txt", "rb"),
)
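
Once populated, the store is normally consumed through the file_search tool rather than queried directly; a sketch assuming the Responses API and a gpt-4o-mini model:

# file_search retrieves relevant chunks from the vector store behind the scenes.
response = client.responses.create(
    model="gpt-4o-mini",
    input="What is the refund window in our customer policies?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": [vector_store.id],
    }],
)
print(response.output_text)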