Quick Start
Get started with Nebul's Private inference-api in minutes. The Private inference-api provides instant access to the latest language and other AI models, including Mistral Large 3, OpenAI's GPT-OSS, DeepSeek R1, Qwen3, and more, all through a single OpenAI-compatible interface. Deploy AI capabilities without the complexity of managing infrastructure, while maintaining full privacy and security for your data.
Prerequisites
Before you begin, you'll need:
- A Nebul AI Studio account with API access
- An API key (generated from your Nebul AI Studio)
- Python 3.9 or higher, if you plan to use the Python examples
Installation
- Python
- cURL
```shell
pip install openai
```

No installation required; cURL comes pre-installed on most systems.
The Nebul inference-api is fully compatible with the OpenAI SDK, so you can use the same library you're already familiar with.
Available Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /v1/chat/completions | POST | Chat-based completions (recommended) |
| /v1/completions | POST | Legacy text completions |
| /v1/responses | POST | Advanced responses with async support |
| /v1/embeddings | POST | Generate vector embeddings |
| /v1/rerank | POST | Rerank documents by relevance |
| /v1/models | GET | List available models |
| /v1/audio/transcriptions | POST | Speech to text |
| /v1/audio/speech | POST | Text to speech |
| /v1/ocr | POST | Optical character recognition |
| /v1/images/generations | POST | Image generation |
| /v1/images/edits | POST | Image editing |
| /v1/images/variations | POST | Image variations |
List Available Models
Discover which models are available to your account:
- Python
- cURL
```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key-here",
    base_url="https://api.inference.nebul.io/v1",
)

models = client.models.list()
for model in models.data:
    print(model.id)
```
```shell
curl https://api.inference.nebul.io/v1/models \
  -H "Authorization: Bearer sk-your-api-key-here"
```
Your First LLM Request
Here's a simple example to get you started:
- Python
- cURL
```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key-here",
    base_url="https://api.inference.nebul.io/v1",
)

# Replace openai/gpt-oss-120b with any model ID returned by the models endpoint
response = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Why is privacy important in the age of AI?"}
    ],
)

print(response.choices[0].message.content)
```
```shell
curl https://api.inference.nebul.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-api-key-here" \
  -d '{
    "model": "openai/gpt-oss-120b",
    "messages": [
      {"role": "user", "content": "Why is privacy important in the age of AI?"}
    ]
  }'
```
Replace `sk-your-api-key-here` with your actual API key from Nebul AI Studio.
API keys always start with the `sk-` prefix (short for "secret key"). Store them securely in environment variables rather than hardcoding them in source code.
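Loading the key from an environment variable can be done with a small helper. This is a minimal sketch; the variable name `NEBUL_API_KEY` and the `load_api_key` helper are illustrative choices, not a documented convention.

```python
import os


def load_api_key(var_name: str = "NEBUL_API_KEY") -> str:
    """Read the API key from the environment, failing fast if it is missing.

    The variable name is a hypothetical convention; use whatever naming
    scheme your deployment already follows.
    """
    key = os.environ.get(var_name, "")
    if not key.startswith("sk-"):
        raise RuntimeError(f"Set {var_name} to your Nebul API key (starts with sk-)")
    return key
```

The loaded key can then be passed to the client, e.g. `OpenAI(api_key=load_api_key(), base_url="https://api.inference.nebul.io/v1")`.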
Error Handling
API errors follow a standard format:
```json
{
  "error": {
    "message": "Invalid API key provided",
    "type": "auth_error",
    "param": null,
    "code": "401"
  }
}
```
Common error codes:
| Code | Meaning |
|---|---|
| 401 | Invalid or missing API key |
| 403 | Model not available for your account |
| 429 | Rate limit exceeded (see limits) |
| 500 | Server error — retry the request |
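Since 429 and 500 are the retryable codes in the table above, a client-side exponential backoff loop is a common pattern. This is a generic sketch, not part of the Nebul API itself; it only assumes that failures raise an exception carrying a `status_code` attribute, as the OpenAI SDK's `APIStatusError` does.

```python
import time

RETRYABLE_STATUS_CODES = {429, 500}


def with_retries(call, max_attempts=4, base_delay=1.0):
    """Run call(), retrying with exponential backoff on retryable errors.

    `call` is any zero-argument function that performs the API request and
    raises an exception with a `status_code` attribute on failure.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            if status not in RETRYABLE_STATUS_CODES or attempt == max_attempts - 1:
                raise  # non-retryable error, or retries exhausted
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

Usage with the earlier example would look like `with_retries(lambda: client.chat.completions.create(...))`.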
What's Next?
Now that you've made your first request, explore more capabilities:
- Examples - See more code examples and use cases
- Models - Browse available models and their capabilities
- API Reference - Explore the complete API documentation