API Documentation

Everything you need to integrate BgGPT into your application.

Don't have an API key? Fill out the access request form and your key will be provisioned automatically.

Getting Started

The BgGPT API is fully compatible with the OpenAI SDK. Any library or tool that supports the OpenAI Chat Completions API works with BgGPT.

Endpoint	`https://api.bggpt.ai/v1`
Model	`bggpt-gemma-3-27b-fp8`
Auth	`Authorization: Bearer YOUR_API_KEY`
Context	Up to 65,536 tokens
Quantization	FP8 dynamic (minimal quality loss, 2x faster)
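
The context limit interacts with `max_tokens`. The sketch below assumes that prompt tokens and generated tokens share the 65,536-token window, which is typical for Chat Completions servers but is not stated on this page. The helper name `clamp_max_tokens` is hypothetical, and the prompt token count has to come from elsewhere (for example, the `usage` field of an earlier response, if the server returns one).

```python
CONTEXT_WINDOW = 65_536  # context limit from the table above


def clamp_max_tokens(prompt_tokens: int, requested: int,
                     context_window: int = CONTEXT_WINDOW) -> int:
    """Return the largest max_tokens value that still fits in the window.

    Assumption: prompt and completion tokens count against the same limit.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = context_window - prompt_tokens
    if remaining <= 0:
        raise ValueError("the prompt alone fills the context window")
    return min(requested, remaining)
```

Under that assumption, a 60,000-token prompt with a requested `max_tokens` of 16384 would be clamped to 5,536.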

Install the OpenAI Python SDK:
pip install openai
Chat Completions
Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.bggpt.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="bggpt-gemma-3-27b-fp8",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Кога е основан Софийският университет?"}
    ],
    max_tokens=16384,
    temperature=0.2,
)
print(response.choices[0].message.content)
curl
curl -X POST https://api.bggpt.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "bggpt-gemma-3-27b-fp8",
    "messages": [
      {"role": "user", "content": "Кога е основан Софийският университет?"}
    ],
    "max_tokens": 16384,
    "temperature": 0.2
  }'
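
The Chat Completions format is stateless, so a multi-turn conversation is carried by resending the full message history with every request. The following is a minimal sketch of that pattern. The helper name `ask` is hypothetical, and `client` is expected to be an `OpenAI` client configured as in the Python example above.

```python
def ask(client, history, user_text, model="bggpt-gemma-3-27b-fp8", **params):
    """Send one user turn and record both sides of the exchange in history.

    history is a list of message dicts and is modified in place.
    Extra keyword arguments (max_tokens, temperature, ...) are passed through.
    """
    history.append({"role": "user", "content": user_text})
    response = client.chat.completions.create(
        model=model,
        messages=history,
        **params,
    )
    reply = response.choices[0].message.content
    # Store the assistant reply so the next request includes it as context.
    history.append({"role": "assistant", "content": reply})
    return reply


# Usage (requires a configured client and network access):
# history = [{"role": "system", "content": "You are a helpful assistant."}]
# print(ask(client, history, "Кога е основан Софийският университет?", max_tokens=1024))
# print(ask(client, history, "Summarize your previous answer in one sentence."))
```

Because the history grows with every turn, long conversations eventually approach the context limit and may need older turns trimmed or summarized.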
Streaming
Set stream=True to receive tokens as they are generated:
stream = client.chat.completions.create(
    model="bggpt-gemma-3-27b-fp8",
    messages=[
        {"role": "user", "content": "Кога е основан Софийският университет?"}
    ],
    max_tokens=1024,
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
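
Applications often need the complete text as well as the live output. The sketch below wraps the loop above in a hypothetical helper, `collect_stream`, that returns the joined text. It also skips chunks that carry no choices, which some OpenAI-compatible servers emit (for example, usage-only chunks); whether BgGPT sends such chunks is an assumption, and the guard is harmless if it does not.

```python
def collect_stream(stream, on_token=None):
    """Consume a streamed chat completion and return the full text.

    on_token, if given, is called with each text fragment as it arrives.
    """
    parts = []
    for chunk in stream:
        if not chunk.choices:  # skip chunks without choices
            continue
        text = chunk.choices[0].delta.content
        if text:  # the first and last chunks may carry no text
            parts.append(text)
            if on_token is not None:
                on_token(text)
    return "".join(parts)


# Usage with the stream created above:
# full_text = collect_stream(stream, on_token=lambda t: print(t, end="", flush=True))
```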
Tool Calling
BgGPT supports OpenAI-compatible tool calling. Define tools as JSON schemas, and the model will decide when to call them.
1. Define tools and send request
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["city"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="bggpt-gemma-3-27b-fp8",
    messages=[
        {"role": "user", "content": "What's the weather in Sofia?"}
    ],
    tools=tools,
    tool_choice="auto",
)
2. Check for tool calls
message = response.choices[0].message

if message.tool_calls:
    for tool_call in message.tool_calls:
        print(f"Function: {tool_call.function.name}")
        print(f"Arguments: {tool_call.function.arguments}")
else:
    print(message.content)
3. Send tool results back
import json

# After calling your function with the model's arguments:
tool_result = {"temperature": 22, "unit": "celsius", "condition": "sunny"}

response = client.chat.completions.create(
    model="bggpt-gemma-3-27b-fp8",
    messages=[
        {"role": "user", "content": "What's the weather in Sofia?"},
        message,  # the assistant message with tool_calls
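        # Each entry in message.tool_calls should get its own "tool" message;
        # this example answers only the first call.
        {
            "role": "tool",
            "tool_call_id": message.tool_calls[0].id,
            "content": json.dumps(tool_result),
        }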
    ],
    tools=tools,
)
print(response.choices[0].message.content)
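
Steps 2 and 3 can be combined into a small dispatcher that runs every requested tool and builds one `tool` message per call. This is a minimal sketch: `TOOL_REGISTRY` and `run_tool_calls` are hypothetical names, and `get_weather` returns placeholder data instead of querying a real weather service.

```python
import json


def get_weather(city, unit="celsius"):
    """Placeholder implementation; a real one would query a weather service."""
    return {"city": city, "temperature": 22, "unit": unit, "condition": "sunny"}


# Maps the tool names declared in `tools` to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(tool_calls, registry=TOOL_REGISTRY):
    """Execute each requested tool and return one 'tool' message per call."""
    messages = []
    for call in tool_calls:
        name = call.function.name
        if name not in registry:
            result = {"error": f"unknown tool: {name}"}
        else:
            try:
                # The model sends arguments as a JSON-encoded string.
                args = json.loads(call.function.arguments)
                result = registry[name](**args)
            except (json.JSONDecodeError, TypeError) as exc:
                # Invalid JSON, or arguments the function does not accept.
                result = {"error": f"bad arguments: {exc}"}
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(result),
        })
    return messages
```

The follow-up request would then send the original user message, the assistant `message`, and `*run_tool_calls(message.tool_calls)` as its `messages` list. Reporting errors back as tool results, instead of raising, gives the model a chance to correct its arguments.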
Supported tool_choice values:
"auto" — model decides whether to call a tool (default)
"required" — model must call at least one tool
"none" — tool calling disabled for this request
Vision
BgGPT 3.0 is multimodal and can understand images. Send images as base64 data URLs.
Base64 image
import base64

with open("photo.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="bggpt-gemma-3-27b-fp8",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{b64}"}
                }
            ]
        }
    ],
    max_tokens=1024,
)
print(response.choices[0].message.content)
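
The example above hardcodes `image/jpeg` in the data URL. The sketch below derives the MIME type from the file extension instead, using Python's standard `mimetypes` module. The helper name `image_to_data_url` is hypothetical, and which image formats the API accepts beyond JPEG is not stated on this page, so treat other types as an assumption to verify.

```python
import base64
import mimetypes


def image_to_data_url(path, default_mime="image/jpeg"):
    """Read an image file and return it as a base64 data URL."""
    # Guess the MIME type from the file extension (e.g. .png -> image/png).
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        mime = default_mime  # fall back when the extension is unknown
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{b64}"


# Usage in a message content part:
# {"type": "image_url", "image_url": {"url": image_to_data_url("photo.jpg")}}
```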
Model Information
BgGPT-Gemma-3-27B-IT (based on gemma-3-27b-it)
FP8 dynamic quantization — uses about half the memory for weights compared to BF16, with minimal quality loss
65,536 tokens context window (the base model supports up to 131,072)
Capabilities:
Text generation
Instruction following
Tool calling
Vision (image understanding)
Multi-turn conversations
Read more about the model on the BgGPT 3.0 blog
Notes
The API is OpenAI-compatible. Any SDK or framework that supports OpenAI (LangChain, LlamaIndex, etc.) works with BgGPT by changing base_url and api_key.
Your API key can be managed in your account settings.
To request API access, use the API access request form.
To run the model locally, see the instructions on Hugging Face.
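
Since only `base_url` and `api_key` change, both can be kept out of source code. The sketch below reads them from environment variables. The variable names `BGGPT_API_KEY` and `BGGPT_BASE_URL` and the helper `bggpt_client_kwargs` are hypothetical choices for illustration, not names defined by BgGPT.

```python
import os

BGGPT_BASE_URL = "https://api.bggpt.ai/v1"  # endpoint from this page


def bggpt_client_kwargs(env=os.environ):
    """Build the two settings an OpenAI-compatible client needs for BgGPT."""
    api_key = env.get("BGGPT_API_KEY")  # hypothetical variable name
    if not api_key:
        raise RuntimeError("Set BGGPT_API_KEY to your BgGPT API key")
    return {
        # Allow overriding the endpoint, e.g. for a locally hosted model.
        "base_url": env.get("BGGPT_BASE_URL", BGGPT_BASE_URL),
        "api_key": api_key,
    }


# Usage with the OpenAI Python SDK:
# from openai import OpenAI
# client = OpenAI(**bggpt_client_kwargs())
```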
