API Overview

HelpingAI provides a comprehensive REST API that's fully compatible with OpenAI's API format while adding unique Chain of Recursive Thoughts capabilities. This overview covers the core concepts, architecture, and features you need to understand to build with HelpingAI.

Architecture

Dhanishtha-2.0-preview Model

Our flagship model is built on a revolutionary architecture that combines:

  • Chain of Recursive Thoughts Engine: Reasons mid-response, unlike the traditional approach of reasoning once up front and then answering
  • Token-Efficient Processing: Avoids redundant reasoning chains, optimized for performance and cost
  • OpenAI Compatibility: Drop-in replacement for existing applications
  • Advanced Tool Calling: Sophisticated function calling capabilities

API Design Philosophy

HelpingAI's API is designed with these principles:

  1. Compatibility First: Works with existing OpenAI-compatible tools and libraries
  2. Reasoning Transparency: Optional reasoning visibility through hideThink parameter
  3. Performance: Optimized for speed and cost-effectiveness, using up to 5x fewer tokens than DeepSeek
  4. Developer Experience: Clear documentation and helpful error messages
  5. Efficiency: 4x faster inference through optimized processing

Core Concepts

Messages

All interactions use a message-based format:

json
{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "How do I solve this complex problem?" },
    {
      "role": "assistant",
      "content": "Let me work through this step by step..."
    }
  ]
}

Message Roles:

  • system: Sets the AI's behavior and context
  • user: Messages from the human user
  • assistant: Messages from the AI
  • tool: Results from function calls (for tool calling)

Models

Currently available models:

Model                  | Description                                    | Context Length | Best For
-----------------------|------------------------------------------------|----------------|-------------------------------------
Dhanishtha-2.0-preview | Our flagship model with emotional intelligence | 32,768 tokens  | General use, emotional understanding

Tokens

Tokens are pieces of text that the model processes. Understanding tokens helps you:

  • Estimate costs: Pricing is based on token usage
  • Manage context: Stay within model limits
  • Optimize performance: Shorter prompts = faster responses

Token Guidelines:

  • ~4 characters = 1 token (English)
  • 1 word ≈ 1.3 tokens on average
  • Use our tokenizer tool to count tokens
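As a rough illustration, you can turn these rules of thumb into a quick estimator. This is a heuristic sketch only; actual counts vary by tokenizer, so use the tokenizer tool for exact numbers:

python
# Rough token estimate from the guidelines above.
# Heuristic only; real tokenization varies by model.
def estimate_tokens(text: str) -> int:
    by_chars = len(text) / 4            # ~4 characters ≈ 1 token
    by_words = len(text.split()) * 1.3  # ~1.3 tokens per word
    # Average the two estimates and round to the nearest whole token.
    return round((by_chars + by_words) / 2)

print(estimate_tokens("How do I solve this complex problem?"))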

Key Features

1. Emotional Intelligence

HelpingAI automatically detects and responds to emotional cues:

python
# Input with emotional context
messages = [
    {"role": "user", "content": "I just got rejected from my dream job and I'm devastated."}
]

# HelpingAI responds with empathy
# "I'm so sorry to hear about the job rejection. That must be incredibly disappointing,
# especially when it was your dream job. It's completely natural to feel devastated right now..."

2. Chain of Recursive Thoughts

See how the AI thinks with the hideThink parameter:

python
response = client.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=[{"role": "user", "content": "What's the best approach to solve climate change?"}],
    hideThink=False  # Shows <think>...</think> content
)

When hideThink=False, you'll see the AI's reasoning process in <think> tags before the final response.
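If you want to separate the reasoning from the final answer programmatically, here is a minimal sketch, assuming the reasoning is delimited by literal <think>...</think> tags in the response text as described above:

python
import re

content = response.choices[0].message.content

# Pull out the reasoning blocks, then strip them to get the final answer.
reasoning = re.findall(r"<think>(.*?)</think>", content, flags=re.DOTALL)
answer = re.sub(r"<think>.*?</think>", "", content, flags=re.DOTALL).strip()

print("Reasoning blocks:", len(reasoning))
print("Final answer:", answer)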

3. Streaming

Get responses in real-time as they're generated:

python
stream = client.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=[{"role": "user", "content": "Write a poem about hope"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

4. Tool Calling

Execute functions and access external data:

python
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
        }
    }
}]

response = client.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
    tool_choice="auto"
)
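When the model decides to call your function, the response contains a tool_calls entry instead of plain text. A minimal sketch of the full round trip, assuming a hypothetical local get_weather implementation you supply yourself:

python
import json

# Hypothetical local implementation of the function declared above.
def get_weather(location: str) -> str:
    return f"Sunny, 22°C in {location}"  # stub for illustration

message = response.choices[0].message

if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = get_weather(**args)

    # Send the result back using the "tool" role so the model can answer.
    follow_up = client.chat.completions.create(
        model="Dhanishtha-2.0-preview",
        messages=[
            {"role": "user", "content": "What's the weather in Tokyo?"},
            message,  # the assistant message containing the tool call
            {"role": "tool", "tool_call_id": call.id, "content": result},
        ],
        tools=tools,
    )
    print(follow_up.choices[0].message.content)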

API Endpoints

Core Endpoints

Endpoint             | Method | Description
---------------------|--------|---------------------------
/v1/chat/completions | POST   | Generate chat completions
/v1/models           | GET    | List available models
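For example, you can enumerate /v1/models with the same client used throughout this guide:

python
# List the models available to your API key via GET /v1/models.
models = client.models.list()
for model in models.data:
    print(model.id)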

Authentication

All requests require an API key in the Authorization header:

Authorization: Bearer YOUR_API_KEY
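For instance, when calling the HTTP API directly (the base URL below is a placeholder; substitute the endpoint from your HelpingAI dashboard):

python
import requests

# Placeholder base URL; use the endpoint from your HelpingAI dashboard.
BASE_URL = "https://api.helpingai.co/v1"

response = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",  # the required header
        "Content-Type": "application/json",
    },
    json={
        "model": "Dhanishtha-2.0-preview",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(response.json())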

Error Handling

HelpingAI returns standard HTTP status codes and detailed error messages:

json
{
  "error": {
    "message": "Invalid API key provided",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}

Common Error Codes:

  • 400: Bad Request - Invalid parameters
  • 401: Unauthorized - Invalid API key
  • 429: Too Many Requests - Rate limit exceeded
  • 500: Internal Server Error - Server issue
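If you call HelpingAI through the OpenAI Python SDK, as in the examples above, these status codes map onto the SDK's exception types. A minimal sketch:

python
import openai

try:
    response = client.chat.completions.create(
        model="Dhanishtha-2.0-preview",
        messages=messages,
    )
except openai.AuthenticationError as e:  # 401
    print(f"Check your API key: {e}")
except openai.RateLimitError as e:       # 429: back off and retry
    print(f"Rate limited: {e}")
except openai.BadRequestError as e:      # 400
    print(f"Invalid parameters: {e}")
except openai.APIStatusError as e:       # other HTTP errors, e.g. 500
    print(f"Server returned {e.status_code}: {e}")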

Best Practices

1. Optimize for Emotional Context

Provide context about the user's emotional state:

python
# Good: Provides emotional context
messages = [
    {"role": "system", "content": "The user is feeling anxious about an upcoming presentation."},
    {"role": "user", "content": "Can you help me prepare?"}
]

# Better: Let the user express emotions naturally
messages = [
    {"role": "user", "content": "I'm really nervous about my presentation tomorrow. Can you help me prepare?"}
]

2. Use System Messages Effectively

Set clear context and behavior:

python
messages = [
    {
        "role": "system",
        "content": "You are a supportive career counselor. Be empathetic and provide practical advice."
    },
    {"role": "user", "content": "I'm thinking about changing careers but I'm scared."}
]

3. Handle Streaming Gracefully

Always handle potential errors in streaming:

python
try:
    stream = client.chat.completions.create(
        model="Dhanishtha-2.0-preview",
        messages=messages,
        stream=True
    )

    for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")

except Exception as e:
    print(f"Streaming error: {e}")

4. Monitor Token Usage

Track usage to optimize costs:

python
response = client.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=messages
)

usage = response.usage
print(f"Prompt tokens: {usage.prompt_tokens}")
print(f"Completion tokens: {usage.completion_tokens}")
print(f"Total tokens: {usage.total_tokens}")

SDKs and Libraries

Official SDKs

Compatible Libraries

Since HelpingAI is OpenAI-compatible, you can use:

  • OpenAI Python: Just change the base_url
  • OpenAI Node.js: Just change the baseURL
  • LangChain: Use with OpenAI provider
  • LlamaIndex: Compatible with OpenAI integration
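For example, with the OpenAI Python client (the base URL below is a placeholder; use the one from your HelpingAI dashboard):

python
from openai import OpenAI

# Point the standard OpenAI client at HelpingAI instead.
# Placeholder base URL; substitute the endpoint from your dashboard.
client = OpenAI(
    base_url="https://api.helpingai.co/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)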

Next Steps

Ready to dive deeper? Explore these resources:

Support

Need help? We're here for you: