API Overview

HelpingAI provides a comprehensive REST API that's fully compatible with OpenAI's API format while adding unique Chain of Recursive Thoughts capabilities. This overview covers the core concepts, architecture, and features you need to understand to build with HelpingAI.

Architecture

Dhanishtha-2.0-preview Model

Our flagship model is built on a revolutionary architecture that combines:

  • Chain of Recursive Thoughts Engine: Reasons mid-response, unlike the traditional approach of reasoning once up front and then answering
  • Token-Efficient Processing: Avoids redundant reasoning chains, optimized for performance and cost
  • OpenAI Compatibility: Drop-in replacement for existing applications
  • Advanced Tool Calling: Sophisticated function calling capabilities

API Design Philosophy

HelpingAI's API is designed with these principles:

  1. Compatibility First: Works with existing OpenAI-compatible tools and libraries
  2. Reasoning Transparency: Optional reasoning visibility through hideThink parameter
  3. Performance: Optimized for speed and cost-effectiveness, using up to 5x fewer tokens than DeepSeek
  4. Developer Experience: Clear documentation and helpful error messages
  5. Efficiency: 4x faster inference through optimized processing

Core Concepts

Messages

All interactions use a message-based format:

json
{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "How do I solve this complex problem?" },
    {
      "role": "assistant",
      "content": "Let me work through this step by step..."
    }
  ]
}

Message Roles:

  • system: Sets the AI's behavior and context
  • user: Messages from the human user
  • assistant: Messages from the AI
  • tool: Results from function calls (for tool calling)

Models

Currently available models:

Model                  | Description                                    | Context Length | Best For
-----------------------|------------------------------------------------|----------------|-------------------------------------
Dhanishtha-2.0-preview | Our flagship model with emotional intelligence | 32,768 tokens  | General use, emotional understanding

Tokens

Tokens are pieces of text that the model processes. Understanding tokens helps you:

  • Estimate costs: Pricing is based on token usage
  • Manage context: Stay within model limits
  • Optimize performance: Shorter prompts = faster responses

Token Guidelines:

  • ~4 characters = 1 token (English)
  • 1 word ≈ 1.3 tokens on average
  • Use our tokenizer tool to count tokens
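As a rough illustration, you can turn these rules of thumb into a quick estimator. This is a heuristic sketch only; actual counts vary by tokenizer, so use the tokenizer tool for exact numbers:

python
# Rough token estimate from the guidelines above.
# Heuristic only; real tokenization varies by model.
def estimate_tokens(text: str) -> int:
    by_chars = len(text) / 4            # ~4 characters ≈ 1 token
    by_words = len(text.split()) * 1.3  # ~1.3 tokens per word
    # Average the two estimates and round to the nearest whole token.
    return round((by_chars + by_words) / 2)

print(estimate_tokens("How do I solve this complex problem?"))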

Key Features

1. Emotional Intelligence

HelpingAI automatically detects and responds to emotional cues:

python
# Input with emotional context
messages = [
    {"role": "user", "content": "I just got rejected from my dream job and I'm devastated."}
]

# HelpingAI responds with empathy
# "I'm so sorry to hear about the job rejection. That must be incredibly disappointing,
# especially when it was your dream job. It's completely natural to feel devastated right now..."

2. Chain of Recursive Thoughts

See how the AI thinks with the hideThink parameter:

python
response = client.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=[{"role": "user", "content": "What's the best approach to solve climate change?"}],
    hideThink=False  # Shows <think>...</think> content
)

When hideThink=False, you'll see the AI's reasoning process in <think> tags before the final response.
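If you want to separate the reasoning from the final answer programmatically, here is a minimal sketch, assuming the reasoning is delimited by literal <think>...</think> tags in the response text as described above:

python
import re

content = response.choices[0].message.content

# Pull out the reasoning blocks, then strip them to get the final answer.
reasoning = re.findall(r"<think>(.*?)</think>", content, flags=re.DOTALL)
answer = re.sub(r"<think>.*?</think>", "", content, flags=re.DOTALL).strip()

print("Reasoning blocks:", len(reasoning))
print("Final answer:", answer)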

3. Streaming

Get responses in real-time as they're generated:

python
stream = client.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=[{"role": "user", "content": "Write a poem about hope"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

4. Tool Calling

Execute functions and access external data:

python
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
        }
    }
}]

response = client.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
    tool_choice="auto"
)
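When the model decides to call your function, the response contains a tool_calls entry instead of plain text. A minimal sketch of the full round trip, assuming a hypothetical local get_weather implementation you supply yourself:

python
import json

# Hypothetical local implementation of the function declared above.
def get_weather(location: str) -> str:
    return f"Sunny, 22°C in {location}"  # stub for illustration

message = response.choices[0].message

if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = get_weather(**args)

    # Send the result back using the "tool" role so the model can answer.
    follow_up = client.chat.completions.create(
        model="Dhanishtha-2.0-preview",
        messages=[
            {"role": "user", "content": "What's the weather in Tokyo?"},
            message,  # the assistant message containing the tool call
            {"role": "tool", "tool_call_id": call.id, "content": result},
        ],
        tools=tools,
    )
    print(follow_up.choices[0].message.content)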

API Endpoints

Core Endpoints

Endpoint             | Method | Description
---------------------|--------|---------------------------
/v1/chat/completions | POST   | Generate chat completions
/v1/models           | GET    | List available models
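For example, you can enumerate /v1/models with the same client used throughout this guide:

python
# List the models available to your API key via GET /v1/models.
models = client.models.list()
for model in models.data:
    print(model.id)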

Authentication

All requests require an API key in the Authorization header:

Authorization: Bearer YOUR_API_KEY
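For instance, when calling the HTTP API directly (the base URL below is a placeholder; substitute the endpoint from your HelpingAI dashboard):

python
import requests

# Placeholder base URL; use the endpoint from your HelpingAI dashboard.
BASE_URL = "https://api.helpingai.co/v1"

response = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",  # the required header
        "Content-Type": "application/json",
    },
    json={
        "model": "Dhanishtha-2.0-preview",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(response.json())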

Error Handling

HelpingAI returns standard HTTP status codes and detailed error messages:

json
{
  "error": {
    "message": "Invalid API key provided",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}

Common Error Codes:

  • 400: Bad Request - Invalid parameters
  • 401: Unauthorized - Invalid API key
  • 429: Too Many Requests - Rate limit exceeded
  • 500: Internal Server Error - Server issue
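If you call HelpingAI through the OpenAI Python SDK, as in the examples above, these status codes map onto the SDK's exception types. A minimal sketch:

python
import openai

try:
    response = client.chat.completions.create(
        model="Dhanishtha-2.0-preview",
        messages=messages,
    )
except openai.AuthenticationError as e:  # 401
    print(f"Check your API key: {e}")
except openai.RateLimitError as e:       # 429: back off and retry
    print(f"Rate limited: {e}")
except openai.BadRequestError as e:      # 400
    print(f"Invalid parameters: {e}")
except openai.APIStatusError as e:       # other HTTP errors, e.g. 500
    print(f"Server returned {e.status_code}: {e}")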

Best Practices

1. Optimize for Emotional Context

Provide context about the user's emotional state:

python
# Good: Provides emotional context
messages = [
    {"role": "system", "content": "The user is feeling anxious about an upcoming presentation."},
    {"role": "user", "content": "Can you help me prepare?"}
]

# Better: Let the user express emotions naturally
messages = [
    {"role": "user", "content": "I'm really nervous about my presentation tomorrow. Can you help me prepare?"}
]

2. Use System Messages Effectively

Set clear context and behavior:

python
messages = [
    {
        "role": "system",
        "content": "You are a supportive career counselor. Be empathetic and provide practical advice."
    },
    {"role": "user", "content": "I'm thinking about changing careers but I'm scared."}
]

3. Handle Streaming Gracefully

Always handle potential errors in streaming:

python
try:
    stream = client.chat.completions.create(
        model="Dhanishtha-2.0-preview",
        messages=messages,
        stream=True
    )

    for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")

except Exception as e:
    print(f"Streaming error: {e}")

4. Monitor Token Usage

Track usage to optimize costs:

python
response = client.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=messages
)

usage = response.usage
print(f"Prompt tokens: {usage.prompt_tokens}")
print(f"Completion tokens: {usage.completion_tokens}")
print(f"Total tokens: {usage.total_tokens}")

SDKs and Libraries

Official SDKs

Compatible Libraries

Since HelpingAI is OpenAI-compatible, you can use:

  • OpenAI Python: Just change the base_url
  • OpenAI Node.js: Just change the baseURL
  • LangChain: Use with OpenAI provider
  • LlamaIndex: Compatible with OpenAI integration
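For example, with the OpenAI Python client (the base URL below is a placeholder; use the one from your HelpingAI dashboard):

python
from openai import OpenAI

# Point the standard OpenAI client at HelpingAI instead.
# Placeholder base URL; substitute the endpoint from your dashboard.
client = OpenAI(
    base_url="https://api.helpingai.co/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)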

Next Steps

Ready to dive deeper? Explore these resources:

Support

Need help? We're here for you: