HelpingAI offers flexible pricing plans to suit developers, businesses, and enterprises. Our emotionally intelligent AI is designed to be cost-effective while delivering superior performance.
Tokens are pieces of text that the AI processes. Understanding tokens helps you estimate costs:
- ~4 characters ≈ 1 token (for English text)
- 1 word ≈ 1.3 tokens on average
- Both input (your messages) and output (AI responses) count toward usage
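The character-based rule of thumb above can be expressed as a tiny helper. This is only an approximation for quick budgeting; real tokenizers split text differently, so counts will vary:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token heuristic."""
    return max(1, len(text) // 4)

estimate_tokens("How are you today?")  # heuristic gives 4; actual tokenizers vary
```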
Token Examples
| Text | Approximate Tokens |
|------|--------------------|
| "Hello!" | 2 tokens |
| "How are you today?" | 5 tokens |
| "I'm feeling overwhelmed with work" | 7 tokens |
| A typical paragraph (100 words) | ~130 tokens |
| A page of text (500 words) | ~650 tokens |
Estimating Costs
Example conversation:
Input: "I'm feeling anxious about my presentation tomorrow. Can you help me prepare?" (16 tokens)
Output: "I understand you're feeling anxious about your presentation - that's completely normal..." (150 tokens)
Cost calculation (Free / Pro tier):
- Input cost: 16 tokens × 0.50 (Free) / 0.40 (Pro) per 1M tokens = 0.000008 / 0.0000064
- Output cost: 150 tokens × 1.50 (Free) / 1.20 (Pro) per 1M tokens = 0.000225 / 0.00018
- Total: ~0.000233 (Free) / ~0.0001864 (Pro) per conversation
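The arithmetic above can be reproduced with a small helper. The default rates are the Free-tier numbers from this example; pass your own plan's per-1M-token rates as needed:

```python
def conversation_cost(input_tokens, output_tokens,
                      input_rate_per_m=0.50, output_rate_per_m=1.50):
    """Cost of one exchange, given per-1M-token input/output rates."""
    return (input_tokens * input_rate_per_m
            + output_tokens * output_rate_per_m) / 1_000_000

conversation_cost(16, 150)              # Free tier: ~0.000233
conversation_cost(16, 150, 0.40, 1.20)  # Pro tier:  ~0.0001864
```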
Cost Optimization
1. Token Efficiency
HelpingAI is designed for efficiency:
- 5x fewer tokens than GPT-4 for similar quality
- Chain of Recursive Thoughts reduces redundant processing
- Smart caching for repeated patterns
2. Optimize Your Prompts
```python
# Less efficient: long, wordy prompt
messages = [
    {"role": "user", "content": "Please help me understand this very complex mathematical concept that I'm struggling with and provide a detailed explanation with examples and step-by-step instructions"}
]

# More efficient: concise prompt
messages = [
    {"role": "user", "content": "Explain calculus derivatives with examples"}
]
```
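The savings from the concise prompt can be estimated with the ~4 characters per token heuristic from earlier (an approximation only, not the real tokenizer):

```python
verbose = ("Please help me understand this very complex mathematical concept "
           "that I'm struggling with and provide a detailed explanation with "
           "examples and step-by-step instructions")
concise = "Explain calculus derivatives with examples"

# ~4 characters per token heuristic
verbose_tokens = len(verbose) // 4
concise_tokens = len(concise) // 4
savings = verbose_tokens - concise_tokens  # input tokens saved per request
```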
3. Use Appropriate Parameters
```python
# For factual responses (lower cost)
response = client.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=messages,
    temperature=0.3,  # Lower temperature for focused answers
    max_tokens=100    # Limit response length
)

# For creative responses (higher cost but more creative)
response = client.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=messages,
    temperature=0.8,  # Higher temperature for varied output
    max_tokens=500    # Allow longer responses
)
```
4. Manage Context Length
```python
def trim_conversation(messages, max_tokens=3000):
    """Keep the conversation within a rough token budget."""
    # Estimate tokens with the ~4 characters per token heuristic
    total_tokens = sum(len(msg["content"]) // 4 for msg in messages)
    while total_tokens > max_tokens and len(messages) > 2:
        # Drop the oldest non-system message (the system prompt is kept)
        if messages[1]["role"] != "system":
            messages.pop(1)
        else:
            messages.pop(2)
        total_tokens = sum(len(msg["content"]) // 4 for msg in messages)
    return messages
```
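As a quick usage check, here is the trimming helper applied to a made-up conversation (the helper is repeated so the snippet runs on its own; the messages are illustrative):

```python
def trim_conversation(messages, max_tokens=3000):
    """Keep the conversation within a rough token budget (~4 chars/token)."""
    total = sum(len(m["content"]) // 4 for m in messages)
    while total > max_tokens and len(messages) > 2:
        # Drop the oldest non-system message (the system prompt is kept)
        if messages[1]["role"] != "system":
            messages.pop(1)
        else:
            messages.pop(2)
        total = sum(len(m["content"]) // 4 for m in messages)
    return messages

conversation = [
    {"role": "system", "content": "You are a supportive assistant."},
    {"role": "user", "content": "x" * 8000},  # old, long message (~2000 tokens)
    {"role": "user", "content": "x" * 8000},  # another long message
    {"role": "user", "content": "Can you summarize our plan?"},
]
trimmed = trim_conversation(conversation, max_tokens=3000)
# The oldest long message is dropped; the system prompt survives
```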