Skip to main content

Managing Token Usage

Sulla Desktop actively optimizes token usage in the system, but input quality still matters.

If you paste low-quality or oversized content, token usage can spike with little value.

Most important rule

Only send the model the context it actually needs.

Large context windows are not always better. Massive inputs can increase cost, slow responses, and cause workflows to fail.

Be careful what you paste

Avoid pasting raw content like:

Full HTML pages
Microsoft Word content with hidden formatting/characters
Large copied blobs with repetitive or irrelevant text

These often add hidden tokens that provide no useful signal to the model.

Better paste workflow

Before pasting content:

Clean and simplify text first.
Remove page chrome, scripts, and markup junk.
Keep only the specific section relevant to your task.
Prefer plain text when possible.

Why oversized context causes problems

Dumping huge context can:

Increase token spend quickly
Reduce response quality (more noise, less signal)
Slow down tool execution
Hit model/context limits and break task completion

Practical ways to reduce token usage

Break large tasks into smaller steps
Send summaries instead of full documents
Ask follow-up questions with focused context
Reuse concise system instructions
Keep prompts specific and action-oriented

Recommended team habits

Treat tokens like a resource budget
Paste only what is necessary for the current step
Prefer "small context, multiple turns" over "massive one-shot prompt"

Quick checklist

Before sending a prompt, check:

Is this the minimum context needed?
Did I remove hidden-formatting sources (Word/HTML junk)?
Can I summarize this instead of pasting everything?
Is my request specific enough to avoid unnecessary back-and-forth?

Most important rule
Be careful what you paste
Better paste workflow
Why oversized context causes problems
Practical ways to reduce token usage
Recommended team habits
Quick checklist