AIHost settings are configured in the provided AIHost/appsettings.json, a standard ASP.NET Core JSON configuration file consumed by SystemWeaver.AIHost.exe. You can override settings per host through appsettings.Local.json (gitignored) or environment variables. Keys prefixed with // in the shipped files are documentation comments, not settings. This article describes each setting, its default, and its valid range.
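To illustrate the environment-variable override mentioned above: ASP.NET Core's environment-variable configuration provider accepts a double underscore (__) in place of each colon in a key path, and array elements are addressed by numeric index. The sketch below assumes AIHost uses the default, unprefixed provider, which this article does not confirm; to_env_var_name is a hypothetical helper, not part of AIHost.

```python
def to_env_var_name(config_key: str) -> str:
    """Convert a colon-separated configuration key to environment-variable form.

    Assumes the standard ASP.NET Core convention of replacing ':' with '__'.
    Whether AIHost adds a prefix to its environment variables is not known here.
    """
    return config_key.replace(":", "__")


# Example keys taken from this article; array items use a numeric index segment.
for key in ("Aife:Port", "Logging:File:Enabled", "Mcp:Servers:0:Name"):
    print(f"{key} -> {to_env_var_name(key)}")
```

Under that assumption, setting Aife__Port in the host's environment would override Aife:Port from appsettings.json.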
Prerequisites
- The SWExtension.AIExtension folder is located in your Client's swExplorerExtensions directory
The values shown in the appsettings.json example below are the defaults shipped with the package.
Example
{
"Logging": {
"LogLevel": {
"Default": "Information",
"Microsoft.AspNetCore": "Warning"
},
"File": {
"Enabled": true,
"Directory": "%LOCALAPPDATA%\\SystemWeaver.AIHost\\logs",
"RollingInterval": "Day",
"RetainedFileCountLimit": 14
}
},
"Aife": {
"Port": 5001,
"Llm": {
"MaxOutputTokens": 32768,
"MaxToolCalls": 50
}
},
"Persistence": {
"Provider": "InMemory"
},
"Mcp": {
"Enabled": true,
"Servers": [],
"StreamableHttp": {
"AllowPlaintextHttp": false
}
},
"Chat": {
"Tools": {
"AllowMCPDiscoveredTools": false
},
"Logging": {
"DisableLogRedaction": false
}
},
"Llm": {
"Admission": {
"PerTenantRequestsPerSecond": 100,
"PerTenantBurst": 200,
"MaxConcurrentInFlight": 32,
"DefaultTenantKey": "default"
},
"Limits": {
"MaxInputBytes": 1048576,
"MaxInputTokensEstimate": 250000
},
"CircuitBreaker": {
"FailureThreshold": 5,
"ResetTimeout": "00:00:30",
"Enabled": true
},
"Routing": {
"ReservedSafetyTokens": 512
},
"AcceptedCredentialSources": "Runtime"
},
"ContextCompression": {
"Enabled": true,
"UseActiveModelWindow": true,
"CompressionTriggerRatio": 0.75,
"ReservedResponseTokens": 4000,
"MaxContextTokens": 200000,
"RecentMessageCount": 10,
"MaxMessageCount": 50,
"TargetCompressionRatio": 0.3,
"MaxSummaryTokens": 1000,
"CompressToolCalls": true
},
"LLMRequestSizing": {
"InputTokenBufferRatio": 0.15,
"MinBufferTokens": 256
}
}
An example Mcp:Servers[] entry (omitted above because the shipped array is empty):
{
"Mcp": {
"Enabled": true,
"Servers": [
{
"Name": "sysw-mcp",
"Transport": "Stdio",
"Enabled": true,
"Command": "C:\\non-install\\sysw-mcp-host\\sysw-mcp.exe",
"Arguments": [],
"Environment": {}
},
{
"Name": "remote-tools",
"Transport": "StreamableHttp",
"Enabled": true,
"Endpoint": "https://localhost:7099/mcp"
}
],
"StreamableHttp": { "AllowPlaintextHttp": false }
}
}
The per-section reference below documents each key, its default, and valid range.
Logging
| Key | Default | Description |
|---|---|---|
| Logging:LogLevel:Default | Information | Default Microsoft.Extensions.Logging (MEL) level. |
| Logging:LogLevel:Microsoft.AspNetCore | Warning | Framework log level. |
| Logging:File:Enabled | true | Master switch for the Serilog file sink (Console/Debug stay on regardless). Degrades gracefully (warning to stderr) if the directory can’t be created. |
| Logging:File:Directory | %LOCALAPPDATA%\SystemWeaver.AIHost\logs | Log directory. Environment variables (%LOCALAPPDATA%, %USERPROFILE%, …) are expanded; empty/null falls back to the default. |
| Logging:File:RollingInterval | Day | Serilog RollingInterval: Infinite / Year / Month / Day / Hour / Minute. |
| Logging:File:RetainedFileCountLimit | 14 | Max rolled files kept before the oldest is deleted (≈ two weeks at daily rolling). |
Aife (wire plane)
The AIFE endpoint is https://localhost:<Port>/ with mandatory client-cert mTLS; loopback bind covers IPv4 + IPv6. There is no Host key (loopback is hard-wired) and the named-pipe trust handshake is automatic (no key).
| Key | Default | Description |
|---|---|---|
| Aife:Port | 5001 | HTTPS port. Must match the client’s hard-coded port 5001 (see Fixed client connection facts (not configurable in XML)). |
| Aife:Llm:MaxOutputTokens | 32768 | Per-request output-token cap forwarded to LLMGenerationOptions.MaxTokens. 0 = leave unset (router applies its 4096 default). |
| Aife:Llm:MaxToolCalls | 50 | Per-interact tool-call budget. 0 disables the tool-call loop entirely. |
Persistence (session store)
| Key | Default | Description |
|---|---|---|
| Persistence:Provider | InMemory | InMemory = RAM-only; chat/agent sessions live for the lifetime of the AIHost process and are lost on restart. This is the shipped default and the only mode the AIExtension currently supports: recovering persisted sessions from the AIExtension is not yet supported, so leave this on InMemory. |
Mcp (Model Context Protocol servers)
| Key | Default | Description |
|---|---|---|
| Mcp:Enabled | true | Master switch for MCP wiring. false skips all MCP registration. |
| Mcp:Servers | [] | Array of MCP server entries (schema below). Per-server startup failures are logged and skipped. Empty = MCP effectively disabled. |
| Mcp:StreamableHttp:AllowPlaintextHttp | false | TLS guard. Strict by default: StreamableHttp servers with http:// endpoints are refused at startup (MCPConfigurationException) unless this is true (Development only). Does not apply to Stdio transports. |
Mcp:Servers[] entry schema
| Field | Description |
|---|---|
| Name | Identifier for the server. |
| Transport | Stdio or StreamableHttp. |
| Enabled | Per-server toggle. |
| Command / Arguments / Environment | Stdio transport: the process to launch and its args/env. |
| Endpoint | StreamableHttp transport: the server URL (must be https:// unless AllowPlaintextHttp=true). |
Note: MCP-discovered tools are additionally gated by Chat:Tools:AllowMCPDiscoveredTools (Chat).
Example: Stdio transport (launch a local MCP server process; no TLS guard applies):
{
"Mcp": {
"Enabled": true,
"Servers": [
{
"Name": "sysw-mcp",
"Transport": "Stdio",
"Enabled": true,
"Command": "C:\\non-install\\sysw-mcp-host\\sysw-mcp.exe",
"Arguments": [ "--mode", "stdio" ],
"Environment": {
"SYSW_MCP_LOGLEVEL": "Information"
}
}
]
}
}
| Field | Stdio meaning |
|---|---|
| Command | Absolute path to the server executable to spawn. |
| Arguments | Array of command-line arguments passed to it. |
| Environment | Object of extra environment variables for the child process. |
| Endpoint | (unused for Stdio) |
Example: StreamableHttp transport (connect to an HTTP MCP server; subject to the TLS guard):
{
"Mcp": {
"Enabled": true,
"Servers": [
{
"Name": "remote-tools",
"Transport": "StreamableHttp",
"Enabled": true,
"Endpoint": "https://localhost:7099/mcp"
}
],
"StreamableHttp": {
"AllowPlaintextHttp": false
}
}
}
| Field | StreamableHttp meaning |
|---|---|
| Endpoint | Server URL. Must be https:// in production; an http:// endpoint is refused at startup (MCPConfigurationException) unless Mcp:StreamableHttp:AllowPlaintextHttp = true (Development only). |
| Command / Arguments / Environment | (unused for StreamableHttp) |
A Servers[] array may mix both transports. Each entry is wired independently; a single server’s startup failure is logged and skipped (the rest still load).
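The transport rules above can be checked before deployment with a small pre-flight script. The following is a minimal sketch, not AIHost's own validation: check_server_entry is a hypothetical helper, and the rule that a Stdio entry needs a non-empty Command is an assumption drawn from the field tables rather than a documented error condition.

```python
from urllib.parse import urlparse


def check_server_entry(entry: dict, allow_plaintext_http: bool = False) -> list:
    """Return a list of problems found in one Mcp:Servers[] entry (empty = looks fine)."""
    problems = []
    transport = entry.get("Transport")
    if transport == "Stdio":
        # Assumption: a Stdio entry is only useful with an executable to spawn.
        if not entry.get("Command"):
            problems.append("Stdio entry needs a Command")
    elif transport == "StreamableHttp":
        scheme = urlparse(entry.get("Endpoint") or "").scheme
        if scheme == "http" and not allow_plaintext_http:
            # Mirrors the TLS guard: plaintext is refused unless explicitly allowed.
            problems.append("http:// endpoint refused unless AllowPlaintextHttp is true")
        elif scheme not in ("http", "https"):
            problems.append("StreamableHttp entry needs an http(s) Endpoint")
    else:
        problems.append(f"unknown Transport: {transport!r}")
    return problems


servers = [
    {"Name": "sysw-mcp", "Transport": "Stdio",
     "Command": "C:\\non-install\\sysw-mcp-host\\sysw-mcp.exe"},
    {"Name": "remote-tools", "Transport": "StreamableHttp",
     "Endpoint": "https://localhost:7099/mcp"},
]
for server in servers:
    print(server["Name"], check_server_entry(server))
```

Each entry is checked on its own, matching the statement that one server's failure does not prevent the others from loading.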
Chat
| Key | Default | Description |
|---|---|---|
| Chat:Tools:AllowMCPDiscoveredTools | false | Strict default: MCP-discovered tools are hidden from agents (closes the prompt-injection-via-tool-description vector). Set true (Development) to surface MCP tools to the tool-call loop. |
| Chat:Logging:DisableLogRedaction | false | Leave false in production: a redacting logger replaces sensitive values (bearer tokens, API keys, [Sensitive] properties) with ***REDACTED***. Setting true is an escape hatch for hosts that already redact at the sink. |
Llm (router-level guards)
Per-provider model settings do not live here (see Configuring the Per-Provider LLM Plugin Settings). These are router-level guards.
Llm:Admission (per-tenant rate-limit + concurrency; AIHost is single-tenant today)
| Key | Default | Description |
|---|---|---|
| PerTenantRequestsPerSecond | 100 | Token-bucket refill rate per tenant. |
| PerTenantBurst | 200 | Token-bucket burst capacity. |
| MaxConcurrentInFlight | 32 | Concurrency cap at the LLMService boundary. |
| DefaultTenantKey | default | Tenant key used until AIFE identity context is wired. |
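To make the refill rate and burst capacity concrete, here is a generic token-bucket sketch using the default values. It illustrates the textbook algorithm only; how AIHost implements admission internally is not described in this article, and the TokenBucket class and its starting-full behavior are assumptions.

```python
import time
from typing import Optional


class TokenBucket:
    """Illustrative token bucket: refills at `rate` tokens/second, capped at `burst`."""

    def __init__(self, rate: float = 100.0, burst: float = 200.0,
                 now: Optional[float] = None):
        self.rate = rate      # PerTenantRequestsPerSecond
        self.burst = burst    # PerTenantBurst
        self.tokens = burst   # assumption: the bucket starts full
        self.last = time.monotonic() if now is None else now

    def try_acquire(self, now: Optional[float] = None) -> bool:
        """Admit one request if a token is available at time `now` (seconds)."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(now=0.0)
admitted = sum(bucket.try_acquire(now=0.0) for _ in range(250))
print(admitted)  # requests admitted from an instantaneous burst of 250
```

In this model a sudden burst is admitted up to PerTenantBurst, after which throughput settles at PerTenantRequestsPerSecond.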
Llm:Limits
| Key | Default | Description |
|---|---|---|
| MaxInputBytes | 1048576 (1 MiB) | Hard input-payload ceiling. Raise for bulk-summarization workloads. |
| MaxInputTokensEstimate | 250000 | Heuristic ceil(bytes / 4) ceiling; size against the smallest production model’s window. |
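The two ceilings interact: applying the ceil(bytes / 4) heuristic to the default byte limit gives an estimate above the default token limit, so the token ceiling is reached first. The sketch below works through that arithmetic; within_limits is a hypothetical helper, and treating both comparisons as inclusive is an assumption.

```python
MAX_INPUT_BYTES = 1_048_576          # Llm:Limits:MaxInputBytes default
MAX_INPUT_TOKENS_ESTIMATE = 250_000  # Llm:Limits:MaxInputTokensEstimate default


def estimate_tokens(payload_bytes: int) -> int:
    """Heuristic from the table: ceil(bytes / 4), computed with integer math."""
    return (payload_bytes + 3) // 4


def within_limits(payload_bytes: int) -> bool:
    """True if the payload passes both ceilings (inclusive bounds are assumed)."""
    return (payload_bytes <= MAX_INPUT_BYTES
            and estimate_tokens(payload_bytes) <= MAX_INPUT_TOKENS_ESTIMATE)


print(estimate_tokens(MAX_INPUT_BYTES))  # estimate for a payload at the byte ceiling
print(within_limits(1_000_000))          # 250000 estimated tokens
print(within_limits(1_000_001))          # 250001 estimated tokens
```

With the shipped defaults, a payload of roughly 1,000,000 bytes is therefore the practical ceiling under this heuristic; raising MaxInputBytes alone would not admit larger inputs.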
Llm:CircuitBreaker
Process-local breaker keyed by provider+model:
| Key | Default | Description |
|---|---|---|
| FailureThreshold | 5 | Consecutive failures before opening. Raise for batch workloads. |
| ResetTimeout | 00:00:30 | TimeSpan before a half-open retry. Shorten for local providers. |
| Enabled | true | false disables the breaker entirely. |
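The three keys map onto the usual closed / open / half-open breaker states. The following is a simplified sketch of that pattern with the default values, not AIHost's breaker: the CircuitBreaker class is hypothetical, and it allows any number of trial calls once the timeout has elapsed, which a real implementation may restrict.

```python
import time
from typing import Optional


class CircuitBreaker:
    """Simplified consecutive-failure breaker (one instance per provider+model)."""

    def __init__(self, failure_threshold: int = 5, reset_timeout_s: float = 30.0,
                 enabled: bool = True):
        self.failure_threshold = failure_threshold  # FailureThreshold
        self.reset_timeout_s = reset_timeout_s      # ResetTimeout, in seconds
        self.enabled = enabled                      # Enabled
        self.failures = 0
        self.opened_at: Optional[float] = None      # None means the breaker is closed

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True if a call may proceed at time `now`."""
        if not self.enabled or self.opened_at is None:
            return True
        now = time.monotonic() if now is None else now
        # Once the timeout has elapsed, let a trial call through (half-open).
        return now - self.opened_at >= self.reset_timeout_s

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now  # open, or re-open after a failed trial call
```

Because the breaker is described as keyed by provider+model, a caller of this sketch would hold one instance per (provider, model) pair, for example in a dictionary.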
Llm:Routing
| Key | Default | Description |
|---|---|---|
| ReservedSafetyTokens | 512 | Extra headroom in the model-selection gate: MaxContextTokens >= EstimatedInputTokens + MaxTokens + ReservedSafetyTokens. 0 disables the reserve. |
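The gate inequality can be evaluated directly. In the sketch below, model_fits is a hypothetical helper that restates the inequality from the table, and the 128000-token window is an illustrative value, not a model referenced by this article.

```python
def model_fits(max_context_tokens: int, estimated_input_tokens: int,
               max_tokens: int, reserved_safety_tokens: int = 512) -> bool:
    """Model-selection gate: the window must hold input + output + safety reserve."""
    return max_context_tokens >= (estimated_input_tokens + max_tokens
                                  + reserved_safety_tokens)


WINDOW = 128_000      # illustrative model window (assumption)
MAX_TOKENS = 32_768   # Aife:Llm:MaxOutputTokens default

print(model_fits(WINDOW, 90_000, MAX_TOKENS))     # 90000 + 32768 + 512 = 123280
print(model_fits(WINDOW, 95_000, MAX_TOKENS))     # 95000 + 32768 + 512 = 128280
print(model_fits(WINDOW, 95_000, MAX_TOKENS, 0))  # reserve disabled: 127768
```

The second and third calls differ only in the reserve, showing a request that is rejected with the default 512-token headroom but passes when the reserve is set to 0.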
Llm:AcceptedCredentialSources
| Key | Default | Description |
|---|---|---|
| Llm:AcceptedCredentialSources | Runtime | Where LLM API keys may come from. Runtime = only the AIFE InitializeBackend wire (i.e. the AIExtension <LLMApiKeys> XML; see Configuring the View); keys in plugin settings.json or env vars are ignored, and a provider with no injected key fails rather than silently using a disk/env key. ([Flags]; library default is All.) |
ContextCompression (proactive history compression)
Trigger: effectiveLimit = (int)(activeWindow * CompressionTriggerRatio) - ReservedResponseTokens, sized against the active model window (ILLMService.GetActiveModelCapabilities()).
| Key | Default | Description |
|---|---|---|
| Enabled | true | Master switch; false bypasses compression entirely. |
| UseActiveModelWindow | true | Size the trigger against the active model window; false (or unknown model) falls back to MaxContextTokens. |
| CompressionTriggerRatio | 0.75 | Fraction of the window at which compression fires. Lower = earlier/more aggressive. |
| ReservedResponseTokens | 4000 | Headroom reserved for the next response, subtracted from the trigger. |
| MaxContextTokens | 200000 | Absolute fallback window when UseActiveModelWindow=false / model unknown. |
| RecentMessageCount | 10 | Recent messages preserved verbatim; older ones folded into the summary. |
| MaxMessageCount | 50 | Also fires when raw message count exceeds this, regardless of tokens. |
| TargetCompressionRatio | 0.3 | Summary length target as a fraction of the effective limit. |
| MaxSummaryTokens | 1000 | Hard cap on the summary’s output tokens. |
| CompressToolCalls | true | true: tool-call turns are eligible for compression along with normal messages. false: all tool-call / tool-result exchanges are preserved verbatim (never summarized), which is costlier in tokens but safer for tool-heavy sessions. |
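The trigger formula and the fallback rules in the table can be worked through numerically. This is a sketch under stated assumptions: effective_limit restates the formula given above, while should_compress is a hypothetical helper that assumes compression fires when history tokens strictly exceed the limit or the message count strictly exceeds MaxMessageCount.

```python
from typing import Optional


def effective_limit(window_tokens: int, trigger_ratio: float = 0.75,
                    reserved_response_tokens: int = 4000) -> int:
    """effectiveLimit = (int)(window * CompressionTriggerRatio) - ReservedResponseTokens."""
    return int(window_tokens * trigger_ratio) - reserved_response_tokens


def should_compress(history_tokens: int, message_count: int,
                    active_window: Optional[int], *, enabled: bool = True,
                    use_active_model_window: bool = True,
                    max_context_tokens: int = 200_000,
                    max_message_count: int = 50) -> bool:
    """Token trigger or count trigger; strict '>' comparisons are an assumption."""
    if not enabled:
        return False
    # Unknown model (None) or UseActiveModelWindow=false falls back to MaxContextTokens.
    if use_active_model_window and active_window:
        window = active_window
    else:
        window = max_context_tokens
    return (history_tokens > effective_limit(window)
            or message_count > max_message_count)


print(effective_limit(200_000))               # fallback window with defaults
print(effective_limit(128_000))               # illustrative 128000-token model
print(should_compress(93_000, 12, 128_000))   # token trigger
print(should_compress(1_000, 51, 128_000))    # count trigger
```

With the defaults, the 200000-token fallback window yields a limit of 146000 tokens, and a 128000-token model yields 92000.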
How the parameters interact

Reading the knobs against the graph:
- Enabled is the top gate: false short-circuits everything.
- UseActiveModelWindow + MaxContextTokens decide which window feeds the limit.
- CompressionTriggerRatio + ReservedResponseTokens set when compression fires (lower ratio / higher reserve → fires earlier).
- MaxMessageCount is a parallel count-based trigger (fires even if tokens stay under the limit).
- CompressToolCalls decides whether tool-call turns can be summarized (false = tool exchanges always kept verbatim, never compressed).
- RecentMessageCount sets how much is preserved verbatim; TargetCompressionRatio + MaxSummaryTokens size the summary that replaces the rest.
LLMRequestSizing
bufferedTokens = (int)(rawTokens * (1 + InputTokenBufferRatio)) + MinBufferTokens, clamped to be >= rawTokens. The buffered estimate becomes LLMGenerationOptions.EstimatedInputTokens, which the router’s window gate uses.
| Key | Default | Description |
|---|---|---|
| InputTokenBufferRatio | 0.15 | Fractional buffer over the raw char-derived estimate. Raise (e.g. 0.25) if late-routing failures appear. |
| MinBufferTokens | 256 | Token floor added to every buffered estimate. |
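The buffering formula above can be checked with a few sample values. The function below is a direct transcription of that formula; buffered_tokens is a hypothetical name, and the raw token counts are illustrative.

```python
def buffered_tokens(raw_tokens: int, input_token_buffer_ratio: float = 0.15,
                    min_buffer_tokens: int = 256) -> int:
    """(int)(raw * (1 + ratio)) + MinBufferTokens, clamped to at least raw_tokens."""
    buffered = int(raw_tokens * (1 + input_token_buffer_ratio)) + min_buffer_tokens
    return max(buffered, raw_tokens)


print(buffered_tokens(12_345))        # default ratio 0.15
print(buffered_tokens(10_000, 0.25))  # raised ratio, as suggested in the table
print(buffered_tokens(0))             # the floor alone
```

A raw estimate of 12345 tokens becomes 14452 with the defaults (14196 after truncation, plus the 256-token floor), and it is this buffered value that the router compares against the model window.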
What's Next?
Refer to the Configuration Manual for WeaverAI to determine whether any additional configuration remains.