AIHost settings are configured in the provided AIHost/appsettings.json, a standard ASP.NET Core JSON configuration file consumed by SystemWeaver.AIHost.exe. You can override settings per host in appsettings.Local.json (gitignored) or with environment variables. Keys prefixed with // in the shipped files are documentation comments, not settings. This article describes each setting, its default, and its valid values.

Prerequisites

The SWExtension.AIExtension folder is located in your Client's swExplorerExtensions directory.

Example
Logging
Aife (wire plane)
Persistence (session store)
Mcp (Model Context Protocol servers)
- Mcp:Servers[] entry schema
Chat
Llm (router-level guards)
ContextCompression (proactive history compression)
- How the parameters interact
  - Reading the knobs against the graph:
LLMRequestSizing (input-estimate buffering)
What's Next?

The values shown in the appsettings.json example below are the defaults shipped with the package.

Example

{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    },
    "File": {
      "Enabled": true,
      "Directory": "%LOCALAPPDATA%\\SystemWeaver.AIHost\\logs",
      "RollingInterval": "Day",
      "RetainedFileCountLimit": 14
    }
  },
  "Aife": {
    "Port": 5001,
    "Llm": {
      "MaxOutputTokens": 32768,
      "MaxToolCalls": 50
    }
  },
  "Persistence": {
    "Provider": "InMemory"
  },
  "Mcp": {
    "Enabled": true,
    "Servers": [],
    "StreamableHttp": {
      "AllowPlaintextHttp": false
    }
  },
  "Chat": {
    "Tools": {
      "AllowMCPDiscoveredTools": false
    },
    "Logging": {
      "DisableLogRedaction": false
    }
  },
  "Llm": {
    "Admission": {
      "PerTenantRequestsPerSecond": 100,
      "PerTenantBurst": 200,
      "MaxConcurrentInFlight": 32,
      "DefaultTenantKey": "default"
    },
    "Limits": {
      "MaxInputBytes": 1048576,
      "MaxInputTokensEstimate": 250000
    },
    "CircuitBreaker": {
      "FailureThreshold": 5,
      "ResetTimeout": "00:00:30",
      "Enabled": true
    },
    "Routing": {
      "ReservedSafetyTokens": 512
    },
    "AcceptedCredentialSources": "Runtime"
  },
  "ContextCompression": {
    "Enabled": true,
    "UseActiveModelWindow": true,
    "CompressionTriggerRatio": 0.75,
    "ReservedResponseTokens": 4000,
    "MaxContextTokens": 200000,
    "RecentMessageCount": 10,
    "MaxMessageCount": 50,
    "TargetCompressionRatio": 0.3,
    "MaxSummaryTokens": 1000,
    "CompressToolCalls": true
  },
  "LLMRequestSizing": {
    "InputTokenBufferRatio": 0.15,
    "MinBufferTokens": 256
  }
}

An example Mcp:Servers[] entry (omitted above because the shipped array is empty):

{
  "Mcp": {
    "Enabled": true,
    "Servers": [
      {
        "Name": "sysw-mcp",
        "Transport": "Stdio",
        "Enabled": true,
        "Command": "C:\\non-install\\sysw-mcp-host\\sysw-mcp.exe",
        "Arguments": [],
        "Environment": {}
      },
      {
        "Name": "remote-tools",
        "Transport": "StreamableHttp",
        "Enabled": true,
        "Endpoint": "https://localhost:7099/mcp"
      }
    ],
    "StreamableHttp": { "AllowPlaintextHttp": false }
  }
}
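Since the file is standard ASP.NET Core configuration, an environment-variable override normally uses the same key path with `__` (double underscore) in place of `:`, and array elements are addressed by index. The following Python sketch flattens a small override object into such variable names. The override values and the `to_env_vars` helper are illustrative assumptions, and whether AIHost expects a variable-name prefix is not stated here, so check this against your deployment.

```python
import json

def to_env_vars(node, prefix=""):
    """Flatten nested settings into ASP.NET Core-style environment variable names."""
    if isinstance(node, dict):
        items = node.items()
    elif isinstance(node, list):
        # Array elements are addressed by their zero-based index
        items = ((str(i), v) for i, v in enumerate(node))
    else:
        # Leaf value: keep booleans as lowercase JSON literals
        value = json.dumps(node) if isinstance(node, bool) else str(node)
        return {prefix: value}
    pairs = {}
    for key, value in items:
        child = f"{prefix}__{key}" if prefix else key
        pairs.update(to_env_vars(value, child))
    return pairs

# Hypothetical override: lower the tool-call budget and disable one MCP server
override = json.loads("""
{
  "Aife": { "Llm": { "MaxToolCalls": 25 } },
  "Mcp": { "Servers": [ { "Name": "sysw-mcp", "Enabled": false } ] }
}
""")

for name, value in to_env_vars(override).items():
    print(f"{name}={value}")
```

The same object, saved as appsettings.Local.json, would express the override in file form instead.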

The per-section reference below documents each key, its default, and valid range.

Logging

Key	Default	Description
`Logging:LogLevel:Default`	`Information`	Default MEL level.
`Logging:LogLevel:Microsoft.AspNetCore`	`Warning`	Framework log level.
`Logging:File:Enabled`	`true`	Master switch for the Serilog file sink (Console/Debug stay on regardless). Degrades gracefully (warning to stderr) if the directory can’t be created.
`Logging:File:Directory`	`%LOCALAPPDATA%\SystemWeaver.AIHost\logs`	Log directory. Environment variables (`%LOCALAPPDATA%`, `%USERPROFILE%`, …) are expanded; empty/null falls back to the default.
`Logging:File:RollingInterval`	`Day`	Serilog `RollingInterval`: `Infinite` / `Year` / `Month` / `Day` / `Hour` / `Minute`.
`Logging:File:RetainedFileCountLimit`	`14`	Max rolled files kept before the oldest is deleted (≈ two weeks at daily rolling).

Aife (wire plane)

The AIFE endpoint is https://localhost:<Port>/ with mandatory client-cert mTLS; loopback bind covers IPv4 + IPv6. There is no Host key (loopback is hard-wired) and the named-pipe trust handshake is automatic (no key).

Key	Default	Description
`Aife:Port`	`5001`	HTTPS port. Must match the client's hard-coded port 5001 (see Fixed client connection facts (not configurable in XML)).
`Aife:Llm:MaxOutputTokens`	`32768`	Per-request output-token cap forwarded to `LLMGenerationOptions.MaxTokens`. `0` = leave unset (router applies its 4096 default).
`Aife:Llm:MaxToolCalls`	`50`	Per-interact tool-call budget. `0` disables the tool-call loop entirely.

Persistence (session store)

Key	Default	Description
`Persistence:Provider`	`InMemory`	`InMemory` = RAM-only; chat/agent sessions live for the lifetime of the AIHost process and are lost on restart. This is the shipped default and the only mode the AIExtension currently supports — recovering persisted sessions from the AIExtension is not yet supported, so leave this on `InMemory`.

Mcp (Model Context Protocol servers)

Key	Default	Description
`Mcp:Enabled`	`true`	Master switch for MCP wiring. `false` skips all MCP registration.
`Mcp:Servers`	`[]`	Array of MCP server entries (schema below). Per-server startup failures are logged and skipped. Empty = MCP effectively disabled.
`Mcp:StreamableHttp:AllowPlaintextHttp`	`false`	TLS guard. Strict by default — `StreamableHttp` servers with `http://` endpoints are refused (startup `MCPConfigurationException`) unless this is `true` (Development only). Does not apply to `Stdio` transports.

Mcp:Servers[] entry schema

Field	Description
`Name`	Identifier for the server.
`Transport`	`Stdio` or `StreamableHttp`.
`Enabled`	Per-server toggle.
`Command` / `Arguments` / `Environment`	`Stdio` transport: the process to launch and its args/env.
`Endpoint`	`StreamableHttp` transport: the server URL (must be `https://` unless `AllowPlaintextHttp=true`).

Note: MCP-discovered tools are additionally gated by Chat:Tools:AllowMCPDiscoveredTools (see Chat).

Example: Stdio transport (launch a local MCP server process; no TLS guard applies):

{
  "Mcp": {
    "Enabled": true,
    "Servers": [
      {
        "Name": "sysw-mcp",
        "Transport": "Stdio",
        "Enabled": true,
        "Command": "C:\\non-install\\sysw-mcp-host\\sysw-mcp.exe",
        "Arguments": [ "--mode", "stdio" ],
        "Environment": {
          "SYSW_MCP_LOGLEVEL": "Information"
        }
      }
    ]
  }
}

Field	Stdio meaning
`Command`	Absolute path to the server executable to spawn.
`Arguments`	Array of command-line arguments passed to it.
`Environment`	Object of extra environment variables for the child process.
`Endpoint`	(unused for Stdio)

Example: StreamableHttp transport (connect to an HTTP MCP server; subject to the TLS guard):

{
  "Mcp": {
    "Enabled": true,
    "Servers": [
      {
        "Name": "remote-tools",
        "Transport": "StreamableHttp",
        "Enabled": true,
        "Endpoint": "https://localhost:7099/mcp"
      }
    ],
    "StreamableHttp": {
      "AllowPlaintextHttp": false
    }
  }
}

Field	StreamableHttp meaning
`Endpoint`	Server URL. Must be `https://` in production; an `http://` endpoint is refused at startup (`MCPConfigurationException`) unless `Mcp:StreamableHttp:AllowPlaintextHttp = true` (Development only).
`Command` / `Arguments` / `Environment`	(unused for StreamableHttp)

A Servers[] array may mix both transports. Each entry is wired independently; a single server’s startup failure is logged and skipped (the rest still load).
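To sanity-check a Servers[] array before deploying it, the rules above can be approximated in a few lines of Python. This is a rough pre-flight sketch, not the host's actual validation: the `check_servers` helper, the `plain` entry, and the choice to treat a missing `Enabled` as enabled are all assumptions made for illustration.

```python
from urllib.parse import urlparse

def check_servers(mcp):
    """Return (accepted_names, problems) for a parsed "Mcp" section."""
    allow_plain = mcp.get("StreamableHttp", {}).get("AllowPlaintextHttp", False)
    accepted, problems = [], []
    for server in mcp.get("Servers", []):
        name = server.get("Name", "<unnamed>")
        if not server.get("Enabled", True):  # assumption: missing toggle means enabled
            continue
        transport = server.get("Transport")
        if transport == "Stdio":
            # Stdio entries launch a process, so the TLS guard does not apply
            ok = bool(server.get("Command"))
            reason = "Stdio entry has no Command"
        elif transport == "StreamableHttp":
            scheme = urlparse(server.get("Endpoint", "")).scheme
            # https is always fine; http only when the guard is relaxed
            ok = scheme == "https" or (scheme == "http" and allow_plain)
            reason = f"endpoint scheme {scheme!r} not allowed"
        else:
            ok, reason = False, f"unknown Transport {transport!r}"
        if ok:
            accepted.append(name)
        else:
            problems.append(f"{name}: {reason}")
    return accepted, problems

mcp = {
    "Servers": [
        {"Name": "sysw-mcp", "Transport": "Stdio", "Enabled": True,
         "Command": "C:\\non-install\\sysw-mcp-host\\sysw-mcp.exe"},
        {"Name": "remote-tools", "Transport": "StreamableHttp", "Enabled": True,
         "Endpoint": "https://localhost:7099/mcp"},
        {"Name": "plain", "Transport": "StreamableHttp", "Enabled": True,
         "Endpoint": "http://localhost:7098/mcp"},  # hypothetical plaintext entry
    ],
    "StreamableHttp": {"AllowPlaintextHttp": False},
}
print(check_servers(mcp))
```

With the strict default, the two shipped-style entries pass and only the hypothetical `http://` entry is reported.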

Chat

Key	Default	Description
`Chat:Tools:AllowMCPDiscoveredTools`	`false`	Strict default: MCP-discovered tools are hidden from agents (closes the prompt-injection-via-tool-description vector). Set `true` (Development) to surface MCP tools to the tool-call loop.
`Chat:Logging:DisableLogRedaction`	`false`	Leave `false` in production: a redacting logger replaces sensitive values (bearer tokens, API keys, `[Sensitive]` properties) with `*REDACTED*`. Escape hatch for hosts that already redact at the sink.

Llm (router-level guards)

Per-provider model settings do not live here (see Configuring the Per-Provider LLM Plugin Settings). These are router-level guards.

Llm:Admission (per-tenant rate-limit + concurrency; AIHost is single-tenant today)

Key	Default	Description
`PerTenantRequestsPerSecond`	`100`	Token-bucket refill rate per tenant.
`PerTenantBurst`	`200`	Token-bucket burst capacity.
`MaxConcurrentInFlight`	`32`	Concurrency cap at the `LLMService` boundary.
`DefaultTenantKey`	`default`	Tenant key used until AIFE identity context is wired.
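To get a feel for how the refill rate and burst capacity interact, here is a generic token-bucket sketch in Python using the default values. It illustrates the general technique only; details such as whether the bucket starts full and how time is measured are assumptions, not a description of the AIHost implementation.

```python
class TokenBucket:
    """Generic token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate_per_second, burst):
        self.rate = rate_per_second
        self.capacity = burst
        self.tokens = float(burst)  # assumption: the bucket starts full
        self.last = 0.0

    def try_acquire(self, now):
        # Refill for the time elapsed since the previous call, capped at capacity
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# Defaults: PerTenantRequestsPerSecond = 100, PerTenantBurst = 200
bucket = TokenBucket(rate_per_second=100, burst=200)

# 250 requests arriving at the same instant: only the burst capacity is admitted
admitted_now = sum(bucket.try_acquire(now=0.0) for _ in range(250))
# Half a second later the bucket has refilled by 0.5 s * 100 tokens/s
admitted_later = sum(bucket.try_acquire(now=0.5) for _ in range(250))
print(admitted_now, admitted_later)  # 200 50
```

In this model the burst value bounds a sudden spike, while the per-second rate bounds sustained throughput.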

Llm:Limits (input-payload guard)

Key	Default	Description
`MaxInputBytes`	`1048576` (1 MiB)	Hard input-payload ceiling. Raise for bulk-summarization workloads.
`MaxInputTokensEstimate`	`250000`	Heuristic `ceil(bytes / 4)` ceiling; size against the smallest production model’s window.
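The two limits interact through the `ceil(bytes / 4)` heuristic. The Python sketch below works through the arithmetic with the default values, assuming both checks are applied to the same payload size and that a payload is rejected only when it strictly exceeds a limit (the exact comparison is not stated above).

```python
import math

MAX_INPUT_BYTES = 1_048_576            # Llm:Limits:MaxInputBytes default
MAX_INPUT_TOKENS_ESTIMATE = 250_000    # Llm:Limits:MaxInputTokensEstimate default

def estimate_tokens(byte_count):
    """Heuristic token estimate: ceil(bytes / 4)."""
    return math.ceil(byte_count / 4)

def check_input(byte_count):
    # Assumption: a request is rejected when it strictly exceeds either ceiling
    if byte_count > MAX_INPUT_BYTES:
        return "rejected: MaxInputBytes"
    if estimate_tokens(byte_count) > MAX_INPUT_TOKENS_ESTIMATE:
        return "rejected: MaxInputTokensEstimate"
    return "accepted"

print(estimate_tokens(MAX_INPUT_BYTES))  # 262144
print(check_input(1_000_000))            # accepted
print(check_input(1_000_001))            # rejected: MaxInputTokensEstimate
print(check_input(1_048_577))            # rejected: MaxInputBytes
```

Under these assumptions the token estimate is the tighter of the two defaults: a 1 MiB payload estimates to 262,144 tokens, so payloads between 1,000,001 and 1,048,576 bytes pass the byte check but fail the token check. If you raise one limit for bulk workloads, consider the other as well.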

Llm:CircuitBreaker

Process-local breaker keyed by provider+model:

Key	Default	Description
`FailureThreshold`	`5`	Consecutive failures before opening. Raise for batch workloads.
`ResetTimeout`	`00:00:30`	`TimeSpan` before a half-open retry. Shorten for local providers.
`Enabled`	`true`	`false` disables the breaker entirely.
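The three settings map onto the usual closed / open / half-open breaker cycle. The following Python sketch shows that general pattern with the default values; it is a simplified illustration (one breaker instance, caller-supplied timestamps in seconds), not the AIHost implementation, and the class and method names are hypothetical.

```python
class CircuitBreaker:
    """Minimal consecutive-failure breaker with a timed half-open retry."""

    def __init__(self, failure_threshold=5, reset_timeout=30.0, enabled=True):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout  # seconds, i.e. ResetTimeout 00:00:30
        self.enabled = enabled
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def allow(self, now):
        if not self.enabled or self.opened_at is None:
            return True
        # Open: block calls until the reset timeout elapses, then let a retry through
        return now - self.opened_at >= self.reset_timeout

    def record_success(self):
        # Any success closes the breaker and clears the failure streak
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now  # open, or re-open after a failed retry

# One breaker per provider+model pair; the key below is a hypothetical example
breakers = {("example-provider", "example-model"): CircuitBreaker()}
breaker = breakers[("example-provider", "example-model")]

for t in range(5):
    breaker.record_failure(now=float(t))  # fifth consecutive failure opens it
print(breaker.allow(now=10.0))  # False: still inside the 30 s reset window
print(breaker.allow(now=34.0))  # True: half-open retry is permitted
```

A shorter `ResetTimeout` makes the retry happen sooner, which suits local providers that recover quickly; a higher `FailureThreshold` tolerates longer failure streaks before opening.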

Llm:Routing

Key	Default	Description
`ReservedSafetyTokens`	`512`	Extra headroom in the model-selection gate: `MaxContextTokens >= EstimatedInputTokens + MaxTokens + ReservedSafetyTokens`. `0` disables the reserve.
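The gate is plain integer arithmetic, so it is easy to check by hand which requests a given model can accept. The Python sketch below reads `MaxContextTokens` in the formula as the candidate model's context window (an assumption) and uses a hypothetical 128,000-token model together with the shipped `Aife:Llm:MaxOutputTokens` of 32768.

```python
def fits(model_max_context, estimated_input_tokens, max_tokens, reserved_safety_tokens=512):
    """Model-selection gate: window must cover input + output cap + safety reserve."""
    return model_max_context >= estimated_input_tokens + max_tokens + reserved_safety_tokens

MODEL_WINDOW = 128_000   # hypothetical model window, for illustration only
MAX_TOKENS = 32_768      # Aife:Llm:MaxOutputTokens default

# Largest estimated input that still passes the gate for this model
largest_input = MODEL_WINDOW - MAX_TOKENS - 512
print(largest_input)                                   # 94720
print(fits(MODEL_WINDOW, largest_input, MAX_TOKENS))      # True
print(fits(MODEL_WINDOW, largest_input + 1, MAX_TOKENS))  # False
```

Setting `ReservedSafetyTokens` to `0` would raise the largest admissible input in this example by exactly 512 tokens.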

Llm:AcceptedCredentialSources

Key	Default	Description
`Llm:AcceptedCredentialSources`	`Runtime`	Where LLM API keys may come from. `Runtime` = only the AIFE `InitializeBackend` wire (i.e. the AIExtension `<LLMApiKeys>` XML; see Configuring the View). Keys in plugin `settings.json` or environment variables are ignored, and a provider with no injected key fails rather than silently using a disk/env key. (`[Flags]` enum; the library default is `All`.)

ContextCompression (proactive history compression)

Trigger: effectiveLimit = (int)(activeWindow * CompressionTriggerRatio) - ReservedResponseTokens, sized against the active model window (ILLMService.GetActiveModelCapabilities()).

Key	Default	Description
`Enabled`	`true`	Master switch; `false` bypasses compression entirely.
`UseActiveModelWindow`	`true`	Size the trigger against the active model window; `false` (or unknown model) falls back to `MaxContextTokens`.
`CompressionTriggerRatio`	`0.75`	Fraction of the window at which compression fires. Lower = earlier/more aggressive.
`ReservedResponseTokens`	`4000`	Headroom reserved for the next response, subtracted from the trigger.
`MaxContextTokens`	`200000`	Absolute fallback window when `UseActiveModelWindow=false` / model unknown.
`RecentMessageCount`	`10`	Recent messages preserved verbatim; older ones folded into the summary.
`MaxMessageCount`	`50`	Also fires when raw message count exceeds this, regardless of tokens.
`TargetCompressionRatio`	`0.3`	Summary length target as a fraction of the effective limit.
`MaxSummaryTokens`	`1000`	Hard cap on the summary’s output tokens.
`CompressToolCalls`	`true`	`true`: tool-call turns are eligible for compression along with normal messages. `false`: all tool-call / tool-result exchanges are preserved verbatim (never summarized) — costlier in tokens but safer for tool-heavy sessions.

How the parameters interact

Reading the knobs against the graph:

- Enabled is the top gate: false short-circuits everything.
- UseActiveModelWindow + MaxContextTokens decide which window feeds the limit.
- CompressionTriggerRatio + ReservedResponseTokens set when compression fires (lower ratio / higher reserve → fires earlier).
- MaxMessageCount is a parallel count-based trigger (fires even if tokens stay under the limit).
- CompressToolCalls decides whether tool-call turns can be summarized (false = tool exchanges always kept verbatim, never compressed).
- RecentMessageCount sets how much is preserved verbatim; TargetCompressionRatio + MaxSummaryTokens size the summary that replaces the rest.
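The trigger formula and the parallel message-count trigger can be worked through numerically. The Python sketch below uses the default values; the 128,000-token active window is a hypothetical example, and the strict "greater than" comparison against the token limit is an assumption (the text above only states that the count trigger fires when the message count exceeds `MaxMessageCount`).

```python
def effective_limit(active_window, trigger_ratio=0.75, reserved_response_tokens=4000,
                    use_active_model_window=True, max_context_tokens=200_000):
    """effectiveLimit = (int)(window * CompressionTriggerRatio) - ReservedResponseTokens."""
    # Fall back to MaxContextTokens when the active window is disabled or unknown
    if use_active_model_window and active_window:
        window = active_window
    else:
        window = max_context_tokens
    return int(window * trigger_ratio) - reserved_response_tokens

def should_compress(history_tokens, message_count, limit,
                    max_message_count=50, enabled=True):
    if not enabled:
        return False  # master switch bypasses compression entirely
    # Token trigger (assumed strict) OR count trigger, whichever fires first
    return history_tokens > limit or message_count > max_message_count

fallback_limit = effective_limit(active_window=None)   # unknown model
model_limit = effective_limit(active_window=128_000)   # hypothetical model window
print(fallback_limit, model_limit)  # 146000 92000

print(should_compress(95_000, 20, model_limit))                 # True: token trigger
print(should_compress(10_000, 51, model_limit))                 # True: count trigger
print(should_compress(10_000, 20, model_limit))                 # False
print(should_compress(95_000, 51, model_limit, enabled=False))  # False
```

Lowering `CompressionTriggerRatio` or raising `ReservedResponseTokens` shrinks the computed limit, which is why either change makes compression fire earlier.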

LLMRequestSizing (input-estimate buffering)

bufferedTokens = (int)(rawTokens * (1 + InputTokenBufferRatio)) + MinBufferTokens, clamped >= rawTokens. The buffered estimate becomes LLMGenerationOptions.EstimatedInputTokens used by the router’s window gate.

Key	Default	Description
`InputTokenBufferRatio`	`0.15`	Fractional buffer over the raw char-derived estimate. Raise (e.g. `0.25`) if late-routing failures appear.
`MinBufferTokens`	`256`	Token floor added to every buffered estimate.
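As a quick check of the buffering formula, the Python sketch below evaluates it for a few raw estimates. It mirrors the expression given above; the helper name is hypothetical, and because the cast truncates a floating-point product, results for some ratios can differ by one token from a hand calculation.

```python
def buffered_tokens(raw_tokens, input_token_buffer_ratio=0.15, min_buffer_tokens=256):
    """bufferedTokens = (int)(raw * (1 + ratio)) + MinBufferTokens, clamped >= raw."""
    buffered = int(raw_tokens * (1 + input_token_buffer_ratio)) + min_buffer_tokens
    return max(buffered, raw_tokens)

# With the suggested raised ratio of 0.25 the product is exact in floating point
print(buffered_tokens(1000, input_token_buffer_ratio=0.25))  # 1506

# An empty prompt still carries the fixed token floor
print(buffered_tokens(0))  # 256

# Default ratio on a 10,000-token raw estimate (about 11,756 buffered tokens)
print(buffered_tokens(10_000))
```

For small prompts `MinBufferTokens` dominates the buffer, while for large prompts `InputTokenBufferRatio` does, so raising the ratio mainly affects long conversations.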

What's Next?

Refer to Configuration Manual for WeaverAI to determine if any additional configuration remains to be done.
