AIHost is configured in the provided AIHost/appsettings.json, a standard ASP.NET Core JSON configuration file consumed by SystemWeaver.AIHost.exe. You can override settings per host in appsettings.Local.json (gitignored) or with environment variables. Keys prefixed with // in the shipped files are documentation comments, not settings. This article describes each setting, its default, and how to change it.
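As a rough illustration of the environment-variable override path: the standard ASP.NET Core environment-variable configuration provider maps a colon-delimited key to a variable name by replacing each : with a double underscore, and addresses array elements by index. The sketch below assumes AIHost uses that provider without a custom prefix, which this article does not confirm; to_env_var is a hypothetical helper and not part of AIHost.

```python
def to_env_var(key: str) -> str:
    """Translate a colon-delimited configuration key into an environment variable name."""
    return key.replace(":", "__")


# Illustrative overrides only; the values are examples, not recommendations.
overrides = {
    "Logging:LogLevel:Default": "Debug",
    "Logging:File:RetainedFileCountLimit": "30",
    "Mcp:Servers:0:Enabled": "false",  # array elements are addressed by index
}

for key, value in overrides.items():
    print(f"{to_env_var(key)}={value}")
```

If the host does register a prefix for its environment variables, the names above would need that prefix as well.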


Prerequisites

  • The SWExtension.AIExtension folder is located in your Client's swExplorerExtensions directory



The values shown in the appsettings.json example below are the defaults shipped with the package.


Example 

{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    },
    "File": {
      "Enabled": true,
      "Directory": "%LOCALAPPDATA%\\SystemWeaver.AIHost\\logs",
      "RollingInterval": "Day",
      "RetainedFileCountLimit": 14
    }
  },
  "Aife": {
    "Port": 5001,
    "Llm": {
      "MaxOutputTokens": 32768,
      "MaxToolCalls": 50
    }
  },
  "Persistence": {
    "Provider": "InMemory"
  },
  "Mcp": {
    "Enabled": true,
    "Servers": [],
    "StreamableHttp": {
      "AllowPlaintextHttp": false
    }
  },
  "Chat": {
    "Tools": {
      "AllowMCPDiscoveredTools": false
    },
    "Logging": {
      "DisableLogRedaction": false
    }
  },
  "Llm": {
    "Admission": {
      "PerTenantRequestsPerSecond": 100,
      "PerTenantBurst": 200,
      "MaxConcurrentInFlight": 32,
      "DefaultTenantKey": "default"
    },
    "Limits": {
      "MaxInputBytes": 1048576,
      "MaxInputTokensEstimate": 250000
    },
    "CircuitBreaker": {
      "FailureThreshold": 5,
      "ResetTimeout": "00:00:30",
      "Enabled": true
    },
    "Routing": {
      "ReservedSafetyTokens": 512
    },
    "AcceptedCredentialSources": "Runtime"
  },
  "ContextCompression": {
    "Enabled": true,
    "UseActiveModelWindow": true,
    "CompressionTriggerRatio": 0.75,
    "ReservedResponseTokens": 4000,
    "MaxContextTokens": 200000,
    "RecentMessageCount": 10,
    "MaxMessageCount": 50,
    "TargetCompressionRatio": 0.3,
    "MaxSummaryTokens": 1000,
    "CompressToolCalls": true
  },
  "LLMRequestSizing": {
    "InputTokenBufferRatio": 0.15,
    "MinBufferTokens": 256
  }
}

Example Mcp:Servers[] entries (none appear above because the shipped array is empty):

{
  "Mcp": {
    "Enabled": true,
    "Servers": [
      {
        "Name": "sysw-mcp",
        "Transport": "Stdio",
        "Enabled": true,
        "Command": "C:\\non-install\\sysw-mcp-host\\sysw-mcp.exe",
        "Arguments": [],
        "Environment": {}
      },
      {
        "Name": "remote-tools",
        "Transport": "StreamableHttp",
        "Enabled": true,
        "Endpoint": "https://localhost:7099/mcp"
      }
    ],
    "StreamableHttp": { "AllowPlaintextHttp": false }
  }
}

The per-section reference below documents each key, its default, and valid range.


Logging

Key | Default | Description
Logging:LogLevel:Default | Information | Default MEL level.
Logging:LogLevel:Microsoft.AspNetCore | Warning | Framework log level.
Logging:File:Enabled | true | Master switch for the Serilog file sink (Console/Debug stay on regardless). Degrades gracefully (warning to stderr) if the directory can’t be created.
Logging:File:Directory | %LOCALAPPDATA%\SystemWeaver.AIHost\logs | Log directory. Environment variables (%LOCALAPPDATA%, %USERPROFILE%, …) are expanded; empty/null falls back to the default.
Logging:File:RollingInterval | Day | Serilog RollingInterval: Infinite / Year / Month / Day / Hour / Minute.
Logging:File:RetainedFileCountLimit | 14 | Max rolled files kept before the oldest is deleted (≈ two weeks at daily rolling).

Aife (wire plane)

The AIFE endpoint is https://localhost:<Port>/ with mandatory client-certificate mTLS; the loopback bind covers both IPv4 and IPv6. There is no Host key (loopback is hard-wired), and the named-pipe trust handshake is automatic, so it has no key either.

Key | Default | Description
Aife:Port | 5001 | HTTPS port. Must match the client’s hard-coded port 5001 (see Fixed client connection facts; not configurable in XML).
Aife:Llm:MaxOutputTokens | 32768 | Per-request output-token cap forwarded to LLMGenerationOptions.MaxTokens. 0 = leave unset (router applies its 4096 default).
Aife:Llm:MaxToolCalls | 50 | Per-interact tool-call budget. 0 disables the tool-call loop entirely.

Persistence (session store)

Key | Default | Description
Persistence:Provider | InMemory | InMemory = RAM-only; chat/agent sessions live for the lifetime of the AIHost process and are lost on restart. This is the shipped default and the only mode the AIExtension currently supports. Recovering persisted sessions from the AIExtension is not yet supported, so leave this on InMemory.

Mcp (Model Context Protocol servers)

Key | Default | Description
Mcp:Enabled | true | Master switch for MCP wiring. false skips all MCP registration.
Mcp:Servers | [] | Array of MCP server entries (schema below). Per-server startup failures are logged and skipped. Empty = MCP effectively disabled.
Mcp:StreamableHttp:AllowPlaintextHttp | false | TLS guard. Strict by default: StreamableHttp servers with http:// endpoints are refused (startup MCPConfigurationException) unless this is true (Development only). Does not apply to Stdio transports.

Mcp:Servers[] entry schema

Field | Description
Name | Identifier for the server.
Transport | Stdio or StreamableHttp.
Enabled | Per-server toggle.
Command / Arguments / Environment | Stdio transport: the process to launch and its args/env.
Endpoint | StreamableHttp transport: the server URL (must be https:// unless AllowPlaintextHttp=true).

Note: MCP-discovered tools are additionally gated by Chat:Tools:AllowMCPDiscoveredTools (see the Chat section).

Example: Stdio transport (launch a local MCP server process; no TLS guard applies):

{
  "Mcp": {
    "Enabled": true,
    "Servers": [
      {
        "Name": "sysw-mcp",
        "Transport": "Stdio",
        "Enabled": true,
        "Command": "C:\\non-install\\sysw-mcp-host\\sysw-mcp.exe",
        "Arguments": [ "--mode", "stdio" ],
        "Environment": {
          "SYSW_MCP_LOGLEVEL": "Information"
        }
      }
    ]
  }
}
Field | Stdio meaning
Command | Absolute path to the server executable to spawn.
Arguments | Array of command-line arguments passed to it.
Environment | Object of extra environment variables for the child process.
Endpoint | (unused for Stdio)

Example: StreamableHttp transport (connect to an HTTP MCP server; subject to the TLS guard):

{
  "Mcp": {
    "Enabled": true,
    "Servers": [
      {
        "Name": "remote-tools",
        "Transport": "StreamableHttp",
        "Enabled": true,
        "Endpoint": "https://localhost:7099/mcp"
      }
    ],
    "StreamableHttp": {
      "AllowPlaintextHttp": false
    }
  }
}
Field | StreamableHttp meaning
Endpoint | Server URL. Must be https:// in production; an http:// endpoint is refused at startup (MCPConfigurationException) unless Mcp:StreamableHttp:AllowPlaintextHttp = true (Development only).
Command / Arguments / Environment | (unused for StreamableHttp)

A Servers[] array may mix both transports. Each entry is wired independently; a single server’s startup failure is logged and skipped (the rest still load).
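The TLS guard described above can be approximated as a pre-flight check on the Mcp section before starting the host. This is only a sketch of the documented rule, not the host's implementation: find_refused_servers is a hypothetical helper, it checks every StreamableHttp entry regardless of its Enabled flag (an assumption), and it treats any scheme other than https (or http with the override) as refused.

```python
from urllib.parse import urlsplit


def find_refused_servers(mcp: dict) -> list:
    """Return the names of StreamableHttp entries whose endpoint the TLS guard would refuse."""
    allow_plain = mcp.get("StreamableHttp", {}).get("AllowPlaintextHttp", False)
    refused = []
    for server in mcp.get("Servers", []):
        if server.get("Transport") != "StreamableHttp":
            continue  # the guard does not apply to Stdio transports
        scheme = urlsplit(server.get("Endpoint", "")).scheme.lower()
        if scheme == "https" or (scheme == "http" and allow_plain):
            continue
        refused.append(server.get("Name", "<unnamed>"))
    return refused


mcp_section = {
    "Enabled": True,
    "Servers": [
        {"Name": "sysw-mcp", "Transport": "Stdio", "Enabled": True,
         "Command": "C:\\non-install\\sysw-mcp-host\\sysw-mcp.exe"},
        {"Name": "remote-tools", "Transport": "StreamableHttp", "Enabled": True,
         "Endpoint": "http://localhost:7099/mcp"},  # plaintext on purpose
    ],
    "StreamableHttp": {"AllowPlaintextHttp": False},
}
print(find_refused_servers(mcp_section))
```

With AllowPlaintextHttp left at false, the plaintext remote-tools entry is reported; switching the endpoint to https:// or enabling the override (Development only) clears it.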


Chat

Key | Default | Description
Chat:Tools:AllowMCPDiscoveredTools | false | Strict default: MCP-discovered tools are hidden from agents (closes the prompt-injection-via-tool-description vector). Set true (Development) to surface MCP tools to the tool-call loop.
Chat:Logging:DisableLogRedaction | false | Leave false in production: a redacting logger replaces sensitive values (bearer tokens, API keys, [Sensitive] properties) with ***REDACTED***. Escape hatch for hosts that already redact at the sink.

Llm (router-level guards)

Per-provider model settings do not live here (see Configuring the Per-Provider LLM Plugin Settings). These are router-level guards.

Llm:Admission (per-tenant rate-limit + concurrency; AIHost is single-tenant today)

Key | Default | Description
PerTenantRequestsPerSecond | 100 | Token-bucket refill rate per tenant.
PerTenantBurst | 200 | Token-bucket burst capacity.
MaxConcurrentInFlight | 32 | Concurrency cap at the LLMService boundary.
DefaultTenantKey | default | Tenant key used until AIFE identity context is wired.
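To see how the refill rate and burst capacity relate, here is a generic token-bucket sketch using the default values. It illustrates the general technique only; the host's actual limiter is not described in this article, and the TokenBucket class and its explicit now timestamps are assumptions made so the example is deterministic.

```python
class TokenBucket:
    """Generic token bucket: refills at `rate_per_second`, holds at most `burst` tokens."""

    def __init__(self, rate_per_second: float, burst: int):
        self.rate = rate_per_second
        self.capacity = burst
        self.tokens = float(burst)  # start full
        self.last = 0.0             # timestamp of the previous call, in seconds

    def try_acquire(self, now: float) -> bool:
        # Refill for the elapsed time, never beyond the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_second=100, burst=200)  # PerTenantRequestsPerSecond / PerTenantBurst
admitted_at_start = sum(bucket.try_acquire(now=0.0) for _ in range(250))
admitted_later = sum(bucket.try_acquire(now=0.5) for _ in range(250))
print(admitted_at_start, admitted_later)
```

In this model a quiet tenant can send up to PerTenantBurst requests at once, after which sustained throughput is bounded by PerTenantRequestsPerSecond.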

Llm:Limits (input-payload guard)

Key | Default | Description
MaxInputBytes | 1048576 (1 MiB) | Hard input-payload ceiling. Raise for bulk-summarization workloads.
MaxInputTokensEstimate | 250000 | Heuristic ceil(bytes / 4) ceiling; size against the smallest production model’s window.
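The two limits can be compared with a short worked example. The sketch assumes both checks are applied to the same payload byte count and that the comparisons are inclusive; neither detail is confirmed here, and within_limits is a hypothetical helper.

```python
import math


def estimate_tokens(byte_count: int) -> int:
    """Heuristic token estimate from the table above: ceil(bytes / 4)."""
    return math.ceil(byte_count / 4)


def within_limits(byte_count: int,
                  max_input_bytes: int = 1_048_576,
                  max_input_tokens_estimate: int = 250_000) -> bool:
    """True if a payload passes both the byte ceiling and the token-estimate ceiling."""
    return (byte_count <= max_input_bytes
            and estimate_tokens(byte_count) <= max_input_tokens_estimate)


for size in (500_000, 1_000_000, 1_000_001, 1_048_576):
    print(size, estimate_tokens(size), within_limits(size))
```

Under these assumptions the default token-estimate ceiling is reached at 1,000,000 bytes, slightly before the 1 MiB byte ceiling, so raising MaxInputBytes alone would not admit larger payloads.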

Llm:CircuitBreaker 

Process-local breaker keyed by provider+model:

Key | Default | Description
FailureThreshold | 5 | Consecutive failures before opening. Raise for batch workloads.
ResetTimeout | 00:00:30 | TimeSpan before a half-open retry. Shorten for local providers.
Enabled | true | false disables the breaker entirely.
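The breaker's behavior can be pictured with a minimal consecutive-failure circuit breaker. This is a generic sketch under assumptions: the class, its method names, and the half-open handling (any call is allowed once the timeout has elapsed) are illustrative and not taken from the host; the provider and model names are placeholders.

```python
class CircuitBreaker:
    """Opens after N consecutive failures; allows a retry once the reset timeout elapses."""

    def __init__(self, failure_threshold: int = 5, reset_timeout_s: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def allow(self, now: float) -> bool:
        if self.opened_at is None:
            return True
        # Half-open: permit a retry once ResetTimeout has passed.
        return now - self.opened_at >= self.reset_timeout_s

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now: float) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now  # open, or re-open after a failed retry


# One breaker per provider+model key, as in the description above.
breakers = {("example-provider", "example-model"): CircuitBreaker()}
breaker = breakers[("example-provider", "example-model")]
for second in range(5):
    breaker.record_failure(now=float(second))
print(breaker.allow(now=10.0), breaker.allow(now=34.0))
```

Because the state is process-local, restarting the AIHost process would reset every breaker in a model like this one.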

Llm:Routing

Key | Default | Description
ReservedSafetyTokens | 512 | Extra headroom in the model-selection gate: MaxContextTokens >= EstimatedInputTokens + MaxTokens + ReservedSafetyTokens. 0 disables the reserve.
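A worked instance of the gate inequality, using the shipped MaxOutputTokens (32768) as MaxTokens. The window sizes and input estimates are illustrative values, and fits_window is a hypothetical name for the comparison.

```python
def fits_window(max_context_tokens: int,
                estimated_input_tokens: int,
                max_tokens: int,
                reserved_safety_tokens: int = 512) -> bool:
    """Model-selection gate: the window must hold input, output cap, and safety reserve."""
    return max_context_tokens >= estimated_input_tokens + max_tokens + reserved_safety_tokens


# 150000 + 32768 + 512 = 183280, which fits a 200000-token window.
print(fits_window(200_000, 150_000, 32_768))
# 167000 + 32768 + 512 = 200280, which does not.
print(fits_window(200_000, 167_000, 32_768))
# The same 150000-token input cannot be routed to a 128000-token window.
print(fits_window(128_000, 150_000, 32_768))
```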

Llm:AcceptedCredentialSources

Key | Default | Description
Llm:AcceptedCredentialSources | Runtime | Where LLM API keys may come from. Runtime = only the AIFE InitializeBackend wire (i.e. the AIExtension <LLMApiKeys> XML; see Configuring the View). Keys in plugin settings.json or environment variables are ignored, and a provider with no injected key fails rather than silently using a disk/env key. ([Flags] enum; the library default is All.)

ContextCompression (proactive history compression)

Trigger: effectiveLimit = (int)(activeWindow * CompressionTriggerRatio) - ReservedResponseTokens, sized against the active model window (ILLMService.GetActiveModelCapabilities()).

Key | Default | Description
Enabled | true | Master switch; false bypasses compression entirely.
UseActiveModelWindow | true | Size the trigger against the active model window; false (or unknown model) falls back to MaxContextTokens.
CompressionTriggerRatio | 0.75 | Fraction of the window at which compression fires. Lower = earlier/more aggressive.
ReservedResponseTokens | 4000 | Headroom reserved for the next response, subtracted from the trigger.
MaxContextTokens | 200000 | Absolute fallback window when UseActiveModelWindow=false / model unknown.
RecentMessageCount | 10 | Recent messages preserved verbatim; older ones folded into the summary.
MaxMessageCount | 50 | Also fires when raw message count exceeds this, regardless of tokens.
TargetCompressionRatio | 0.3 | Summary length target as a fraction of the effective limit.
MaxSummaryTokens | 1000 | Hard cap on the summary’s output tokens.
CompressToolCalls | true | true: tool-call turns are eligible for compression along with normal messages. false: all tool-call / tool-result exchanges are preserved verbatim (never summarized); costlier in tokens but safer for tool-heavy sessions.

How the parameters interact


Reading the knobs against the graph:

  • Enabled is the top gate — false short-circuits everything.
  • UseActiveModelWindow + MaxContextTokens decide which window feeds the limit.
  • CompressionTriggerRatio + ReservedResponseTokens set when compression fires (lower ratio / higher reserve → fires earlier).
  • MaxMessageCount is a parallel count-based trigger (fires even if tokens stay under the limit).
  • CompressToolCalls decides whether tool-call turns can be summarized (false = tool exchanges always kept verbatim, never compressed).
  • RecentMessageCount sets how much is preserved verbatim; TargetCompressionRatio + MaxSummaryTokens size the summary that replaces the rest.
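The trigger formula and the two firing conditions can be worked through with the defaults. This sketch follows the documented formula, but the strict greater-than comparison on tokens and the function names are assumptions; the 128000-token window is an illustrative model size.

```python
def effective_limit(active_window,
                    use_active_model_window: bool = True,
                    compression_trigger_ratio: float = 0.75,
                    reserved_response_tokens: int = 4000,
                    max_context_tokens: int = 200_000) -> int:
    """effectiveLimit = (int)(window * CompressionTriggerRatio) - ReservedResponseTokens."""
    if use_active_model_window and active_window is not None:
        window = active_window
    else:
        window = max_context_tokens  # fallback when the model window is unknown or disabled
    return int(window * compression_trigger_ratio) - reserved_response_tokens


def should_compress(token_count: int, message_count: int, limit: int,
                    enabled: bool = True, max_message_count: int = 50) -> bool:
    """Fires on the token limit or on the parallel message-count trigger."""
    if not enabled:
        return False
    return token_count > limit or message_count > max_message_count


limit = effective_limit(active_window=128_000)  # 96000 - 4000
print(limit)
print(should_compress(token_count=95_000, message_count=12, limit=limit))
print(should_compress(token_count=20_000, message_count=51, limit=limit))
```

With an unknown model (active_window=None) the same call falls back to MaxContextTokens and yields 146000 under the defaults.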

LLMRequestSizing (input-estimate buffering)

bufferedTokens = (int)(rawTokens * (1 + InputTokenBufferRatio)) + MinBufferTokens, clamped >= rawTokens. The buffered estimate becomes LLMGenerationOptions.EstimatedInputTokens used by the router’s window gate.

Key | Default | Description
InputTokenBufferRatio | 0.15 | Fractional buffer over the raw char-derived estimate. Raise (e.g. 0.25) if late-routing failures appear.
MinBufferTokens | 256 | Token floor added to every buffered estimate.
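The buffering formula above, evaluated for a sample raw estimate. The function name is hypothetical and the raw value is arbitrary; the arithmetic follows the formula as written, including truncation toward zero and the clamp to rawTokens.

```python
def buffered_tokens(raw_tokens: int,
                    input_token_buffer_ratio: float = 0.15,
                    min_buffer_tokens: int = 256) -> int:
    """bufferedTokens = (int)(rawTokens * (1 + ratio)) + MinBufferTokens, clamped >= rawTokens."""
    buffered = int(raw_tokens * (1 + input_token_buffer_ratio)) + min_buffer_tokens
    return max(buffered, raw_tokens)


# 12345 * 1.15 = 14196.75, truncated to 14196, plus 256 = 14452.
print(buffered_tokens(12_345))
# Raising the ratio to 0.25: 12345 * 1.25 = 15431.25, truncated to 15431, plus 256 = 15687.
print(buffered_tokens(12_345, input_token_buffer_ratio=0.25))
```

Because MinBufferTokens is added after the multiplication, small requests carry a proportionally larger buffer than large ones.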



What's Next? 

Refer to the Configuration Manual for WeaverAI to determine whether any additional configuration remains.