Architecture
A high-level tour of how PowerAim turns a screen frame into a mouse delta.
The pipeline
flowchart LR
A[Game on screen]
C[Screen Capture<br/>DXGI / GDI]
B[YOLOv8 ONNX<br/>DirectML / CUDA]
F[Prediction Filter<br/>Multi-Class + Confidence]
S[Sticky-Aim Selector]
T[Trigger System]
P[Prediction<br/>Kalman / Velocity]
M[Mouse / Gamepad output]
A --> C --> B --> F --> S --> P --> M
S --> T --> M
Each block is an independent service with a contract:
- ICapture: LastCapture: Bitmap, CaptureArea: Rectangle
- IPredictionLogic: runs inference, exposes ModelClasses, raises events with detections
- IAction: Active: bool, Execute(...). Implementations include AimingAction, AntiRecoilAction, RecoilPatternPlaybackAction, ImageBasedAntiRecoilAction, AutoPlayLearningAction
The composition root is AIManager, which:
- Owns one ICapture instance (DXGI or GDI based on machine probing)
- Owns one IPredictionLogic instance built around OnnxModelSessionFactory
- Hosts a list of IActions and ticks them every frame
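The contracts and the per-frame tick can be pictured with a small sketch. It is written in Python for brevity and is not PowerAim's C# code: Manager, RecordingAction, and the callable capture and predictor are hypothetical stand-ins, and skipping inactive actions is an assumption based on the Active flag.

```python
from typing import List, Protocol


class Action(Protocol):
    """Stand-in for IAction: an Active flag plus Execute(...)."""
    active: bool

    def execute(self, detections: list) -> None: ...


class Manager:
    """Stand-in for AIManager: owns capture and prediction, ticks actions."""

    def __init__(self, capture, predictor, actions: List[Action]):
        self.capture = capture        # plays the role of ICapture
        self.predictor = predictor    # plays the role of IPredictionLogic
        self.actions = actions

    def tick(self) -> None:
        frame = self.capture()                # grab one frame
        detections = self.predictor(frame)    # run inference on it
        for action in self.actions:
            if action.active:                 # assumption: inactive actions are skipped
                action.execute(detections)


class RecordingAction:
    """Toy action that records every detection list it receives."""

    def __init__(self, active: bool):
        self.active = active
        self.seen = []

    def execute(self, detections: list) -> None:
        self.seen.append(detections)
```

Because each action only sees the detection list, new behaviors can be added to the list without touching capture or inference.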
Capture
ScreenCapture is a façade that delegates to either:
- DxgiScreenCapture: Vortice.Direct3D11 + DXGI Desktop Duplication. ~6× faster than GDI. Used by default when probing succeeds.
- GdiScreenCapture: BitBlt fallback for systems where DXGI is unavailable or restricted (older Windows, headless, RDP).
Both expose the same ICapture API. The choice is made once per ScreenCapture construction.
The capture source is configurable in the title bar: entire monitor or a specific window. Window-mode capture uses ProcessCapture, which calls PrintWindow on the target window, with DXGI fallback to BitBlt when the target is on a different DPI / GPU adapter.
Inference
OnnxModelSessionFactory builds the ONNX Runtime session with an execution-provider fallback chain:
- DirectML (default build) or CUDA (_cuda build)
- CPU fallback if the GPU path errors
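A fallback chain of this kind can be sketched as "try each provider in order, keep the first that initializes". The sketch below is Python, not the factory's actual code; build_with_fallback and failing_gpu are hypothetical names, and the factories stand in for real ONNX Runtime session construction.

```python
from typing import Callable, List, Tuple


def build_with_fallback(candidates: List[Tuple[str, Callable[[], object]]]):
    """Try each (name, factory) in order and return the first that succeeds."""
    errors = []
    for name, factory in candidates:
        try:
            return name, factory()
        except Exception as exc:  # a provider that fails to initialize
            errors.append(f"{name}: {exc}")
    raise RuntimeError("no execution provider available: " + "; ".join(errors))


def failing_gpu():
    """Simulates a GPU provider that cannot be created on this machine."""
    raise OSError("device not found")
```

For example, build_with_fallback([("DirectML", failing_gpu), ("CPU", make_cpu_session)]) would return the CPU entry, where make_cpu_session is any callable that builds a session.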
It also probes the input shape:
- Fixed shape (e.g. 1×3×640×640): SliderSettings.ImageSize is snapped to the model’s declared size
- Dynamic shape (e.g. 1×3×?×?): the user picks via the Image Size Override slider
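The shape rule above amounts to a small decision. The following Python sketch assumes an NCHW shape where None marks a dynamic dimension and the input is square; resolve_image_size is a hypothetical helper, not a PowerAim function.

```python
from typing import Optional, Sequence


def resolve_image_size(input_shape: Sequence[Optional[int]], override: int) -> int:
    """Pick the square input size for an NCHW model input shape.

    A fixed shape wins over the user's setting; a dynamic shape
    (None in the height/width slots) uses the override value.
    """
    height, width = input_shape[2], input_shape[3]
    if height is not None and width is not None:
        return height  # assumption: square input, so height == width
    return override
```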
Pixel-to-tensor conversion uses a byte→float LUT for lower GC pressure than naive bitmap.Lock + cast.
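The lookup-table idea is to precompute the float for every possible byte value once, so the per-pixel work is an index rather than a conversion. This Python sketch only illustrates the lookup; the BGR input layout, the RGB planar output, and normalization to [0, 1] are assumptions, and the GC benefit applies to the C# implementation, not to Python.

```python
# Precompute byte -> float once; 256 entries cover every possible byte.
LUT = [value / 255.0 for value in range(256)]


def bgr_bytes_to_chw(pixels: bytes, width: int, height: int) -> list:
    """Convert interleaved BGR bytes into planar R, G, B floats (CHW order)."""
    plane = width * height
    out = [0.0] * (3 * plane)
    for i in range(plane):
        b, g, r = pixels[3 * i], pixels[3 * i + 1], pixels[3 * i + 2]
        out[i] = LUT[r]              # red plane
        out[plane + i] = LUT[g]      # green plane
        out[2 * plane + i] = LUT[b]  # blue plane
    return out
```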
Prediction filter
PredictionFilter runs after the model and applies:
- Minimum confidence (configurable via AIMinimumConfidence)
- Class filtering (AISettings.TargetClassFilterMode + TargetClassIds)
- Detection-mask exclusion (AISettings.IgnoreRegions)
Anything that survives moves on to the sticky-aim selector.
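The three checks can be sketched as a single pass over the detections. This is a Python illustration under stated assumptions: the class filter is treated as an allow-list (the actual TargetClassFilterMode values are not described here), ignore regions are tested against the box center, and Detection and filter_detections are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

Rect = Tuple[float, float, float, float]  # left, top, right, bottom


@dataclass
class Detection:
    class_id: int
    confidence: float
    cx: float  # box center x, in pixels
    cy: float  # box center y, in pixels


def inside(rect: Rect, x: float, y: float) -> bool:
    left, top, right, bottom = rect
    return left <= x <= right and top <= y <= bottom


def filter_detections(detections: Iterable[Detection], min_confidence: float,
                      allowed_classes: Set[int],
                      ignore_regions: List[Rect]) -> List[Detection]:
    kept = []
    for det in detections:
        if det.confidence < min_confidence:
            continue  # below the confidence floor
        if det.class_id not in allowed_classes:
            continue  # class not targeted (allow-list assumption)
        if any(inside(r, det.cx, det.cy) for r in ignore_regions):
            continue  # center falls in an ignored region
        kept.append(det)
    return kept
```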
Sticky-aim selector
StickyAimSelector keeps a target lock between frames using a composite score:
- Distance score: pixel distance to the screen center, weighted within StickyAimThreshold
- Confidence score: the model’s own probability
- Size score: favors larger detections (closer = bigger)
- Lock bonus: accumulates over time on the currently-held target, capped at StickyAimMaxLockScore
The selector switches targets only when a non-locked candidate beats the locked target by a clear margin. This eliminates the “ping-pong” effect of two overlapping detections.
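A composite score with a lock bonus and a switching margin can be sketched as follows. This is not the StickyAimSelector implementation: the equal weighting of the three scores, the lock_gain and margin values, and the track_id used to recognize a target across frames are all assumptions made for the illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    track_id: int      # hypothetical stable id for a detection across frames
    cx: float
    cy: float
    confidence: float
    area: float


def base_score(c, center, threshold, frame_area) -> float:
    dist = math.hypot(c.cx - center[0], c.cy - center[1])
    distance_score = max(0.0, 1.0 - dist / threshold)  # 1 at center, 0 at threshold
    size_score = min(1.0, c.area / frame_area)         # larger boxes score higher
    return distance_score + c.confidence + size_score


class StickySelector:
    def __init__(self, threshold, max_lock_score, lock_gain=0.1, margin=0.25):
        self.threshold = threshold
        self.max_lock_score = max_lock_score
        self.lock_gain = lock_gain  # bonus added per frame while locked
        self.margin = margin        # how clearly a rival must win to take over
        self.locked_id = None
        self.lock_bonus = 0.0

    def select(self, candidates, center, frame_area):
        if not candidates:
            self.locked_id, self.lock_bonus = None, 0.0
            return None
        scores = {c.track_id: base_score(c, center, self.threshold, frame_area)
                  for c in candidates}
        best = max(candidates, key=lambda c: scores[c.track_id])
        locked = next((c for c in candidates if c.track_id == self.locked_id), None)
        if locked is not None:
            # The held target accumulates a capped bonus each frame.
            self.lock_bonus = min(self.max_lock_score, self.lock_bonus + self.lock_gain)
            held = scores[locked.track_id] + self.lock_bonus
            if scores[best.track_id] <= held + self.margin:
                return locked  # rival did not win by a clear margin
        self.locked_id, self.lock_bonus = best.track_id, 0.0
        return best
```

With two detections whose raw scores trade places by a small amount each frame, the bonus plus margin keeps the first lock in place, which is the ping-pong case described above.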
Prediction (lead-time)
PredictionManager picks one of three methods:
- Kalman Filter — custom 2D Kalman with velocity state. Lead time is fixed by default; the Adaptive Kalman Lead toggle adapts it to measured target velocity.
- Shall0e’s Prediction — velocity-based linear lead. PowerAim fixed the broken upstream implementation.
- wisethef0x’s EMA Prediction — EMA-weighted velocity lead.
The chosen method produces a predicted target position N ms in the future.
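The two velocity-based methods share one idea: estimate velocity from successive positions, then extrapolate by the lead time. The Python sketch below shows a generic EMA-weighted velocity lead, where alpha = 1.0 reduces to a plain linear lead. It is not the Kalman filter and not a port of either named implementation; EmaLeadPredictor and its parameters are hypothetical.

```python
class EmaLeadPredictor:
    """Extrapolates a position using an EMA-smoothed velocity estimate."""

    def __init__(self, alpha: float):
        self.alpha = alpha           # 1.0 = use only the latest velocity
        self.last = None             # last observation as (x, y, t_ms)
        self.velocity = (0.0, 0.0)   # smoothed velocity in pixels per ms

    def update(self, x: float, y: float, t_ms: float) -> None:
        if self.last is not None:
            lx, ly, lt = self.last
            dt = t_ms - lt
            if dt > 0:
                vx, vy = (x - lx) / dt, (y - ly) / dt
                a = self.alpha
                self.velocity = (a * vx + (1 - a) * self.velocity[0],
                                 a * vy + (1 - a) * self.velocity[1])
        self.last = (x, y, t_ms)

    def predict(self, lead_ms: float):
        if self.last is None:
            raise ValueError("no observation yet")
        x, y, _ = self.last
        return (x + self.velocity[0] * lead_ms, y + self.velocity[1] * lead_ms)
```

A lower alpha reacts more slowly to direction changes but is less sensitive to jitter in the detected box position.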
Aim / output
AimingAction.Execute(prediction):
- Apply X/Y offsets (pixel + percentage)
- Apply EMA smoothing
- Compute mouse delta via the chosen MovementPathType (Bezier / Lerp / Exponential / Adaptive / PerlinNoise)
- Scale by MouseSensitivity
- Send via the configured MouseMovementMethod (SendInput / ddxoft / Razer / LGHub / MouseEvent) or via GamepadManager.GamepadSender if UseControllerForAim is on
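The first four steps above can be sketched as one function that turns a predicted position into an integer delta. This Python illustration covers only pixel offsets and a Lerp path; percentage offsets, the other path types, and the final send are omitted, and aim_delta with its parameters is a hypothetical helper rather than AimingAction's code.

```python
def aim_delta(target, crosshair, offset_px, smoothed_prev,
              ema_alpha, lerp_t, sensitivity):
    """Return ((dx, dy), smoothed_target) for one frame."""
    # 1. Apply pixel offsets to the predicted target.
    tx = target[0] + offset_px[0]
    ty = target[1] + offset_px[1]
    # 2. EMA smoothing against the previous smoothed target, if any.
    if smoothed_prev is not None:
        tx = ema_alpha * tx + (1 - ema_alpha) * smoothed_prev[0]
        ty = ema_alpha * ty + (1 - ema_alpha) * smoothed_prev[1]
    # 3. Lerp path: cover a fraction of the remaining distance this frame.
    dx = (tx - crosshair[0]) * lerp_t
    dy = (ty - crosshair[1]) * lerp_t
    # 4. Scale by sensitivity and round to whole mouse counts.
    return (round(dx * sensitivity), round(dy * sensitivity)), (tx, ty)
```

The returned smoothed target is fed back as smoothed_prev on the next frame so the EMA carries state between calls.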
The trigger system runs in parallel — TriggerEngine evaluates each ActionTrigger and fires its actions when keys + intersection rules are satisfied.
Anti-Recoil
Three independent implementations, evaluated in precedence order:
- RecoilPatternPlaybackAction: replay a recorded delta sequence
- ImageBasedAntiRecoilAction: OpenCV phase-correlation + EMA baseline (BETA)
- Legacy fixed X/Y compensation (built into AntiRecoilAction)
Only the highest-precedence active mode produces output; the others self-disable.
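The precedence rule reduces to "first enabled mode wins". A minimal Python sketch, assuming the modes are supplied in precedence order; pick_active_mode is a hypothetical helper and the mode names are taken from the list above.

```python
from typing import List, Optional, Tuple


def pick_active_mode(modes: List[Tuple[str, bool]]) -> Optional[str]:
    """Return the first enabled mode name from a precedence-ordered list."""
    for name, enabled in modes:
        if enabled:
            return name  # every lower-precedence mode stays silent
    return None
```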
Controller mapping
MappingEngine is a singleton that:
- Hooks keyboard + mouse via Gma.System.MouseKeyHook
- Polls XInput at 1 ms via SharpDX.XInput
- Resolves the active profile (first enabled, MatchProcess-matching profile)
- Reads each InputMapping’s source state, evaluates activator + modifier, writes the target
- Mouse-to-stick and stick-to-mouse pumps are special: driven by sentinel mappings
For KB→Pad writes, the engine takes ownership of the channels it touches on the shared GamepadManager.GamepadSender — otherwise the sync loop (which mirrors the physical pad onto the virtual one every 1 ms) would immediately overwrite the mapping output.
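Channel ownership can be pictured as a set of channel names the sync loop must skip. The Python sketch below is an illustration only; VirtualPad, its method names, and the string channel keys are hypothetical and do not reflect the GamepadSender API.

```python
class VirtualPad:
    """Toy virtual gamepad with per-channel ownership."""

    def __init__(self):
        self.state = {}     # channel name -> current value
        self.owned = set()  # channels claimed by mapping output

    def write_mapping(self, channel: str, value) -> None:
        self.owned.add(channel)  # claim the channel before writing
        self.state[channel] = value

    def release(self, channel: str) -> None:
        self.owned.discard(channel)

    def sync_from_physical(self, physical: dict) -> None:
        """Mirror the physical pad, leaving owned channels untouched."""
        for channel, value in physical.items():
            if channel not in self.owned:
                self.state[channel] = value
```

Without the ownership check, the next sync pass would copy the physical stick's resting value over the mapped value within a millisecond.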
AutoPlay
AutoPlayLearningAction orchestrates:
- OllamaClient.SelectActionAsync: sends the captured frame + game context + action list to Ollama
- Receives an action name
- Sends the corresponding input(s) via the same dispatch as triggers
- Optionally records the (state, action) pair into AutoPlayLearningModel for bias learning
AutoPlayLearningModel is a tiny state→action frequency table persisted as JSON.
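A state→action frequency table with JSON persistence can be sketched in a few lines. This Python version is an assumption about the general shape, not the actual AutoPlayLearningModel: FrequencyModel, the string state keys, and the JSON layout are hypothetical.

```python
import json
from collections import defaultdict


class FrequencyModel:
    """Counts how often each action was taken in each state."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def record(self, state: str, action: str) -> None:
        self.counts[state][action] += 1

    def preferred(self, state: str):
        """Most frequent action for a state, or None if the state is unseen."""
        actions = self.counts.get(state)
        if not actions:
            return None
        return max(actions, key=actions.get)

    def to_json(self) -> str:
        return json.dumps(self.counts, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "FrequencyModel":
        model = cls()
        for state, actions in json.loads(text).items():
            for action, count in actions.items():
                model.counts[state][action] = count
        return model
```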
OCR
OcrService is a polling timer (configurable interval) that:
- Captures each enabled OcrRegion
- Pre-binarizes (threshold + optional invert)
- Runs Tesseract 5.2
- Post-processes per OcrRegionKind (Number = digits only, Health = number + slash, Text = free-form)
- Stores results in OcrService.Latest
Other subsystems can read Latest to drive triggers or AutoPlay decisions.
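The per-kind post-processing step can be sketched with regular expressions. This Python illustration assumes the kinds are passed as the strings Number, Health, and Text, and that cleanup means dropping disallowed characters; post_process is a hypothetical helper, not OcrService code.

```python
import re


def post_process(kind: str, raw: str) -> str:
    """Clean raw OCR text according to the region kind."""
    text = raw.strip()
    if kind == "Number":
        return re.sub(r"\D", "", text)      # keep digits only
    if kind == "Health":
        return re.sub(r"[^\d/]", "", text)  # keep digits and the slash
    return text                             # Text: free-form, only trimmed
```

Stripping non-digits also discards common misreads such as the letter O for zero, so a consumer may want to treat an unexpectedly short result as unreliable.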
Replay buffer
ReplayBuffer.Push(frame, predictions) JPEG-encodes the frame on insert into a ring buffer sized at BufferSeconds × FPS. ExportAsync() flushes the ring to a timestamped folder.
Threading: insertion is lock-protected but cheap; export runs on a thread-pool task so the AI loop never blocks.
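A ring buffer sized at seconds × FPS with lock-protected insertion can be sketched with a bounded deque. This Python version omits JPEG encoding and the export step; ReplayRing and snapshot are hypothetical names, and frames are assumed to arrive already encoded as bytes.

```python
import threading
from collections import deque


class ReplayRing:
    """Keeps the most recent buffer_seconds * fps frames."""

    def __init__(self, buffer_seconds: int, fps: int):
        # A deque with maxlen drops the oldest entry automatically when full.
        self._frames = deque(maxlen=buffer_seconds * fps)
        self._lock = threading.Lock()

    def push(self, encoded_frame: bytes, predictions: list) -> None:
        with self._lock:  # short critical section: a single append
            self._frames.append((encoded_frame, predictions))

    def snapshot(self) -> list:
        """Copy the ring under the lock so an export can run without holding it."""
        with self._lock:
            return list(self._frames)
```

Taking a snapshot first is one way to let a background export write files without blocking further pushes.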
Threading
- UI thread: WPF window, every dialog, the binding hook listener
- Capture thread: DXGI / GDI capture, owned by AIManager
- AI loop thread: inference + prediction + action dispatch
- Mapping engine thread: 1 ms tick polling XInput
- OCR timer: DispatcherTimer on the UI thread
Cross-thread state is exposed through INotifyPropertyChanged, with updates marshaled via the WPF dispatcher.
Dependency graph
AppConfig (singleton)
└─ owns SliderSettings, ToggleState, BindingSettings, ...
AIManager (composition root)
├─ ICapture (Dxgi / Gdi)
├─ IPredictionLogic (OnnxModelSessionFactory)
├─ Action list: AimingAction, AntiRecoilAction, ...
└─ ReplayBuffer.Instance
GamepadManager (singleton)
├─ IGamepadReader (XInput polling)
└─ IGamepadSender (ViGEm / vJoy / Internal / XInputEmu)
MappingEngine (singleton)
├─ Mouse/Keyboard hook
├─ XInput poll
└─ GamepadManager.GamepadSender (shared)
InputBindingManager (singleton)
└─ low-level keyboard + mouse hook
WindowFocusWatcher (singleton)
└─ foreground process polling for AutoPause / AutoSwitch
Source layout
PowerAim/
├── AILogic/ # capture, inference, prediction, replay, anti-recoil
│ ├── Actions/ # IAction implementations
│ ├── Contracts/ # ICapture, IPredictionLogic, IAction, IOllamaClient
│ └── ...
├── InputLogic/ # mouse + gamepad input
│ ├── Gamepad/ # ViGEm/vJoy/Internal/XInputEmu senders + readers
│ ├── HidHide/ # HidHide CLI integration
│ └── Mapping/ # MappingEngine + presets + converters
├── Config/ # AppConfig + every SettingsCard's persisted state
├── MouseMovementLibraries/ # GHubSupport, RazerSupport, SendInputSupport, ddxoftSupport
├── Visuality/ # dialogs and overlay windows
├── UILibrary/ # custom WPF controls (AToggle, ASlider, AKeyChanger, ...)
├── Localizations/ # Locale.json + per-language JSONs
└── MainWindow.xaml + .cs # composition + sidebar navigation
For licensing, see Source Available.