Detection Intermediate Malware Analysis / YARA / Detection Engineering

YARA Rule Writing

A practical guide to writing YARA rules � covering rule structure, string types, conditions, modifiers, and real examples for detecting malware and suspicious files.

20 min read 15+ example rules Blue Team

// What is YARA?

YARA is a pattern-matching tool used by malware researchers and SOC analysts to identify and classify malware samples based on textual or binary patterns. Rules describe characteristics of malware families � strings, byte sequences, file headers, and conditions that must be met for a file to match.

YARA is integrated into many security tools including VirusTotal, Cuckoo Sandbox, Any.Run, Velociraptor, and EDR platforms. Writing custom YARA rules is a core detection engineering skill.

YARA rules can match against files on disk, memory dumps, running processes, and network traffic. The same rule syntax works across all contexts.

// Rule Structure

A YARA rule has three main sections: meta (documentation), strings (patterns to find), and condition (logic to trigger the match).

rule ExampleMalware
{
    meta:
        author = "Ben Thomson"
        description = "Detects ExampleMalware based on mutex and C2 string"
        date = "2025-01-01"
        reference = "https://example.com/report"

    strings:
        $mutex = "Global\\MyMutex123" ascii
        $c2 = "evil.example.com" ascii nocase
        $mz = { 4D 5A } // MZ header � PE file

    condition:
        $mz at 0 and ($mutex or $c2)
}

// String Types

TypeSyntaxDescription
Text string"string"Matches exact ASCII text. Use nocase for case-insensitive.
Wide string"string" wideMatches UTF-16 LE encoded text (common in Windows PE strings).
ASCII + Wide"string" ascii wideMatches both ASCII and UTF-16 � covers most encoding variants.
Hex pattern{ 4D 5A ?? 00 }Matches raw bytes. ?? is a wildcard for any byte.
Hex jump{ 4D [2-4] 5A }Matches bytes with a variable gap of 2�4 bytes between them.
Regex/pattern/PCRE-compatible regex for flexible pattern matching.

// Modifiers

ModifierEffect
nocaseCase-insensitive matching (text strings only)
asciiMatch ASCII encoding (default for text strings)
wideMatch UTF-16 LE encoding
fullwordMatch only if string is a full word (not part of a larger word)
xorMatch the string XOR'd with every single-byte key (0x00�0xFF)
base64Match the string in its base64-encoded form
privateString is used in conditions but not reported in output

// Conditions

The condition section defines when a rule fires. It supports Boolean logic, counting, offsets, and built-in YARA functions.

ConditionExampleMeaning
Boolean$a and $bBoth strings must be present
OR logic$a or $b or $cAny of the strings is present
At offset$mz at 0String must appear at byte offset 0
Count#string > 3String appears more than 3 times
Any ofany of ($a, $b, $c)Any one of the named strings matches
All ofall of ($a*)All strings in the $a* set must match
N of2 of ($a, $b, $c)At least 2 of the listed strings match
File sizefilesize < 1MBOnly match files under 1 MB
PE sectionspe.sections[0].name == ".text"First PE section is .text (requires pe module)

// Example Rules

Detect Mimikatz by strings

rule Mimikatz_Strings
{
    meta:
        description = "Detects Mimikatz credential dumping tool"

    strings:
        $s1 = "sekurlsa::logonpasswords" ascii nocase
        $s2 = "lsadump::dcsync" ascii nocase
        $s3 = "mimikatz" ascii wide nocase
        $s4 = "gentilkiwi" ascii nocase

    condition:
        2 of ($s*)
}

Detect XOR-obfuscated shellcode

rule XOR_Shellcode_Pattern
{
    meta:
        description = "Detects shellcode download cradle strings XOR-obfuscated"

    strings:
        $dl = "DownloadString" xor ascii
        $iex = "Invoke-Expression" xor nocase

    condition:
        any of them
}

Detect suspicious PE with no legitimate sections

import "pe"

rule Suspicious_PE_Packed
{
    meta:
        description = "PE file with unusually few sections � possible packing"

    condition:
        uint16(0) == 0x5A4D
        and pe.number_of_sections < 3
        and filesize < 500KB
}

Detect PowerShell download cradle in scripts

rule PowerShell_Download_Cradle
{
    meta:
        description = "PowerShell download and execute pattern"

    strings:
        $dl1 = "DownloadString" nocase
        $dl2 = "DownloadFile" nocase
        $iex = "IEX" fullword nocase
        $invoke = "Invoke-Expression" nocase

    condition:
        any of ($dl*) and any of ($iex, $invoke)
}

// Running YARA

# Scan a single file
yara rule.yar suspicious.exe

# Scan a directory recursively
yara -r rule.yar /path/to/dir/

# Scan with multiple rule files
yara rules/ suspicious.exe

# Scan a running process by PID
yara -p 1234 rule.yar .

# Scan with verbose output (print matched strings)
yara -s rule.yar suspicious.exe

# Scan all running processes
yara rule.yar --scan-proc-mem

YARA rules can also be run directly in VirusTotal, Any.Run, and Intezer � submit a rule and it will match against their corpus of known samples. This is useful for validating rules before deployment.

// Best Practices

Use multiple weak strings rather than one strong one. A single unique string can be easily patched by an attacker. Two or three weaker indicators together are harder to evade.

Always add a filesize condition. Including filesize < 2MB or similar prevents YARA from scanning large files that couldn't possibly contain the target pattern, dramatically improving scan speed.

Test rules against goodware. Before deploying, scan a clean Windows install directory to ensure your rule doesn't fire on legitimate files. False positives destroy analyst trust in detections.

The xor modifier is expensive. It generates 255 variants of each string. Use it sparingly and combine with filesize limits to avoid performance issues during large scans.