Penetration Testing Beyond Compliance: Finding What Scanners Miss

The Compliance-Driven Testing Trap

Most organisations that conduct penetration testing do so because something requires it of them. A framework checkbox. An insurance questionnaire. A customer security assessment. A regulatory audit. The penetration test exists to satisfy the requirement, and once the report is filed and the critical findings are patched, the test has served its purpose until next year.

This is a fundamentally broken approach to security testing, and it is the dominant model in the market.

Compliance-driven penetration testing fails for several interconnected reasons. First, scope is constrained by what the compliance requirement specifies rather than by what attackers will actually target. A PCI-DSS assessment covers the cardholder data environment. A SOC 2 test covers the systems in scope for the audit. The 43 servers that are out of scope, the HR system, the VPN concentrator that management refused to include because it would cause disruption: these are not tested. Attackers do not respect scope boundaries.

Second, the incentive structure is misaligned. When a penetration test is a procurement requirement, the buyer wants to pass the test, not to understand how they would actually be compromised. Findings get triaged and prioritised through the lens of what needs to be fixed to pass re-assessment rather than what represents the most significant actual risk. The finding is fixed; the underlying problem (the process, the architecture, the assumption) remains.

Third, annual or biannual testing cadences are mismatched with the speed of change in modern environments. A test conducted in January is accurate for the January environment. By the time the report is presented in February, a developer has pushed a new application to production. By July, the environment the test described barely resembles what is running. By the following January, the previous test's findings are historical artefacts more than current intelligence.

None of this means penetration testing is without value. It means that compliance-oriented testing, conducted to a narrow scope on an annual cadence primarily to satisfy an external requirement, is a poor use of security budget. Effective penetration testing looks different, and understanding what automated scanners can and cannot do is the starting point.

What Automated Scanners Actually Catch

Vulnerability scanners (Nessus, Qualys, Rapid7 InsightVM, OpenVAS, and their peers) are genuinely useful tools. Understanding their capabilities clearly is important, because the tendency in security discussions is to either over-rely on them or dismiss them entirely. Neither is correct.

Automated scanners are effective at identifying:

  • Known CVEs with existing detection signatures. If a vulnerability has a CVE number and the scanner vendor has written a check for it, the scanner will find it reliably across thousands of hosts in hours. This is scanning's core strength: coverage at scale.
  • Missing patches. By comparing installed software versions against vulnerability databases, scanners identify systems running outdated software with known vulnerabilities. This is unglamorous but important work.
  • Common misconfigurations. Default credentials on network devices, weak TLS configurations, open admin interfaces, anonymous FTP access, SMB signing disabled, SNMP community strings: these are detectable by pattern matching and protocol interaction.
  • Exposed services. Port scanning and service fingerprinting tell you what is listening on the network. This is foundational information that every security programme needs.
  • Web application surface issues. Automated DAST (Dynamic Application Security Testing) tools find some common web vulnerabilities: SQL injection in obvious locations, cross-site scripting in URL parameters, directory traversal in simple cases, missing security headers.

This is genuinely valuable. Organisations that run continuous vulnerability scanning and maintain a low mean-time-to-patch are significantly more resilient than those that do not. The problem is not that scanners are useless: it is that they create a false sense of completeness. A clean scanner report does not mean you are secure. It means you are not obviously vulnerable to the things scanners check for.

What Scanners Miss and Why Human Testers Are Irreplaceable

The gaps in automated scanning are not minor edge cases. They represent the most consequential categories of vulnerability: the ones that lead to actual breaches, data exfiltration, and ransomware deployment.

[Figure: Scanner vs Human Tester Coverage. Relative detection coverage of automated scanners versus human penetration testers across six classes of vulnerability: known CVEs / missing patches, default credentials / weak auth, broken access control / IDOR, business-logic flaws, chained exploits / pivoting, and social engineering / phishing.]

Business Logic Flaws

A vulnerability scanner has no understanding of what your application is supposed to do. It cannot reason about whether a workflow is being abused, whether an access control decision makes business sense, or whether a sequence of legitimate operations produces an illegitimate outcome.

Business logic flaws are the class of vulnerabilities where the application is functioning exactly as implemented, but the implementation is wrong. Examples include: a discount code that can be applied multiple times, producing a negative order total; an account balance transfer that processes the debit and credit in separate transactions, enabling a race condition that allows double-spending; a forgot-password flow that reveals which email addresses are registered, enabling user enumeration; an administrative function that is protected by checking a URL parameter rather than the user's session privileges.

These vulnerabilities are found only by a tester who understands how the application is supposed to work, maps the actual implementation against that expectation, and probes the boundaries. No scanner can do this. A scanner does not know that the order total should never be negative.
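
The discount-code example can be made concrete with a minimal sketch. The function names, the `WELCOME10` code, and the amounts below are hypothetical and exist only for illustration; the point is that the flawed version behaves exactly as written, and only someone who knows the business rule (each code once, total never below zero) can see that it is wrong.

```python
from decimal import Decimal

# Hypothetical discount table; the code name and value are illustrative only.
DISCOUNTS = {"WELCOME10": Decimal("10.00")}


def naive_apply_discounts(subtotal, codes, discounts):
    """Flawed version: every submitted code is honoured, repeats included."""
    total = subtotal
    for code in codes:
        total -= discounts.get(code, Decimal("0"))
    return total


def apply_discounts(subtotal, codes, discounts):
    """Apply each known code at most once and never let the total go below zero."""
    total = subtotal
    seen = set()
    for code in codes:
        if code in seen or code not in discounts:
            continue  # ignore repeated and unknown codes
        seen.add(code)
        total -= discounts[code]
    return max(total, Decimal("0"))


# A tester replays the same code three times against a 25.00 order.
replayed = ["WELCOME10"] * 3
naive_total = naive_apply_discounts(Decimal("25.00"), replayed, DISCOUNTS)
fixed_total = apply_discounts(Decimal("25.00"), replayed, DISCOUNTS)
print(naive_total, fixed_total)
```

The naive function returns a negative total without raising any error, which is why signature-based tooling has nothing to flag.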

Chained Exploits

Real attacks rarely rely on a single critical vulnerability. They chain together multiple lower-severity findings (a misconfiguration here, a weak credential there, an overly permissive service account) into an exploit path that leads from external access to full domain compromise. A scanner sees each finding in isolation and rates each one by its own CVSS score. It does not see that a CVE on the VPN gateway, combined with a weak local admin password discovered via LDAP enumeration, combined with an overprivileged service account, produces a path from unauthenticated external attacker to Domain Administrator in three steps.

This is precisely what skilled penetration testers do. They think about the environment as a graph of trust relationships and access paths. They ask: given what I have found so far, what can I reach from here? What new options does that open? They are not cataloguing vulnerabilities: they are mapping attack paths. The distinction is fundamental.
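
The graph view described above can be sketched in a few lines. This is a toy model under stated assumptions: the node names and the edge list are hypothetical, and a breadth-first search stands in for the much richer reasoning a tester applies. It shows how three individually modest findings become one path, while an unrelated finding leads nowhere.

```python
from collections import deque

# Hypothetical findings modelled as edges: (position held, position gained, finding).
EDGES = [
    ("external", "vpn_gateway", "CVE on the VPN gateway"),
    ("vpn_gateway", "local_admin", "weak local admin password"),
    ("local_admin", "domain_admin", "overprivileged service account"),
    ("external", "web_server", "weak TLS configuration"),  # real finding, dead end
]


def shortest_attack_path(edges, start, goal):
    """Breadth-first search returning the findings along the shortest path, or None."""
    graph = {}
    for src, dst, finding in edges:
        graph.setdefault(src, []).append((dst, finding))
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for nxt, finding in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [finding]))
    return None


path = shortest_attack_path(EDGES, "external", "domain_admin")
print(path)
```

A scanner report would list the four findings as four rows; the path is the part that only appears once the edges are connected.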

Social Engineering and Phishing Simulation

Technical vulnerabilities are one attack surface. Human beings are another. And in practice, human beings are frequently the most effective initial access vector, not because employees are careless, but because sophisticated phishing campaigns are designed to defeat even trained, security-aware users.

Simulated phishing assessments (genuine red team phishing with pretexting, lookalike domains, and credential harvesting infrastructure) reveal the actual human attack surface in your organisation. They also reveal downstream failures: what happens after a credential is harvested? Does your MFA stop the attacker? Does your SIEM generate an alert? Does your incident response process trigger? The phishing exercise is not just about measuring click rates; it is about testing the entire defensive chain from initial compromise onward.

No automated scanner can send a convincing email from a spoofed vendor address, build rapport over a phone call to extract information, or tailgate through a building's physical access control. These are irreducibly human capabilities.

Physical Security Gaps

Physical access to network infrastructure is a decisive advantage for an attacker. A network tap placed in a server room, a rogue access point installed behind a printer, or a USB device left in a public area can all provide persistent access to an otherwise well-defended network. Physical penetration testing (attempting to access restricted areas through social engineering, credential cloning, or tailgating) reveals gaps in physical security controls that have direct implications for logical security.

Norwegian organisations with compliance requirements around physical security (DSB, NSM, financial regulators) benefit from physical security assessments that are integrated with the technical penetration test, rather than treated as a separate box-ticking exercise.

Custom Application Vulnerabilities

Off-the-shelf vulnerability scanners are calibrated for known vulnerabilities in known software. Custom applications (the internal HR portal, the bespoke manufacturing control system, the API built in-house for a specific business function) have vulnerabilities that are unique to their implementation. No CVE will ever be published for them.

Common custom application vulnerability classes include: authentication bypass via parameter manipulation or JWT algorithm confusion; insecure direct object references (IDOR), where changing a numeric ID in a URL gives access to another user's data; broken access control, where the application enforces authorisation inconsistently across different endpoints; mass assignment vulnerabilities, where an API endpoint accepts fields that should not be user-controllable; server-side request forgery (SSRF), enabling attackers to make the server fetch internal resources; and XML external entity injection in document processing workflows.

These require a tester to read the application, understand its logic, map its data flows, and probe its boundaries with the same creativity an attacker would apply. Automated tools will find some of these (SQL injection and XSS checkers are built into most DAST tools), but the more subtle variants, and the business logic layer above them, require human analysis.
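
An insecure direct object reference is easiest to see side by side with its fix. The sketch below is illustrative only: the `Invoice` record, the in-memory store, and both handler functions are hypothetical stand-ins for whatever the real application uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Invoice:
    invoice_id: int
    owner_id: int
    amount: str


# Hypothetical in-memory store standing in for the application's database.
INVOICES = {
    1001: Invoice(1001, owner_id=1, amount="120.00"),
    1002: Invoice(1002, owner_id=2, amount="980.00"),
}


def get_invoice_vulnerable(session_user_id: int, invoice_id: int) -> Optional[Invoice]:
    """IDOR: the lookup trusts the ID from the URL and ignores who is asking."""
    return INVOICES.get(invoice_id)


def get_invoice_checked(session_user_id: int, invoice_id: int) -> Optional[Invoice]:
    """Object-level check: return the record only if the session user owns it."""
    invoice = INVOICES.get(invoice_id)
    if invoice is None or invoice.owner_id != session_user_id:
        return None  # same response for "missing" and "not yours"
    return invoice


# User 1 changes the numeric ID from 1001 to 1002.
leaked = get_invoice_vulnerable(session_user_id=1, invoice_id=1002)
blocked = get_invoice_checked(session_user_id=1, invoice_id=1002)
print(leaked, blocked)
```

Both handlers return valid responses for valid input, so a generic scanner has no way to tell that the first response belongs to someone else.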

Cloud Misconfiguration Chains

Cloud environments create new classes of vulnerability that traditional scanners are not designed to detect. The attack surface in AWS, Azure, or GCP is not primarily about CVEs in running software: it is about the relationships between identity, permissions, and resources.

A misconfigured S3 bucket is a straightforward misconfiguration. But a skilled attacker looks for chains: an IAM role attached to a Lambda function that has overly broad permissions, which can be assumed by any authenticated user in the account, which itself can be accessed via a public-facing API endpoint with a default configuration. The individual components may pass automated compliance checks. The chain produces data exfiltration from a supposedly secure bucket.

Cloud security assessments require tooling (Prowler, ScoutSuite, Pacu for AWS; Stormspotter, MicroBurst for Azure) combined with human analysis of the permission graph. The tester needs to understand cloud IAM semantics: the difference between resource-based and identity-based policies, the implications of trust relationships between accounts, the ways that service-linked roles can be abused. This is specialised knowledge that generic vulnerability scanners do not possess.
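
As one small piece of that permission-graph analysis, the sketch below flags role trust policy statements that let a wildcard or an entire account assume a role with no condition attached. It is a deliberately simplified heuristic, not a substitute for the tools named above: the helper names are hypothetical, the account ID is a placeholder, and real policies need far more careful evaluation (conditions, explicit denies, permission boundaries).

```python
def _aws_principals(principal):
    """Normalise the Principal element into a list of AWS principal strings."""
    if principal == "*":
        return ["*"]
    if isinstance(principal, dict):
        value = principal.get("AWS", [])
        return [value] if isinstance(value, str) else list(value)
    return []


def broad_trust_statements(trust_policy):
    """Return Allow statements, without a Condition, whose AWS principal is
    a wildcard or an entire account (an ARN ending in ':root')."""
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    flagged = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow" or stmt.get("Condition"):
            continue
        if any(p == "*" or p.endswith(":root") for p in _aws_principals(stmt.get("Principal"))):
            flagged.append(stmt)
    return flagged


# Illustrative trust policy; 123456789012 is a placeholder account ID.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Principal": {"Service": "lambda.amazonaws.com"},
         "Action": "sts:AssumeRole"},
        {"Effect": "Allow", "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
         "Action": "sts:AssumeRole"},
    ],
}

flagged = broad_trust_statements(TRUST_POLICY)
print(len(flagged))
```

A flagged statement is only a starting point: the tester still has to work out what the role can reach and who can reach the role.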

Penetration Testing Methodologies: Choosing the Right Engagement Type

[Figure: MITRE ATT&CK tactic chain as an engagement kill chain: TA0043 Reconnaissance, TA0001 Initial Access, TA0002 Execution, TA0003 Persistence, TA0004 Privilege Escalation, TA0008 Lateral Movement, TA0040 Impact. Engagement workflow mapped to ATT&CK: 1. Scoping, 2. Recon, 3. Exploit, 4. Post-Exploit, 5. Report & Retest.]
An engagement is not a CVSS spreadsheet; it is an adversary walk through your environment. Every ZeroSubnet finding is mapped to a MITRE ATT&CK technique ID, then layered on top of the established methodology frameworks (PTES, OSSTMM, OWASP WSTG, MASVS, NIST, CREST) so your blue team can measure detection and response coverage directly from the report.

Before engaging a penetration testing provider, organisations need to understand what type of engagement they are buying. The three core models (black box, grey box, and white box) differ fundamentally in what they simulate and what intelligence they produce.

  • Black box testing. The tester receives minimal information: typically just a target IP range or domain name. They simulate an external attacker with no insider knowledge. Black box testing is valuable for assessing your external attack surface and your perimeter defences. It is less efficient than grey or white box testing at finding vulnerabilities deeper in the environment, because the tester spends significant time on reconnaissance and enumeration that an insider attacker would skip.
  • Grey box testing. The tester receives some information: network documentation, application credentials for a standard user account, internal network diagrams, or access to a development environment. This balances realism with efficiency. Grey box testing simulates a compromised user account, a malicious insider with normal privileges, or an attacker who has already achieved initial access and is now moving laterally. It is the most common model for comprehensive assessments.
  • White box testing. The tester receives full information: source code, architecture documentation, credentials, network diagrams, and developer access. White box testing maximises coverage and is most effective for finding vulnerabilities that would not be visible from the outside. It is appropriate for application security assessments, code reviews, and internal infrastructure assessments where the goal is to find as many vulnerabilities as possible, not to simulate a specific threat model.

The right model depends on what you are trying to learn. If you want to know how an external attacker would initially compromise your environment, start with black box external testing. If you want to know how a compromised employee credential would be weaponised, grey box internal testing is the right approach. If you want to find every vulnerability in a critical application before it goes to production, white box source code review is most effective.

Red Team vs Penetration Test vs Vulnerability Assessment: Critical Distinctions

These three terms are frequently conflated, and the conflation leads to buying the wrong service. The distinctions matter:

  • Vulnerability Assessment. A systematic identification and classification of vulnerabilities in an environment, typically using automated scanning supplemented by manual verification. The output is a list of vulnerabilities with severity ratings. A vulnerability assessment does not attempt to exploit the vulnerabilities found or demonstrate the impact of a successful compromise. It answers the question: what vulnerabilities exist?
  • Penetration Test. An active attempt to exploit identified vulnerabilities and demonstrate the impact of successful compromise. The tester uses the same techniques as an attacker but within a defined scope and rules of engagement. The output is a demonstration of exploitability, not just existence. It answers the question: which of these vulnerabilities can actually be exploited, and what is the impact?
  • Red Team Exercise. A full-scope adversary simulation targeting a specific objective: exfiltrating sensitive data, compromising executive systems, achieving domain controller access. Red team exercises are typically longer (weeks to months), involve a small team operating covertly, and test the entire defensive capability of the organisation including detection and response. The output is not a list of vulnerabilities: it is a narrative of how the simulated attack unfolded and where defences succeeded or failed. It answers the question: could a determined, skilled attacker achieve their objective against our organisation, and would we detect and respond to them?

Most organisations need penetration testing, not red team exercises. Red team exercises are appropriate once an organisation has mature security operations: a functioning SOC, an established incident response process, and confidence in its baseline controls. Testing detection and response before those capabilities exist is premature.

[Figure: Cyber Attack Kill Chain. Seven phases of an attack: 1 Recon, 2 Weaponise, 3 Deliver, 4 Exploit, 5 Install, 6 C2, 7 Actions on objectives. Automated scanners cover the first three phases; human penetration testers are required to simulate the last four.]

The Attacker Mindset: Assume Breach, Lateral Movement, Persistence

Effective penetration testing requires testers who think like attackers, not just testers who can run the right tools. The distinction is in the mental model.

The attacker mindset starts with the assumption that initial access is achievable. The interesting question is not whether initial access is possible: for a determined attacker against any organisation of meaningful size, it almost always is. The interesting questions are: what happens next? What can the attacker reach after initial access? How do they move from a compromised workstation to the domain controller? From the domain controller to the cloud tenant? From the cloud tenant to the production database?

This means skilled penetration testers focus heavily on post-exploitation techniques:

  • Lateral movement. Using the access obtained from one compromised system to gain access to adjacent systems: via credential reuse, Kerberoasting, Pass-the-Hash, SMB relay, RDP, SSH key reuse, or protocol abuse.
  • Privilege escalation. Moving from a standard user account to local administrator, from local administrator to domain administrator, from domain administrator to enterprise administrator or Azure Global Admin.
  • Persistence. Establishing mechanisms to maintain access across reboots, credential rotations, and defensive responses: scheduled tasks, registry run keys, golden tickets, backdoored service accounts.
  • Data identification and exfiltration. Locating high-value data (credentials, customer PII, financial records, IP) and demonstrating that it can be exfiltrated through existing defences.

A penetration test that finds a list of vulnerabilities but does not demonstrate the path from those vulnerabilities to business impact is less than half the work. The value of the test is in understanding what an attacker could actually do, not just that they could get a foothold.

Certifications: What They Mean and What They Do Not

Certifications in offensive security serve as a signal of technical competence, but they are not all equivalent. Understanding the certification landscape helps organisations evaluate penetration testing providers:

  • OSCP (Offensive Security Certified Professional). Widely regarded as the baseline credential for competent penetration testers. The exam requires candidates to compromise multiple machines in a controlled environment within 24 hours, entirely hands-on with no multiple-choice questions. OSCP demonstrates practical exploitation capability. It is the minimum bar worth considering for technical testers.
  • OSCE3 (OffSec Certified Expert 3, a triple certification). Comprising OSEP (advanced evasion and lateral movement), OSED (exploit development), and OSWE (web application exploitation). OSCE3 represents advanced offensive capability beyond what most engagements require, but signals a tester who can handle novel, complex environments.
  • CREST (Council of Registered Ethical Security Testers). A UK-based certification body whose certifications (CRT, CCT App, CCT Inf) are commonly specified in UK and European procurement requirements. CREST assessments are structured around a tiered competency model and are relevant for organisations whose compliance frameworks specify CREST-certified providers.
  • CEH (Certified Ethical Hacker). A widely recognised name that is considered by practitioners to be a weak signal of practical capability. CEH is primarily a knowledge-based certification with limited hands-on components. Its presence on a CV is not a meaningful indicator of penetration testing competence.

Certifications are a starting point, not a conclusion. The more important evaluation criteria are the specific experience of the testers who will conduct your engagement, the quality of reports from previous engagements (which reputable providers can share in redacted form), and the provider's understanding of your industry and regulatory context.

What Good Reporting Looks Like: Beyond CVSS Scores

The final deliverable of a penetration test is the report. This is where most testing providers fail, and where the gap between a compliance-oriented test and a genuinely useful assessment is most visible.

CVSS scores are not sufficient context for risk prioritisation. A CVSS 9.8 remote code execution vulnerability on a server that is not reachable from the internet, contains no sensitive data, and has no trust relationships with critical systems is less urgent than a CVSS 6.5 authentication bypass on the system that holds customer payment records. Context matters. Business impact matters. CVSS scores are designed to describe the technical characteristics of a vulnerability in the abstract: they are not designed to tell you what to fix first.
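
The comparison above can be expressed as a small scoring sketch. The multipliers are arbitrary assumptions chosen for illustration, not a published model, and the `Finding` fields are hypothetical; the payment system is also assumed to be internet-reachable, which the example above does not state. What matters is the shape of the calculation: environmental context is applied on top of the base score before anything is ranked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    name: str
    cvss: float
    internet_reachable: bool
    holds_sensitive_data: bool
    trusted_by_critical_systems: bool


def contextual_priority(f: Finding) -> float:
    """Weight the base CVSS score by exposure, data sensitivity, and trust.
    The weights are illustrative assumptions only."""
    score = f.cvss
    score *= 1.0 if f.internet_reachable else 0.5
    score *= 1.5 if f.holds_sensitive_data else 1.0
    score *= 1.5 if f.trusted_by_critical_systems else 1.0
    return score


findings = [
    Finding("RCE on isolated server", 9.8, False, False, False),
    Finding("Auth bypass on payment records system", 6.5, True, True, False),
]

ranked = sorted(findings, key=contextual_priority, reverse=True)
for f in ranked:
    print(f.name, round(contextual_priority(f), 2))
```

With these assumed weights the 6.5 finding ranks above the 9.8 finding, which matches the ordering a tester would argue for in the report.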

Good penetration test reports include:

  • An executive summary. Written for a non-technical audience (the CISO, the board, the CEO) that describes what was found, what the business impact of the most significant findings would be, and what the overall security posture conclusion is. Jargon-free, business-oriented, honest.
  • Attack path narratives. For complex findings, a step-by-step description of how the vulnerability was exploited and what the tester was able to achieve, covering not just the finding itself but the exploitation chain. This is how findings become comprehensible to development and operations teams.
  • Business impact context. What specific data, systems, or operational capabilities would be affected by successful exploitation? What is the realistic impact on the organisation, not the abstract CVSS severity?
  • Remediation guidance that is specific, not generic. Not merely "apply patches", but the specific patch, the specific system, and the specific configuration change required. Not just "implement input validation", but a description of the specific parameter, the expected validation logic, and code-level guidance for the development team.
  • Evidence. Screenshots, captured network traffic, command output. Not to demonstrate that the tester was impressive, but to enable internal teams to reproduce the finding during remediation verification and to provide audit-quality evidence of testing.

Testing Frequency Recommendations

Annual penetration testing is the minimum, not the standard. The appropriate testing cadence depends on the rate of change in your environment and your risk profile:

  • Post-deployment. Any significant new application or infrastructure component should be tested before or immediately after production deployment. Building penetration testing into the development lifecycle, as part of the security sign-off for major releases, is more effective than relying entirely on annual assessments.
  • Post-significant change. Major architectural changes, cloud migrations, new integrations with third-party systems, significant access control changes: these warrant targeted security testing. The previous assessment no longer accurately describes the current environment.
  • Annual minimum for the full environment. A comprehensive assessment covering external perimeter, internal network, and critical applications at least annually. More frequently for high-risk environments.
  • Continuous for external attack surface. External attack surface monitoring tools (not full penetration tests, but continuous assessment of the external-facing environment) provide the coverage that fills the gap between point-in-time assessments.
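
The continuous external monitoring described in the last item reduces, at its simplest, to comparing snapshots of what is exposed. The sketch below assumes each snapshot is a set of (hostname, port) pairs produced by whatever discovery tooling is in use; the hostnames and the function name are hypothetical.

```python
def surface_changes(previous, current):
    """Compare two snapshots of exposed (host, port) pairs."""
    return {
        "opened": sorted(current - previous),
        "closed": sorted(previous - current),
    }


# Hypothetical snapshots: the January test baseline and the estate six months later.
JANUARY = {
    ("vpn.example.com", 443),
    ("www.example.com", 443),
    ("mail.example.com", 25),
}
JULY = {
    ("vpn.example.com", 443),
    ("www.example.com", 443),
    ("www.example.com", 22),
    ("staging.example.com", 8080),
}

changes = surface_changes(JANUARY, JULY)
print(changes)
```

Anything in the "opened" list is, by definition, surface that the last point-in-time assessment never saw.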

Norwegian Regulatory Requirements: What Is Actually Required

Norwegian organisations across several sectors face regulatory requirements that either mandate or strongly imply penetration testing:

  • Financial sector (Finanstilsynet and DORA). DORA (Digital Operational Resilience Act), applicable from January 2025, requires financial entities to conduct Threat-Led Penetration Testing (TLPT) on critical systems. TLPT is a more rigorous form of red team testing based on the TIBER-EU framework. Smaller financial entities may be exempt from TLPT but still face expectations around ICT security testing under Finanstilsynet supervision.
  • Healthcare. The Norwegian Norm for information security in health, care, and social services (Normen) sets expectations for security testing of systems handling health data. Health enterprises connected to NHN are expected to maintain documented security testing programmes. The severity of patient data breach consequences, both regulatory and operational, makes comprehensive testing a risk management imperative independent of regulatory requirements.
  • Government and critical infrastructure. NSM (Nasjonal sikkerhetsmyndighet) guidance for critical infrastructure operators (power, transport, telecom, water) includes expectations around security testing. Organisations subject to the Norwegian Security Act (sikkerhetsloven) have additional obligations around the security of classified systems and nationally important infrastructure.
  • NIS2 (Network and Information Security Directive 2). Being transposed into Norwegian law, NIS2 expands the scope of organisations subject to cybersecurity requirements and increases expectations around incident management, supply chain security, and security testing. Essential and important entities under NIS2, covering a broad range of sectors including energy, transport, health, digital infrastructure, and manufacturing, will face documented testing requirements.

Expanding the Scope: What We Actually Test

Web applications and external networks are the starting point, not the destination. Modern environments demand testing across far more attack surface than the traditional scope of a few years ago. Each additional surface changes the tradecraft and the toolset.

Mobile Applications

iOS and Android testing follows the OWASP Mobile Application Security Verification Standard (MASVS). The work runs on physical devices rather than emulators so findings reflect real runtime behaviour: insecure local storage, broken transport security, weak cryptographic primitives, session handling flaws, and missing anti-tampering. Dynamic instrumentation with Frida and Objection lets us hook live processes, intercept API calls, bypass client-side controls, and extract the same secrets an attacker with a rooted or jailbroken device would reach for.

APIs and Microservices

API testing is its own discipline now. REST, GraphQL, gRPC, and webhook endpoints each carry a different class of authorisation failure. Broken Object Level Authorisation (BOLA) and Broken Function Level Authorisation (BFLA) remain the dominant real-world API bugs. GraphQL adds introspection leakage and cost-of-query abuse. Webhooks introduce signature-bypass and replay paths. A competent API test enumerates every object type, validates authorisation on every operation, and maps the JWT or OAuth flow end to end including token scope abuse, refresh-token misuse, and direct-access paths that bypass the front-end entirely.
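
Validating authorisation on every operation amounts to filling in a matrix of roles against operations and comparing it with what the API actually does. The sketch below works on recorded results rather than live requests; the roles, routes, and status codes are hypothetical, and treating any 2xx response as "allowed" is a simplifying assumption.

```python
# Expected policy: (role, operation) -> should the call be allowed?
EXPECTED = {
    ("user", "GET /orders/{id}"): True,
    ("user", "DELETE /orders/{id}"): False,
    ("user", "GET /admin/users"): False,
    ("admin", "GET /admin/users"): True,
}

# Hypothetical status codes recorded while replaying each call with that role's token.
OBSERVED_STATUS = {
    ("user", "GET /orders/{id}"): 200,
    ("user", "DELETE /orders/{id}"): 403,
    ("user", "GET /admin/users"): 200,  # function-level authorisation failure
    ("admin", "GET /admin/users"): 200,
}


def authorisation_mismatches(expected, observed):
    """Return the (role, operation) pairs whose observed outcome contradicts policy."""
    mismatches = []
    for key, should_allow in expected.items():
        status = observed.get(key)
        if status is None:
            continue  # not tested yet
        was_allowed = 200 <= status < 300
        if was_allowed != should_allow:
            mismatches.append(key)
    return mismatches


mismatches = authorisation_mismatches(EXPECTED, OBSERVED_STATUS)
print(mismatches)
```

The expected matrix is the part no tool can generate: it has to come from someone who understands what each role is supposed to be able to do.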

Active Directory and Identity

Active Directory, Entra ID, and federated identity systems reward specific tradecraft. Kerberoasting and AS-REP roasting remain effective on poorly managed tiers. The AD Certificate Services (ADCS) ecosystem alone has produced fifteen exploitable misconfigurations, ESC1 through ESC15, each with a distinct escalation path. Delegation abuse (unconstrained, constrained, and resource-based) turns compromise of a low-privilege account into tier-0 access in a single step on unprepared estates. BloodHound and SharpHound let us compute the shortest attack paths from any user, group, or service principal to Domain Admin so the report shows the exact chain rather than a list of disconnected findings.

Cloud and Kubernetes

Azure, AWS, and GCP configurations are tested against the relevant CIS benchmarks, but that is the floor. The real exploit paths come from over-permissioned managed identities, misconfigured OIDC trust policies between cloud and CI, and lateral movement from a compromised workload into the management plane through unprotected metadata services or leaked instance profiles. Kubernetes clusters add escaped-pod scenarios, service account tokens with cluster-admin bindings, exposed etcd, and supply-chain attacks via unsigned container images or compromised Helm charts.

RF and Wireless

Wireless is not just WiFi. Enterprise EAP methods are frequently deployed with certificate validation disabled, letting a rogue access point harvest credentials from every device that roams past it. Bluetooth and BLE pairing flaws, Zigbee and LoRa mesh analysis, and software-defined radio reconnaissance against door controllers, building management systems, and industrial endpoints all fall under this scope. Evil-twin scenarios against real corporate SSIDs are performed under strict rules of engagement but produce unambiguous evidence of what a motivated attacker can collect from the car park.

Physical, RFID and Badge Cloning

Every network has a door. Physical intrusion simulation covers tailgating, covert entry, lock bypass, and badge reader attacks. RFID and NFC cloning against HID iClass, MIFARE Classic, MIFARE DESFire, and similar credentials uses Proxmark3 and hands-on tradecraft rather than theoretical attacks. A successful physical engagement typically ends with the operator seated at an unattended workstation inside the client secure area, with a camera log demonstrating exactly how they got there.

Full-Scope Red Team

Full-scope red team engagements combine all of the above into a single goal-driven operation. The objective is a specific outcome agreed with the client: acquisition of the crown-jewel dataset, simulated ransomware deployment against non-production targets, unauthorised access to executive communications, or similar. The engagement treats the entire attack surface — digital, human, physical, and RF — as in-scope and measures not just whether the outcome was achievable but how long the blue team took to detect and respond at each stage. The final report maps every action end-to-end to MITRE ATT&CK tactics and techniques so defenders can measure detection coverage directly against the kill chain that was actually used.
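
Measuring detection at each stage can be sketched as a simple pass over the engagement timeline. Everything below is hypothetical: the timestamps, the record layout, and the choice to key results by ATT&CK tactic ID. It is a sketch of the idea, not a description of any particular reporting engine.

```python
from datetime import datetime, timedelta

T0 = datetime(2025, 1, 13, 9, 0)  # hypothetical engagement start

# (ATT&CK tactic ID, tactic name, executed_at, detected_at or None if never detected)
ACTIONS = [
    ("TA0001", "Initial Access", T0, None),
    ("TA0002", "Execution", T0 + timedelta(minutes=20), T0 + timedelta(hours=3)),
    ("TA0008", "Lateral Movement", T0 + timedelta(hours=5),
     T0 + timedelta(hours=5, minutes=45)),
    ("TA0040", "Impact", T0 + timedelta(hours=8), None),
]


def detection_summary(actions):
    """Return (time-to-detect per tactic ID, list of tactic IDs never detected)."""
    delays = {}
    undetected = []
    for tactic_id, _name, executed_at, detected_at in actions:
        if detected_at is None:
            undetected.append(tactic_id)
        else:
            delays[tactic_id] = detected_at - executed_at
    return delays, undetected


delays, undetected = detection_summary(ACTIONS)
print(delays, undetected)
```

The undetected list is usually the more valuable output, because it names the stages where the blue team had no visibility at all.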

The Platform Multiplier: How We Run 100x More Tests Per Engagement

Traditional penetration testing scales badly. A good senior operator produces excellent findings, but does so while also keeping track of which subdomains have already been enumerated, which payloads worked on which endpoint, which findings are already documented, and which piece of evidence belongs to which ticket. Most of the hours in a two-week engagement go to bookkeeping and context-switching, not to offensive work. That is the problem we built the ZeroSubnet platform to solve.

[Figure: ZeroSubnet multi-operator pentest platform: local execution, shared intelligence. Operator A (Sandvika, Norway) runs recon, web app, AD / identity, and cloud / K8s agents on a local stack (nmap, ffuf, nuclei, Burp, hashcat, Proxmark, SDR). Operator B (remote, on-site, or physical) runs RF / wireless, physical / RFID, mobile app, and API / microservices agents on a local stack (Frida, objection, BloodHound, lockpicks, cameras, badges). Each operator stores data in SQLite, encrypted at rest, and client traffic never leaves the operator machine. Both sync over TLS-encrypted NATS JetStream, with live updates across operators and offline-safe resume, to the central platform (ZeroSubnet, Norway-hosted): PostgreSQL findings store, real-time dashboard, semantic search and RAG, methodology tracking, automated report engine, scope enforcement, and internal AI assistant.]
Every operator runs their own swarm of specialist agents on their own machine, using the real tools of the trade, against the target. Findings sync to a central platform in near real time so the whole team sees the same picture at the same moment. No client traffic transits the cloud; all coordination and report generation happens on Norwegian-hosted infrastructure.

Operator Swarms, Local-First Execution

Each ZeroSubnet operator runs a private swarm of specialist agents on their own machine during the engagement. One agent focuses on external recon, another on web application testing, another on Active Directory, another on cloud workloads, and so on. Every agent has a curated set of real-world tools available as first-class capabilities: nmap, masscan, ffuf, nuclei, Burp Suite, hashcat, Frida, BloodHound, Proxmark3, SDR tooling, and hundreds more. The operator directs the work, reviews every action, and makes every decision that matters. The platform runs the busywork.

Critically, execution is local. Tools run on the operator machine against the target. Findings and evidence are written to an encrypted local database. No client traffic transits any cloud service, and the platform never sits in the attack path. A dropped internet connection does not stop the engagement. The operator keeps working and any accumulated state syncs back to the central platform automatically when connectivity returns.
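The offline-safe behaviour described above is commonly built as a local outbox: findings are written to the local database first and marked as synced only once the central side has accepted them. The sketch below shows that pattern under stated assumptions. It uses Python's standard `sqlite3` module, which does not encrypt data at rest, and a plain `publish` callable as a stand-in for the real sync transport; the table and function names are hypothetical.

```python
import json
import sqlite3
from typing import Callable


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the local store. Plain sqlite3 is NOT encrypted; this is a stand-in."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS outbox ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " synced INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def record_finding(conn: sqlite3.Connection, finding: dict) -> None:
    """Always write locally first, whether or not the network is available."""
    with conn:
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(finding),))


def flush(conn: sqlite3.Connection, publish: Callable[[dict], None]) -> int:
    """Send unsynced findings in order; stop at the first connection failure."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE synced = 0 ORDER BY id"
    ).fetchall()
    sent = 0
    for row_id, payload in rows:
        try:
            publish(json.loads(payload))
        except ConnectionError:
            break  # still offline: remaining rows stay queued for the next attempt
        with conn:
            conn.execute("UPDATE outbox SET synced = 1 WHERE id = ?", (row_id,))
        sent += 1
    return sent
```

Because a row is marked as synced only after `publish` returns, a crash between the two steps can cause a finding to be sent twice. A design like this therefore assumes the receiving side deduplicates, for example on a finding identifier.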

Central Coordination Without Central Execution

The central platform is where findings, scope, evidence, methodology progress, and report output live. It is hosted in Norway on ZeroSubnet-managed infrastructure and scoped per engagement. Operators sync to it over TLS-encrypted messaging so the live picture of the engagement — every confirmed finding, every piece of evidence, every checkpoint in the methodology — is visible to every authorised team member in real time. Report generation runs centrally so the deliverable the client receives is assembled from the same data the team worked with, not re-keyed from notebooks.
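Because scope lives centrally, a scope check can run before any tool is pointed at a target. The sketch below shows one simple way to do that for IP targets with Python's standard `ipaddress` module. The function name and the example ranges are hypothetical, and real scope definitions would also need to cover hostnames, URLs, and time windows.

```python
import ipaddress
from typing import Sequence


def in_scope(target: str, allowed: Sequence[str], excluded: Sequence[str] = ()) -> bool:
    """True if target falls inside an allowed CIDR and outside every excluded one."""
    addr = ipaddress.ip_address(target)

    def matches(cidrs: Sequence[str]) -> bool:
        return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)

    # Exclusions win over allowances, so a carved-out subnet is never touched.
    return matches(allowed) and not matches(excluded)


# Hypothetical engagement scope: one /16 with a production subnet carved out.
ALLOWED = ["10.0.0.0/16"]
EXCLUDED = ["10.0.5.0/24"]

ok_target = in_scope("10.0.4.7", ALLOWED, EXCLUDED)
carved_out = in_scope("10.0.5.7", ALLOWED, EXCLUDED)
```

Letting exclusions take precedence is a deliberately conservative choice: an address that appears in both lists is treated as out of scope.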

Multi-Operator Collaboration on the Same Engagement

A single operator is a limit. Larger engagements (full-scope red team, multi-site assessments, enterprise-wide AD reviews) benefit directly from multiple operators working in parallel. The platform was built for that case from day one. Operator A in Sandvika working web applications and Operator B on-site working physical and RF both see each other's findings appear in the shared dashboard within seconds. A credential recovered by the physical operator is immediately available to the AD operator as a pivot point. A shell popped on a web application becomes a lateral-movement launch point for the internal-network operator. The coordination overhead that kills most multi-operator engagements is handled by the platform rather than by a daily stand-up.

Internal AI Assistant, Not a Replacement for Expertise

The platform includes an internal AI assistant, hosted on ZeroSubnet infrastructure, that supports the operators in four specific ways: semantic search across every prior engagement so a novel target service is immediately cross-referenced against everything the team has seen before, methodology suggestion so a given finding triggers the related follow-on checks without the operator having to remember them, report drafting so raw evidence and observations are turned into a first-pass writeup the operator edits rather than writes from scratch, and query assistance for complex tools like BloodHound where the right Cypher query saves hours of manual graph exploration.
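As an example of the kind of query involved, the sketch below builds a parameterised Cypher query for the shortest attack path from any user to a named group. It assumes the classic BloodHound Neo4j schema, with `User` and `Group` node labels and a `name` property such as `DOMAIN ADMINS@CORP.LOCAL`; the helper name and the example group are illustrative and are not output from the assistant.

```python
def shortest_path_to_group_query(group_name: str) -> tuple[str, dict]:
    """Return a Cypher query and its parameters for the shortest user-to-group path.

    Assumes BloodHound's classic Neo4j schema (User/Group labels, `name` property).
    """
    query = (
        "MATCH p = shortestPath((u:User)-[*1..]->(g:Group {name: $group})) "
        "RETURN p"
    )
    # The group name is passed as a parameter rather than concatenated into the query.
    return query, {"group": group_name}


query, params = shortest_path_to_group_query("DOMAIN ADMINS@CORP.LOCAL")
```

The returned pair can be handed to whichever Neo4j client is in use. Passing the group name as a parameter avoids quoting problems with names that contain spaces or special characters.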

The assistant does not make exploitation decisions and does not touch client systems. It is scoped to the operator desktop as a productivity multiplier for the things that operators historically did slowly and manually. Every action on a client system is initiated and reviewed by a human operator.

What the Multiplier Actually Looks Like

The effect of this platform on engagement output is concrete and measurable against traditional testing workflows. A two-week engagement that previously produced roughly 30-50 tested areas of interest now covers hundreds because the recon, enumeration, and regression workloads that dominated operator time are handled by specialist agents in the background. Findings are cross-referenced automatically, so a credential leaked in one scope finds its way to every other scope where it might work, rather than sitting in a notebook until someone remembers it on day twelve. Evidence is captured structurally at the moment of exploitation, rather than reconstructed from scrollback when the report is due.
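The cross-referencing step can be pictured as a simple matching problem: given a recovered credential, list the login services in other scopes where it has not yet been tried, and queue those for operator review. The sketch below is a minimal illustration with hypothetical record types and sample data. It does not attempt any logins, and it assumes every listed service is already inside the agreed rules of engagement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    username: str
    found_in: str  # scope label where the credential was recovered


@dataclass(frozen=True)
class Service:
    scope: str
    host: str
    protocol: str


def reuse_candidates(
    cred: Credential,
    services: list[Service],
    already_tested: set[tuple[str, str]],
) -> list[Service]:
    """Services in other scopes where this credential has not been tried yet."""
    return [
        svc
        for svc in services
        if svc.scope != cred.found_in
        and (cred.username, svc.host) not in already_tested
    ]


# Invented sample data for illustration.
cred = Credential(username="svc-backup", found_in="web")
services = [
    Service("web", "app01.example.test", "https"),
    Service("internal", "dc01.example.test", "ldap"),
    Service("cloud", "vpn.example.test", "https"),
]
tested = {("svc-backup", "vpn.example.test")}

queue = reuse_candidates(cred, services, tested)
```

The output is a review queue, not an action list: in keeping with the rest of the workflow, a human operator decides whether each candidate is actually tested.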

The output is not an automated scan report. Every finding is still reviewed, validated, and written up by a senior human operator who made the call to exploit it. The platform multiplies their depth and breadth; it does not replace their judgement.

What This Means for the Client

Two practical outcomes. First, the report is wider. A fixed-budget engagement covers more scope, with more confirmed findings, because the operators spent their time on exploitation and analysis rather than on administrative overhead. Second, the report is deeper. Findings are not just listed with a CVSS score; they are situated in the engagement narrative, mapped to MITRE ATT&CK, and linked to the evidence that proves them. Remediation retest happens inside the same platform so the client can see, finding by finding, which ones are closed and which ones still reproduce.

Closing: ZeroSubnet Penetration Testing Services

ZeroSubnet provides penetration testing services delivered by Norwegian security professionals with OSCP and advanced offensive security certifications. Our engagements cover external network testing, internal network and Active Directory assessments, web application and API testing, cloud security assessments (Azure and AWS), and social engineering simulations.

We do not offer compliance checkbox tests. Every engagement produces a report that accurately describes what we found, what the business impact would be, and what you need to do, in that order. We write reports for both the board and the development team, because a finding that does not reach the people who can fix it is a finding that will still be there at next year's assessment.

For Norwegian organisations navigating regulatory requirements (DORA, Normen, NIS2, NSM guidance), we understand both the technical and the regulatory context. We can help you design a testing programme that meets your compliance obligations while producing genuine security intelligence rather than paperwork.

Contact ZeroSubnet to discuss your environment, your compliance obligations, and what a realistic assessment of your current security posture would look like. We will tell you what we find, including the things you might prefer not to hear, because that is the only kind of testing that is actually useful.

©2026 ZeroSubnet AS  ·  Org. nr. 923 669 442
Leif Tronstads plass 6, 1337 Sandvika