You push a small fix at 18:47 on a Friday. The tests pass. The build goes green. A few minutes later, someone spots a live API key in the repo history, or worse, in a frontend bundle that already shipped to users.
That’s the moment CI/CD security testing stops feeling like theory.
Fast-moving teams hit this constantly. A rushed dependency upgrade. A GitHub Actions secret with wider access than anyone intended. A Supabase policy that looked fine in review but leaks data once real queries hit it. Most pipeline failures aren’t dramatic in the commit itself. They look ordinary right up until they become expensive.
Why CI/CD Security is Non-Negotiable in 2026

UK teams can’t treat pipeline security as a nice-to-have any more. Over the past two years, 57% of UK organisations reported security incidents tied to exposed secrets in insecure DevOps processes, and for mid-sized firms the average cost of a single data breach reached £3.4 million. Early automated testing in the pipeline has been shown to reduce those costs by as much as 40%, according to StepSecurity’s CI/CD security analysis.
The important part isn’t only the number. It’s where the damage starts. In most startups, the pipeline has broad access by design. It can read repositories, pull dependencies, inject secrets, build containers, publish artefacts, and deploy to staging or production. If an attacker gets into that path, or if your own team accidentally leaks something inside it, they don’t need to break through the front door. You’ve already handed them the service corridor.
What goes wrong in real pipelines
The common failure modes are boring. That’s why they’re dangerous.
- Exposed secrets often come from copied environment files, debug scripts, mobile config files, or rushed CI variables.
- Weak dependency hygiene slips in when a team upgrades fast but doesn’t scan what changed.
- Misconfigured permissions show up when a build job can do far more than it needs.
- Missing policy tests hurt modern stacks, especially when database access rules live outside the main application code.
Practical rule: if a pipeline can deploy code, it deserves the same security scrutiny as production.
That’s also why “shift left” matters in practice. It doesn’t mean throwing noisy tools at developers earlier. It means catching the class of mistakes that are cheap to fix before they become release blockers, incident-response work, or legal risk.
Security has to fit how teams actually ship
A lot of advice on this topic still assumes a fairly traditional stack. Many UK startups aren’t running that. They’re shipping with Supabase, Firebase, React Native, Flutter, Next.js, edge functions, and AI-generated code. Those stacks move quickly, but they also create new blind spots around secrets, access rules, generated configs, and client-exposed endpoints.
If you’re trying to tighten your pipeline while keeping delivery moving, it helps to ground the work in proven secure software development best practices rather than one-off scans. The teams that do this well don’t bolt security on at the end. They wire it into commits, pull requests, builds, and release gates so the defaults are safer.
Security done late feels like bureaucracy. Security done early feels like quality control.
A Practical Guide to CI/CD Security Test Types
Organisations don’t need more acronyms. They need to know which test catches which problem, when to run it, and whether it’s fast enough to sit in the critical path.
The baseline set usually includes SAST, SCA, secret scanning, IaC scanning, and DAST. On modern cloud-native stacks, you often add policy and configuration checks that generic scanners won’t cover well.
CI/CD Security Test Types Compared
| Test Type | What It Checks | Best Pipeline Stage | Typical Speed |
|---|---|---|---|
| SAST | Source code patterns linked to security flaws | Pull request, pre-merge, build | Fast to moderate |
| SCA | Third-party packages and known dependency vulnerabilities | Install or build stage | Fast |
| Secret scanning | API keys, tokens, credentials, certificates in code and artefacts | Pre-commit, push, pull request | Fast |
| IaC scanning | Risky cloud and infrastructure configuration in Terraform, YAML, and similar files | Pre-merge, infra plan stage | Fast to moderate |
| DAST | Behaviour of a running app from the outside | Staging after deploy | Moderate to slower |
| Container image scanning | Vulnerabilities and unsafe packages in built images | After image build, before deploy | Moderate |
| Policy and access-rule testing | Database rules, auth logic, permissions, RPC exposure | Pre-prod and staging validation | Varies by tool |
SAST catches code issues before runtime
Static Application Security Testing looks at source code without running the app. It’s good at finding risky coding patterns early, especially in pull requests.
That makes SAST valuable for backend services, edge functions, server-side TypeScript, and mobile code. It’s less useful when your biggest risk sits in declarative config or hosted platform policies rather than in the application logic itself.
Good SAST setups work because they stay focused. They scan changed code, report into the pull request, and fail only on findings the team agrees matter.
SCA keeps your dependencies from becoming your problem
Software Composition Analysis checks your open-source libraries and transitive dependencies for known issues. If your stack pulls in npm packages, Gradle dependencies, CocoaPods, Docker base images, or SDKs, SCA should run automatically.
This matters even more in startups because teams often move quickly with community packages. The failure pattern is familiar. A library gets added to solve one urgent problem. Six weeks later nobody remembers it’s there, but it still ships to production.
SCA tends to be one of the easiest wins in CI/CD security testing because it’s usually quick to run and simple to automate. The hard part is triage. If you block every advisory instantly, developers will stop trusting the gate.
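One low-effort way to automate that, assuming a GitHub-hosted repo with npm dependencies (both are assumptions here), is a Dependabot config that surfaces upgrades as reviewable pull requests rather than surprise advisories:

```yaml
# .github/dependabot.yml (sketch): the npm ecosystem and weekly cadence
# are example choices; adjust to your repo
version: 2
updates:
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 5
```

Capping open pull requests keeps the triage load predictable, which is usually what decides whether the team keeps up with it.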
Secret scanning has the highest value-to-effort ratio
If you add only one thing this week, add secret scanning.
It catches the mistakes that happen under time pressure. Hardcoded API keys, database passwords, service account JSON, mobile config blobs, and copied credentials from local testing. Secret scanning also belongs early. Pre-commit is great. Pull request is mandatory. Build-time only is too late if the code already landed in history.
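For the pre-commit layer, a minimal sketch using the official gitleaks hook looks like this (pin `rev` to a release you have verified yourself):

```yaml
# .pre-commit-config.yaml (sketch): pin rev to a release you have verified
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.4
    hooks:
      - id: gitleaks
```

Run `pre-commit install` once per clone and the scan fires before a commit is created, which keeps secrets out of history entirely rather than flagging them after the fact.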
A leaked secret is rarely just a leaked secret. It’s usually a shortcut into a larger system.
IaC scanning matters if your pipeline provisions anything
If your team uses Terraform, GitHub Actions workflow files, cloud config, or deployment manifests, Infrastructure as Code scanning helps spot dangerous defaults before they go live.
Teams catch things like:
- Over-broad roles that give build jobs too much power
- Public resources created without intent
- Missing encryption settings in managed services
- Unsafe network exposure introduced by copied templates
IaC scanners are especially useful in PR workflows because reviewers often miss infrastructure risk in otherwise normal-looking config changes.
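As a sketch of where such a check could sit in CI, a Checkov step on pull requests is often enough to start. The `infra/` path here is an assumption; point it at wherever your Terraform or manifests live:

```yaml
# Pull-request IaC scan job (sketch); adjust the directory to your repo layout
iac-scan:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Scan infrastructure code with Checkov
      run: |
        pip install checkov
        checkov -d infra/ --quiet
```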
DAST shows what the running app exposes
Dynamic Application Security Testing works against a live app, usually in staging. It can find issues that static tools won’t, such as exposed routes, weak auth behaviour, or insecure responses.
The downside is speed and signal quality. DAST is rarely something you want on every tiny commit in a small startup unless it’s tightly scoped. It fits better after deployment to a test environment, especially for release branches or nightly checks.
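A nightly pattern like the following keeps DAST out of the commit path entirely. The staging URL is a placeholder, and the action tag should be checked against a release you have verified:

```yaml
# Scheduled DAST sketch using the OWASP ZAP baseline action
name: nightly-dast
on:
  schedule:
    - cron: '0 2 * * *'   # 02:00 UTC each night
jobs:
  zap-baseline:
    runs-on: ubuntu-latest
    steps:
      - name: ZAP baseline scan against staging
        uses: zaproxy/action-baseline@v0.12.0   # pin a tag you have verified
        with:
          target: 'https://staging.example.com'   # hypothetical staging URL
```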
If you’re trying to understand the trade-offs between early static checks and later runtime checks, this breakdown of SAST vs DAST is worth reading because it mirrors what happens in pipelines rather than treating them as interchangeable.
What modern stacks need beyond the basics
Supabase and Firebase apps often need extra checks that generic enterprise tooling won’t handle well. The weak points usually sit in access rules, RPC exposure, auth wiring, mobile builds, and frontend artefacts that include too much.
A practical starter stack for a startup usually looks like this:
- Secret scanning on every push so obvious leaks never settle into the repo.
- SAST on pull requests for changed application code.
- SCA during build to catch risky dependencies.
- IaC scanning if the repo provisions cloud resources or pipeline infrastructure.
- Targeted DAST in staging for externally reachable apps.
- Policy-specific tests for database rules and backend access logic.
That mix won’t catch everything. Nothing will. But it catches the issues that repeatedly hurt fast teams.
Embedding Security Checks in Your Pipeline Code
A security programme doesn’t exist until it runs automatically. Slide decks don’t stop leaked tokens. Pipeline code does.
In the UK, 72% of organisations had integrated security testing into their CI/CD pipelines by 2024, and in compliant setups a failed critical vulnerability check blocks 95% of pre-production deployments, according to figures cited alongside the OWASP CI/CD security cheat sheet. That aligns with what works operationally: the important checks run by default, inside the same workflow that builds and ships the app.

Start with two checks that earn their place
If you’re securing a pipeline for the first time, begin with:
- Secret scanning, because it catches high-impact mistakes early.
- SAST, because it gives developers feedback before code reaches production.
That pair gives strong coverage without turning the pipeline into treacle. You can add dependency, container, and policy checks after the team trusts the workflow.
GitHub Actions example
This is a straightforward GitHub Actions setup that runs on pull requests and pushes to the main branch. It uses a secret scanner and Semgrep for SAST.
```yaml
name: ci-security

on:
  pull_request:
  push:
    branches:
      - main

jobs:
  security-checks:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          # Full history so the secret scan can check past commits, not just HEAD
          fetch-depth: 0

      - name: Run secret scan
        uses: gitleaks/gitleaks-action@v2
        env:
          # Required by the action to report findings against the repository
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.x'

      - name: Install Semgrep
        run: pip install semgrep

      - name: Run SAST scan
        run: semgrep --config=p/ci --error
```
This isn’t fancy, and that’s good. Security checks in CI should be obvious to maintain.
A few points matter:
- `permissions: contents: read` keeps the workflow tighter than the default broad permission set.
- Secret scanning before SAST means you fail fast on a leak rather than spending time on later jobs.
- `--error` in Semgrep makes the job fail when matching findings appear, which turns the scan from passive reporting into an enforceable gate.
Tuning the GitHub workflow
A second pass is frequently needed after the first week. Raw defaults usually create friction.
Use these tuning levers carefully:
- Scope the scan by excluding generated directories, build outputs, and vendor folders that create noise.
- Split blocking and non-blocking jobs if you want quick failure for secrets but informational reporting for lower-confidence SAST checks.
- Run changed-files scans where possible for pull requests, then use full scans on the main branch or nightly workflows.
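Those levers can be sketched as a single diff-aware Semgrep step for pull requests. The excluded paths are examples, and diff-aware scanning assumes the checkout step fetched full git history:

```yaml
# Diff-aware SAST step sketch for pull requests
- name: Run SAST scan on changed code only
  run: >
    semgrep --config=p/ci --error
    --exclude 'dist/'
    --exclude 'node_modules/'
    --baseline-commit "${{ github.event.pull_request.base.sha }}"
```

With a baseline commit set, Semgrep only reports findings introduced by the pull request, which is usually what keeps the gate trusted.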
Operational advice: if developers can’t tell why a job failed from the pull request screen, they’ll work around it instead of fixing it.
GitLab CI example
GitLab CI can do the same job with very little ceremony. Here’s a compact example using Gitleaks and Semgrep.
```yaml
stages:
  - security

secret_scan:
  stage: security
  image:
    name: zricethezav/gitleaks:latest
    entrypoint: [""]   # clear the image entrypoint so GitLab can run the script
  script:
    - gitleaks detect --source . --exit-code 1
  only:
    - merge_requests
    - main

sast_scan:
  stage: security
  image: returntocorp/semgrep:latest
  script:
    - semgrep --config=p/ci --error
  only:
    - merge_requests
    - main
```
This is enough to create a real gate. A merge request fails if a secret is detected or if Semgrep finds an issue matched by the selected ruleset.
GitLab users often benefit from keeping security jobs in a dedicated security stage rather than mixing them into generic test jobs. It keeps logs cleaner and ownership clearer. If your team is standardising GitLab workflows, this guide to continuous integration in GitLab is a useful reference point for structuring jobs cleanly.
What to customise first
The first customisation should never be “add more scanners”. It should be “make the first scanner useful”.
Three changes usually matter most:
1. Define what blocks a merge. Secrets should nearly always block immediately. SAST findings often need filtering by severity, confidence, or rule set. If you block on everything, you’ll train the team to distrust the gate.
2. Exclude noise. Ignore minified frontend bundles, dependency caches, generated SDK files, and test fixtures that create repeat false positives. Security signal gets buried quickly in modern JavaScript and mobile repos.
3. Separate branch behaviour. Pull requests need quick, explainable feedback. Main branch or scheduled workflows can run broader scans. That split keeps developer feedback tight without losing depth.
Where teams usually go wrong
The usual mistakes are operational, not technical.
- Running scans too late means bad changes already landed.
- Adding multiple tools at once makes triage chaotic.
- Failing without guidance leaves developers staring at logs with no clue what to do next.
- Scanning code but not configs misses a lot of cloud and pipeline risk.
- Treating all repositories the same ignores the difference between a marketing site, a mobile app, and a Supabase backend.
For mobile and backend-heavy products, I also like to keep security jobs separate from unit-test jobs in the UI. It’s a small change, but it helps teams understand whether they’re looking at a coding defect, a flaky test, or an actual security issue.
The best CI/CD security testing setups feel boring in daily use. They run quickly, fail clearly, and only stop a release when the stop is defensible.
Smart Gating Strategies Without Killing Developer Velocity
The loudest objection to CI/CD security testing is still the oldest one: it will slow the team down.
That can happen. It happens all the time when teams dump a pile of scanners into CI, block on every warning, and call the noise “shift left”. But the better UK data points in the opposite direction. A 2025 study of 500 UK DevOps teams found that shift-left testing reduces production vulnerabilities by 62% and can boost deployment frequency by 2.1x for teams using modern stacks, according to Palo Alto Networks’ summary of the study.

The reason is simple. Early checks are cheaper than late incidents. A fast failure in a pull request is annoying. A data leak, rollback, hotfix, customer response, and compliance review is slower.
Block selectively, not emotionally
Security gates should be opinionated, not dramatic.
A practical gating model looks like this:
- Block on confirmed critical issues such as exposed secrets, clearly exploitable auth bypasses, or dangerous public access rules.
- Warn on medium-risk findings that need human review but don’t justify stopping every build.
- Log low-confidence or low-impact findings for backlog triage, pattern review, or scheduled clean-up.
That structure keeps your main branch protected without forcing every developer to fight the scanner on every commit.
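In GitHub Actions, the block/warn split can be as simple as marking advisory jobs with `continue-on-error`. This is a sketch; the tools and rule packs shown are example choices:

```yaml
jobs:
  secret-scan:              # blocking: any finding fails the workflow
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: gitleaks/gitleaks-action@v2
  sast-advisory:            # warning: findings are visible but never block
    runs-on: ubuntu-latest
    continue-on-error: true
    steps:
      - uses: actions/checkout@v4
      - run: |
          pip install semgrep
          semgrep --config=p/ci --error
```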
Use different gates at different stages
One of the easiest ways to preserve velocity is to stop treating CI as a single gate. It isn’t. It’s a chain of decisions.
A healthier pattern is:
| Stage | Gate style | Good use |
|---|---|---|
| Pre-commit or push | Fast and strict | Secrets, obvious bad patterns |
| Pull request | Focused blocking | SAST, dependency checks, config changes |
| Staging | Broader validation | DAST, policy behaviour, external exposure |
| Release approval | Human plus automation | Exceptions, risk acceptance, audit trail |
This gives developers fast feedback where speed matters most and deeper testing where runtime context exists.
Teams lose velocity when the pipeline produces confusion, not when it enforces standards.
What actually keeps developers onside
Developers accept strict gates when the rules are stable and the output is actionable.
That means:
- Short logs with clear remediation instead of huge scanner dumps
- Consistent thresholds so one repo doesn’t block on findings another ignores
- Exception paths for urgent releases, with review and expiry
- Ownership so someone is accountable for tuning rules and reducing false positives
What doesn’t work is performative security. Nightly scans nobody reads. Failing jobs with vague labels. Security dashboards that live outside the development workflow. If a finding doesn’t show up where code review already happens, it usually loses.
The right question isn’t whether security slows delivery. It’s whether your controls are sharp enough to stop the dangerous changes and quiet enough to let normal work continue. That’s what smart gating solves.
Advanced Security Testing for Modern Stacks
Generic scanners miss a lot in stacks built around managed backends, client-heavy apps, and hosted auth. That’s especially true with Supabase, Firebase, and mobile apps that embed service configuration into artefacts and bundles.
The tricky part is that these platforms often look secure in code review. The app compiles. The tests pass. The API responds. Meanwhile, a weak Row Level Security rule, an over-exposed RPC, or a permissive Firebase rule can still leak data.
Why UK compliance changes the bar
A 2025 UK NCSC report highlighted that 68% of UK software firms fail initial pipeline audits due to gaps in areas like RLS checks for exposed data, leaving them exposed to UK GDPR fines of up to £17.5 million for data breaches, as cited in this Harness overview of the UK compliance gap.
For UK startups, that means pipeline security has to produce more than “scan passed”. It needs evidence. You need logs, reproducibility, and checks that map to how user data is accessed.
RLS fuzzing finds what static review misses
Row Level Security fuzzing is one of the clearest examples.
A static review of a policy can tell you what the SQL appears to allow. Fuzzing tests what real queries can do under different identities and access patterns. That matters because access bugs often hide in edge conditions. Nested relationships. Unexpected joins. Auth states nobody considered. Write paths that stay open even when reads look locked down.
For Supabase especially, that kind of testing belongs near release gates, not as a once-a-quarter manual exercise.
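One way to wire that in, assuming pgTAP policy tests live under `supabase/tests` and the Supabase CLI can spin up a local stack in CI (both assumptions), is a dedicated job near the release gate:

```yaml
# RLS policy test job (sketch); assumes pgTAP tests exist in supabase/tests
rls-tests:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: supabase/setup-cli@v1
    - name: Run database policy tests against a local stack
      run: |
        supabase start
        supabase test db
```

Because the tests run as real queries under different roles, they exercise the policy behaviour rather than just its SQL text.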
Firebase and client-heavy apps need artefact checks
Firebase projects introduce a different problem. Security issues often sit in rules, exposed config, callable functions, and client assumptions.
Mobile apps add another layer. The compiled IPA or APK may contain secrets, endpoints, analytics tokens, debug flags, or config values that never stood out in the repo. If your pipeline only scans source code, you can miss what ships.
A practical advanced test set for modern stacks includes:
- Access-rule validation for Supabase RLS or Firebase Security Rules
- RPC and function exposure checks for backend methods callable by the client
- Frontend bundle inspection for hardcoded keys and unsafe public config
- Mobile artefact scanning against built APK or IPA files before release
- Authenticated behaviour tests to verify what users can really read and write
Security for modern stacks isn’t just code scanning. It’s behaviour verification.
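For the mobile artefact step in that list, a sketch using the open-source apkleaks tool might look like the following. The artifact name and the `build` job it depends on are assumptions:

```yaml
# Scan the built APK, not just the source; names here are placeholders
apk-secret-scan:
  runs-on: ubuntu-latest
  needs: build
  steps:
    - uses: actions/download-artifact@v4
      with:
        name: app-release
    - name: Check the APK for embedded secrets and endpoints
      run: |
        pip install apkleaks
        apkleaks -f app-release.apk
```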
Don’t separate API security from pipeline security
A lot of teams still treat API security as a separate programme run later by another team. In practice, that split causes blind spots. Your CI pipeline is often the best place to catch exposed routes, over-trusting endpoints, and weak auth assumptions before release. This guide to API security is a useful companion read because it reinforces the same core point from a different angle: the primary risk usually sits in what’s reachable and insufficiently protected, not just what looks risky in source code.
For Supabase and Firebase teams, that means testing should reflect the platform’s actual attack surface. Database rules, edge functions, RPCs, auth flows, mobile bundles, and generated client code all deserve scrutiny. Otherwise you end up with a pipeline that looks mature on paper and still misses the issue that matters.
Integrating AuditYour.App for Effortless CI Checks
At some point, most small teams hit the same wall. They can wire together secret scanning, SAST, SCA, mobile artefact review, access-rule validation, and custom policy checks, but maintaining all of that takes time they don’t have.
That’s where a specialised scanner can earn its keep. For teams shipping on Supabase, Firebase, or mobile stacks, the value isn’t another generic dashboard. It’s coverage for the things broad tools tend to miss, with feedback that fits into CI rather than living outside it.

Where it fits in a real pipeline
For practical use, I’d place a specialised scan in one of two spots:
- On pull requests for high-risk repositories, where you want fast feedback on rules, exposed functions, or leaked config before merge.
- Before staging or production deploys, where you want a final behavioural check against the app and its backend surface.
The advantage is that you don’t have to force generic SAST or DAST tools to approximate platform-specific risks. A scanner built for this class of app can focus on the actual failure modes: exposed RLS, public RPCs, mobile secrets, and frontend leakage.
Example GitHub Actions pattern
A lightweight GitHub Actions step can be as simple as triggering a scan job after build or preview creation, then failing the workflow if the returned result doesn’t meet your threshold. The same pattern works well when a PR spins up a temporary environment and you want the scan to validate what’s reachable.
The implementation details depend on how you trigger scans and consume results, but the operating model is straightforward:
- Build the app or preview environment
- Call the scanner
- Wait for findings
- Fail or warn based on the returned grade or issue class
- Post results into the PR for developer review
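That operating model can be sketched as two workflow steps. The endpoint, token name, and response fields below are hypothetical placeholders, not the real AuditYour.App API:

```yaml
- name: Trigger scan of the preview environment
  run: |
    # api.example.com, SCAN_TOKEN, and PREVIEW_URL are all placeholders
    curl -sf -X POST "https://api.example.com/v1/scans" \
      -H "Authorization: Bearer ${SCAN_TOKEN}" \
      -d "{\"target\": \"${PREVIEW_URL}\"}" > scan.json

- name: Gate on the returned grade
  run: |
    grade=$(jq -r '.grade' scan.json)   # .grade is a hypothetical field
    echo "Scan grade: ${grade}"
    case "$grade" in
      A|B) echo "Pass" ;;
      *) echo "Blocking release"; exit 1 ;;
    esac
```

The shape matters more than the details: trigger, wait, parse, then fail or warn inside the same workflow the team already watches.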
Example GitLab pattern
GitLab teams can mirror the same flow with a dedicated security job after build and before deploy. That works especially well if you already separate merge-request validation from deployment approval.
The main thing is to keep the scan visible and actionable. If the result lands only in an external dashboard, developers won’t see it soon enough. If it lands in CI with clear remediation, it becomes part of the normal shipping process.
For teams that want a direct way to add this sort of check without building a custom security stack first, the AuditYour.App scanner is designed for that gap. It’s particularly useful when your risk sits in Supabase policies, Firebase exposure, mobile bundles, or generated frontend code rather than in traditional backend code alone.
The best part is operational, not theoretical. You can add meaningful checks to CI without spending weeks stitching together niche tooling and edge-case scripts.
If you’re shipping quickly on Supabase, Firebase, or mobile and want CI checks that reflect those risks, AuditYour.App is a practical place to start. It gives small teams a fast way to scan for exposed RLS rules, public RPCs, leaked secrets, and mobile artefact issues, without building a full custom AppSec pipeline first.
Scan your app for this vulnerability
AuditYour.App automatically detects security misconfigurations in Supabase and Firebase projects. Get actionable remediation in minutes.
Run Free Scan