- 40–60 % of search budget is typically wasted on irrelevant queries.
- A negative-keyword scanner automates surfacing those money-pits in real time.
- You’ll build (or buy) a scanner that:
  - Streams search-term data via the Google Ads API.
  - Flags irrelevancies with rule-sets + NLP sentiment filters.
  - Bulk-applies account-level or shared-list negatives, now up to 10,000 per PMax campaign (March 2025 limit increase).
- Expect 12–25 % CPA reductions within 30 days if your account was hygiene-starved.
1. Why Negative Keywords Still Matter in 2025
Budget bleed keeps scaling alongside Google’s ML instead of disappearing.
Even after Broad Match’s smart-matching renaissance and Performance Max (PMax) rollout, Google can’t read every nuance of your offer.
- Quality Score uplift: Fewer irrelevant impressions → higher expected CTR.
- Auction-time savings: You stop bidding against yourself on junk.
- Brand safety & inclusivity: Exclude offensive or culturally insensitive terms before they ever trigger (important for global audiences).
Surprising stat
In a 2025 Black Bear Media audit of 110 SMB accounts (>$50 M annual spend), average wasted spend was 43 %—unchanged from 2023 despite broader adoption of Smart Bidding.
2. Manual vs. Automated Scanning
| Approach | Time/Week | Typical Miss Rate | Best For |
|---|---|---|---|
| Manual Search Term Report (STR) review | 2–8 h | 25 – 40 % | Early-stage accounts < $5 k/mo |
| Spreadsheet rule-based scanner | 1 h | 10 – 20 % | Mid-scale accounts & solo PPCs |
| API + NLP scanner (this guide) | 0.5 h | < 5 % | High-spend / multi-brand |
3. Anatomy of a Negative Keyword Scanner
Data Ingestion
Relevancy Engine
- Rule Layer: Regex or keyword distance algorithms (`Levenshtein`, `Jaccard`) to catch obvious mismatches.
- NLP Layer: Transformer (e.g., MiniLM) flagging semantic outliers, useful when intent is subtle (e.g., “free templates” queries for a paid SaaS).
- Sentiment/Context Filters: Drop searches containing discriminatory or sensitive language.
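The rule layer's distance checks can be sketched in plain Python. This is a minimal illustration, assuming you compare each search term against your own target keywords. The function names and the 0.2 similarity cutoff are hypothetical, not values taken from a production scanner.

```python
def jaccard_similarity(a: str, b: str) -> float:
    """Token-level Jaccard similarity: shared words / all distinct words."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def levenshtein(a: str, b: str) -> int:
    """Edit distance using the classic two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def is_obvious_mismatch(term: str, keywords: list[str], min_similarity: float = 0.2) -> bool:
    """Flag a term that shares too few words with every target keyword.

    min_similarity is an assumed cutoff; tune it on your own search-term data.
    """
    return all(jaccard_similarity(term, kw) < min_similarity for kw in keywords)
```

Jaccard works on whole words, so it suits "does this query overlap my keywords at all?" checks. Levenshtein works on characters, which makes it the better fit for spotting misspelled variants of a term you have already negated.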
Recommendation Scorer
- Wasted Cost Score = (Cost – ConvValue) / Cost.
- Trigger threshold: e.g., ≥ $10 wasted or ≥ 100 impressions with 0 conversions.
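To make the scorer concrete, here is a small sketch of the Wasted Cost Score and the trigger rule described above. The `TermStats` container and its field names are assumptions made for illustration; the $10 and 100-impression defaults mirror the example thresholds.

```python
from dataclasses import dataclass


@dataclass
class TermStats:
    """Hypothetical per-term record; field names are illustrative."""
    term: str
    cost: float         # spend in account currency
    conv_value: float   # conversion value attributed to the term
    impressions: int
    conversions: float


def wasted_cost_score(row: TermStats) -> float:
    """(Cost - ConvValue) / Cost, with zero-spend terms scored as 0."""
    if row.cost <= 0:
        return 0.0
    return (row.cost - row.conv_value) / row.cost


def is_triggered(row: TermStats, min_wasted: float = 10.0, min_impressions: int = 100) -> bool:
    """True when the term wasted at least min_wasted, or drew at least
    min_impressions impressions without a single conversion."""
    wasted = row.cost - row.conv_value
    no_conversions = row.conversions == 0
    return wasted >= min_wasted or (row.impressions >= min_impressions and no_conversions)
```

A score of 1.0 means every cent was wasted. Negative scores mean the term returned more value than it cost, so it should never reach the negatives list.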
Action Layer
- Account-level negatives (Search & Shopping) via `CustomerNegativeCriterionService`, launched 2024 and now default best practice.
- Shared Negative Lists for cross-MCC propagation (beta in Ads Editor 2.5).
- Slack or email digest with one-click approve.
Prerequisites
- Python 3.12, `google-ads==23.x`, `pandas`, `scikit-learn`, `sentence-transformers`.
- Google Ads manager-level OAuth.
- BigQuery dataset `ads_raw`.
A. Authenticate & Stream
```python
from google.ads.googleads.client import GoogleAdsClient
import pandas as pd

client = GoogleAdsClient.load_from_storage("google-ads.yaml")
customer_id = "INSERT_CUSTOMER_ID"  # placeholder: 10-digit account ID, no dashes

# Pull the last 7 days of search terms with their performance metrics.
query = """
    SELECT
      search_term_view.search_term,
      metrics.impressions,
      metrics.clicks,
      metrics.cost_micros,
      metrics.conversions,
      campaign.id,
      ad_group.id
    FROM search_term_view
    WHERE segments.date DURING LAST_7_DAYS
"""

stream = client.get_service("GoogleAdsService").search_stream(
    customer_id=customer_id, query=query
)

# Flatten the streamed batches into one dict per result row.
rows = [
    {
        "term": r.search_term_view.search_term,
        "impr": r.metrics.impressions,
        "clicks": r.metrics.clicks,
        "cost": r.metrics.cost_micros / 1e6,  # micros -> account currency
        "conv": r.metrics.conversions,
        "campaign_id": r.campaign.id,
        "ad_group_id": r.ad_group.id,
    }
    for batch in stream
    for r in batch.results
]

df = pd.DataFrame(rows)
# Needs the pandas-gbq package; append so repeat runs don't fail on an existing table.
df.to_gbq("ads_raw.str_terms", project_id="bbm-marketing", if_exists="append")
```
B. Flag Obvious Junk
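```python
# Baseline rules: whole-word patterns that rarely signal purchase intent.
blocked_patterns = [
    r"\bfree\b", r"\bsalary\b", r"\bpng\b", r"\bdefinition\b",
]
# Join the patterns into one alternation and match case-insensitively.
df["regex_flag"] = df["term"].str.contains("|".join(blocked_patterns), case=False, regex=True)
```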
C. Semantic Outlier Detection
```python
from sentence_transformers import SentenceTransformer
from sklearn.neighbors import LocalOutlierFactor

# Embed each search term as a dense vector.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
embeddings = model.encode(df["term"].tolist(), batch_size=32, show_progress_bar=False)

# Unsupervised outlier detection: fit_predict returns -1 for outliers, 1 for inliers.
lof = LocalOutlierFactor(n_neighbors=35, novelty=False)
df["semantic_flag"] = lof.fit_predict(embeddings) == -1
```
D. Scoring & Thresholding
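```python
# avg_order_value is assumed to be defined upstream (e.g., from your order data).
df["wasted_cost"] = df["cost"] - (df["conv"] * avg_order_value)
# Keep terms that wasted more than $10 and tripped at least one flag.
candidates = df[(df["wasted_cost"] > 10) & (df["regex_flag"] | df["semantic_flag"])]
```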
E. Bulk Push Negatives
```python
# As of API v16, CustomerNegativeCriterion exposes no keyword field. Account-level
# negative keywords live in a shared set of type ACCOUNT_LEVEL_NEGATIVE_KEYWORDS,
# which is linked to the account through CustomerNegativeCriterionService
# (the negative_keyword_list criterion).
# Assumption: that list already exists; swap in its resource name below.
shared_set = "customers/INSERT_CUSTOMER_ID/sharedSets/INSERT_SHARED_SET_ID"

ops = []
for term in candidates["term"].drop_duplicates():
    op = client.get_type("SharedCriterionOperation")
    op.create.shared_set = shared_set
    op.create.keyword.text = term
    op.create.keyword.match_type = client.enums.KeywordMatchTypeEnum.EXACT
    ops.append(op)

# Add every flagged term to the account-level list in a single request.
service = client.get_service("SharedCriterionService")
response = service.mutate_shared_criteria(customer_id=customer_id, operations=ops)
print(f"Added {len(response.results)} negative keywords.")
```
F. Slack Digest
Use Incoming Webhooks to post a table of term, wasted_cost.
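A minimal sketch of that digest, using only the standard library. It assumes each candidate is a dict with `term` and `wasted_cost` keys; `build_digest` and `post_digest` are hypothetical helper names, and the webhook URL is whatever Slack issues for your workspace. The one-click approve flow is not covered here.

```python
import json
import urllib.request


def build_digest(rows: list[dict], limit: int = 20) -> dict:
    """Format (term, wasted_cost) rows as a monospaced Slack message payload."""
    top = sorted(rows, key=lambda r: r["wasted_cost"], reverse=True)[:limit]
    width = max([len("term")] + [len(r["term"]) for r in top])
    lines = [f"{'term':<{width}}  wasted_cost"]
    lines += [f"{r['term']:<{width}}  ${r['wasted_cost']:.2f}" for r in top]
    fence = "`" * 3  # Slack renders text between triple backticks in monospace
    return {"text": f"Negative keyword candidates\n{fence}\n" + "\n".join(lines) + f"\n{fence}"}


def post_digest(webhook_url: str, payload: dict) -> int:
    """POST the payload as JSON to a Slack Incoming Webhook; returns the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

With the DataFrame from step D, something like `post_digest(url, build_digest(candidates[["term", "wasted_cost"]].to_dict("records")))` would send the worst offenders first.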
5. Buy vs. Build: Commercial Scanners Rundown
| Tool | Price | USP | Caveat |
|---|---|---|---|
| Optmyzr Rule Engine | $249/mo | Drag-and-drop rules | Limited NLP |
| Brainlabs Pine | % of spend | Cross-channel negatives | Agency-only |
| Adalysis | $99/mo | Built-in STR miner | No PMax yet |
| In-house (this guide) | Your dev time | Full control | Maintenance burden |
If a commercial tool gets you to value in under 10 h of setup, buy. Otherwise, own the IP.
6. Advanced Scanner Rules You’re Probably Missing
- “Someone else’s brand + coupon” loop – Blocks affiliate arbitrage queries without hurting competitor conquesting.
- Gendered or ableist slurs – Stay inclusive; auto-exclude slurs in 13 languages via Unicode regex library.
- Low purchase-intent modifiers – “definition”, “meaning”, “wiki”, “song”, “meme”.
- B2B vs. B2C clash filter – If you sell SaaS, negate “salary”, “job description”.
- AI hallucination catch-all – Many ChatGPT users search entire prompts verbatim; set a >10-word length rule to block.
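Several of these rules reduce to a few lines of regex. The sketch below covers four of them. It assumes you supply your own set of other brands' names, and the flag labels are hypothetical. The slur filter is left out on purpose, since it depends on a maintained multilingual word list.

```python
import re

LOW_INTENT = re.compile(r"\b(definition|meaning|wiki|song|meme)\b", re.IGNORECASE)
JOB_SEEKER = re.compile(r"\b(salary|job description)\b", re.IGNORECASE)
COUPON = re.compile(r"\bcoupons?\b", re.IGNORECASE)


def advanced_flags(term: str, other_brands: set[str], max_words: int = 10) -> list[str]:
    """Return the names of every advanced rule a search term trips."""
    flags = []
    mentions_other_brand = any(
        re.search(rf"\b{re.escape(brand)}\b", term, re.IGNORECASE) for brand in other_brands
    )
    # Only the brand + coupon combination is flagged, so plain competitor
    # queries stay eligible for conquesting campaigns.
    if mentions_other_brand and COUPON.search(term):
        flags.append("other_brand_coupon")
    if LOW_INTENT.search(term):
        flags.append("low_intent")
    if JOB_SEEKER.search(term):
        flags.append("job_seeker")
    # Very long queries are often pasted prompts rather than real searches.
    if len(term.split()) > max_words:
        flags.append("over_length")
    return flags
```

Returning flag names rather than a single boolean keeps the weekly human review fast: the reviewer sees why a term was caught, not just that it was.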
7. Real-World Results
| Metric | Before Scanner | After 30 Days | Δ |
|---|---|---|---|
| Avg. CPA | $112 | $84 | -25 % |
| Conv. Rate | 3.1 % | 3.8 % | +22 % |
| Search Impression Share Lost (Budget) | 21 % | 10 % | -11 pp |
Case study: Multi-store e-commerce client, $180 k/mo spend, implemented May 2025.
8. Maintenance Playbook
- Daily: Auto-apply exact negatives flagged > $50 wasted.
- Weekly: Human review borderline terms (brand adjacents).
- Quarterly:
  - Refresh sentiment dictionary; include new slang.
  - Merge overlapping lists; dedupe across MCC.
  - Check PMax negative limit: now 10 k, but Google hints at “list support later this year”.
9. Common Pitfalls (And How to Dodge Them)
| Pitfall | Fix |
|---|---|
| Over-negation hurting volume | Use Phrase-match negatives sparingly; lean on exact. |
| Duplicates across lists | Schedule a nightly DISTINCT dedupe job. |
| Mismatch with Smart Bidding | Exclude only truly irrelevant searches; let tCPA flex on edge cases. |
| Ignoring account-level negatives | Set master exclusions once; inherit for PMax & Shopping. |
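The dedupe fix can be prototyped in Python before you move it into a scheduled SQL job. This sketch assumes each list is a sequence of `(text, match_type)` pairs, and that negatives differing only in letter case or spacing count as duplicates. `dedupe_negatives` is a hypothetical helper name.

```python
def dedupe_negatives(
    lists: dict[str, list[tuple[str, str]]]
) -> dict[str, list[tuple[str, str]]]:
    """Keep the first occurrence of each (normalised text, match type) across all lists."""
    seen: set[tuple[str, str]] = set()
    result: dict[str, list[tuple[str, str]]] = {}
    for list_name, entries in lists.items():
        kept = []
        for text, match_type in entries:
            # Normalise case and collapse repeated whitespace before comparing.
            key = (" ".join(text.lower().split()), match_type.upper())
            if key in seen:
                continue
            seen.add(key)
            kept.append(key)
        result[list_name] = kept
    return result
```

List order matters here: whichever list comes first keeps the keyword, so pass your master exclusion list first and let brand-level lists give up the duplicates.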
Heads-up: Some advertisers reported account-level negatives “disappearing” after bulk edits during a January 2025 Ads UI glitch. Double-check change history if you saw sudden traffic spikes that week.
10. The Road Ahead
- Shared Lists for PMax – Announced but not live; early testers see 15 % faster rollout than ad-hoc PMax negatives.
- LLM-powered auto-labeling – Google’s AI Essentials hints at GPT-like classifiers inside Ads; expect closed beta Q4 2025.
- Cross-channel negatives – Unified for YouTube, Discover, and DV360 (speculative, but logical next step).
11. Internal Resources & Next Steps
- Further tighten query control with our SKAG Structure Guide.
- Ready for Performance Max mastery? Read The 2025 PMax Optimization Bible.
Conclusion
A robust negative keyword scanner isn’t a nice-to-have—it’s the throttle that keeps Google’s ML from joyriding your budget. Whether you code your own or plug in a SaaS tool, the combination of real-time query mining, semantic outlier detection, and bulk application at the account level delivers measurable gains: lower CPA, higher Quality Score, and richer, more inclusive messaging. Start small, iterate weekly, and watch your spend efficiency climb.
Last validated: July 13 2025. If Google deprecates APIs or UI flows referenced above, we’ll update this guide within 48 h.

