Decoding Your GA4 Data: How to Track and Segment Traffic from LLMs

Understanding where your website traffic comes from is increasingly important, especially with the rise of AI tools and Large Language Models (LLMs). At Mercury Technology Solutions, staying ahead of these shifts and leveraging data for strategic insights is fundamental to how we operate and advise our clients.

We're seeing AI tools like ChatGPT, Perplexity, Gemini, and others become significant points of information discovery. While Google's own AI Overviews data is still blended within Search Console, traffic is coming from these AI platforms, and understanding it is key to adapting your digital strategy effectively.

TL;DR: As users increasingly get information from AI tools (ChatGPT, Perplexity, Gemini, etc.), tracking this referral traffic in Google Analytics 4 (GA4) is vital. This guide shows you how to segment LLM traffic using regex filters within GA4 Explore reports for quick insights, or via Looker Studio (either with direct filters or by setting up a custom channel group in GA4 Admin) for more detailed, ongoing reporting. Understanding this traffic helps refine SEO and content strategies for the evolving digital landscape.

The Shifting Tides: Why LLM Traffic Segmentation Matters

The way people find and interact with information online is changing rapidly. Whether they're using conversational AI like ChatGPT, specialized tools like Perplexity, or integrated assistants, LLMs are becoming major gateways to content. Over recent months, we've observed a noticeable increase in referral traffic from these sources to our own and our clients' websites.

While the debate continues on whether to classify these tools strictly as "search engines," their functional impact is undeniable. They influence discovery and user journeys. Therefore, understanding the volume and behavior of traffic originating from LLMs is crucial for several reasons:

  • Adapting Strategy: It helps inform your content and SEO strategies (like SEVO and LLM-SEO) to meet users on the platforms they prefer.
  • Measuring Impact: It allows you to gauge the effectiveness of efforts aimed at increasing visibility within AI environments.
  • Understanding User Behavior: Analyzing this segment reveals how users arriving from AI sources interact with your site differently.
  • Identifying Opportunities: It highlights which AI platforms are driving relevant traffic.

Methods for Tracking LLM & Chatbot Traffic in GA4

There are practical ways to isolate and analyze this traffic within your existing analytics setup. The two primary methods I recommend, depending on your needs for reporting and access levels, are using GA4 Explore reports and Looker Studio.

  • GA4 Explore Reports: Excellent for quick analysis, visualizing trends, and sharing specific insights within the GA4 interface.
  • Looker Studio: Ideal for creating shareable, potentially more customized dashboards for ongoing monitoring and deeper dives (e.g., analyzing landing pages or events specific to LLM traffic).

Let's break down how to set these up.

Method 1: Quick Analysis with GA4 Explore Reports

This is the simplest way to get an initial view using a regular expression (regex) filter.

  1. Create Exploration: In GA4, navigate to Explore and start a new Blank exploration.
  2. Set Dimensions & Metrics:
    • In the Variables column, import Session source / medium as a Dimension.
    • Import Sessions, Engaged sessions, and potentially Conversions or Key events as Metrics.
  3. Build the Report: Drag Session source / medium to Rows and your chosen Metrics (e.g., Sessions) to Values in the Tab settings column.
  4. Create LLM Segment:
    • In the Variables column, click the '+' next to Segments and choose Session segment.
    • Name your segment something descriptive (e.g., "LLM / AI Traffic").
    • Under Include sessions when:, add a condition: Session source / medium matches regex.
    • Paste the following regex pattern (or an updated version):
      ^.*ai|.*\.openai.*|.*copilot.*|.*chatgpt.*|.*gemini.*|.*gpt.*|.*neeva.*|.*writesonic.*|.*nimble.*|.*outrider.*|.*perplexity.*|.*google.*bard.*|.*bard.*google.*|.*bard.*|.*edgeservices.*|.*you\.com.*|.*pi\.ai.*|.*claude\.ai.*|.*anthropic.*|.*astastic.*|.*copy\.ai.*|.*bnngpt.*|.*gemini.*google.*$
      
      (Important Note: This regex identifies common AI/LLM sources as of early 2025. New tools emerge constantly, and naming conventions can change. This list will require periodic review and updating to remain accurate.)
    • Click Apply and then Save and Apply.
  5. Analyze: Your report table will now be filtered to show only traffic from sources matching the regex. You can see which specific LLM sources are driving traffic and their engagement levels.
  6. (Optional) Visualize Trend: Duplicate the exploration tab. Change the Visualization type to Line chart. Drag Sessions (or another metric) to Values. This will show LLM traffic volume over time.
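
Before relying on the pattern, it can help to check it offline against a few source / medium values. The following is a minimal Python sketch, not part of GA4 itself. The sample values are hypothetical, and Python's `re` module stands in for GA4's regex engine, which is an assumption (the pattern uses only basic syntax, so behavior should be close). It compares full matching with partial matching, because tools differ on whether a regex condition must match the whole value.

```python
import re

# Pattern copied verbatim from step 4, split across lines only for readability.
LLM_PATTERN = (
    r"^.*ai|.*\.openai.*|.*copilot.*|.*chatgpt.*|.*gemini.*|.*gpt.*|.*neeva.*"
    r"|.*writesonic.*|.*nimble.*|.*outrider.*|.*perplexity.*|.*google.*bard.*"
    r"|.*bard.*google.*|.*bard.*|.*edgeservices.*|.*you\.com.*|.*pi\.ai.*"
    r"|.*claude\.ai.*|.*anthropic.*|.*astastic.*|.*copy\.ai.*|.*bnngpt.*"
    r"|.*gemini.*google.*$"
)

# Hypothetical "Session source / medium" values, for illustration only.
SAMPLES = [
    "chatgpt.com / referral",
    "perplexity.ai / referral",
    "gemini.google.com / referral",
    "mail.google.com / referral",
    "google / organic",
]


def is_llm_source(value: str, full_match: bool = True) -> bool:
    """Return True if the value matches the LLM pattern.

    full_match=True requires the whole value to match;
    full_match=False accepts a match anywhere in the value.
    """
    matcher = re.fullmatch if full_match else re.search
    return matcher(LLM_PATTERN, value) is not None


for sample in SAMPLES:
    full = is_llm_source(sample, full_match=True)
    partial = is_llm_source(sample, full_match=False)
    print(f"{sample:32} full={full!s:5} partial={partial}")
```

One thing this check surfaces: under partial matching, the leading `^.*ai` alternative matches any value that merely contains "ai", so "mail.google.com / referral" is flagged as LLM traffic. Under full matching it is not. It is worth confirming which behavior your report uses before trusting the numbers.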

Method 2: Ongoing Reporting with Looker Studio

Looker Studio offers more flexibility for dashboards. You can use the same regex filter principle here in two ways:

A) Lightweight Approach (Any GA4 Access Level):

  1. Create Chart/Table: In Looker Studio, add a chart (e.g., Time series, Table) using your GA4 data source.
  2. Add Filter: Select the chart. In the Setup panel (usually on the right), scroll down to the Filter section and click Add a filter.
  3. Configure Filter:
    • Click Create a filter.
    • Give it a name (e.g., "LLM Source Filter").
    • Set the condition: Include > Session source / medium > Matches regex.
    • Paste the same regex pattern used in Method 1.
    • Click Save.
  4. Apply: The filter is now applied to that specific chart, showing only LLM traffic data. Repeat for any other charts where you want to isolate LLM traffic. This allows you to easily add LLM-specific views to existing dashboards.

B) In-Depth Approach (Requires GA4 Admin Access):

This method creates a dedicated channel group within GA4 itself, which can then be used cleanly in Looker Studio (and GA4 reports).

  1. Navigate in GA4: Go to Admin (bottom left gear icon).
  2. Find Channel Groups: Under the Property column > Data display, click Channel groups.
  3. Create New Group: Click Create new channel group.
  4. Name Group: Give the group a name (e.g., "Custom Channels incl. LLM"). Add a description if desired.
  5. Add LLM Channel: Click Add new channel.
    • Give the channel a specific name (e.g., "AI / LLM Referrals").
    • Under Channel conditions, set: Session source / medium matches regex.
    • Paste the same regex pattern used previously.
    • Click Save.
  6. Reorder Channels: Click Reorder. Drag your new "AI / LLM Referrals" channel above the default Referral channel. This ensures traffic matching the regex is assigned here first. Click Apply.
  7. Save Group: Click Save group in the top right.
  8. Use in Looker Studio: After GA4 processes the new group (allow up to 48 hours), you can select this "Custom Channels incl. LLM" group as your Default Channel Group dimension in Looker Studio reports for cleaner segmentation without needing chart-level filters.
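
The reorder step matters because channel rules are evaluated top to bottom and the first match wins. The following Python sketch illustrates that idea only. The rule definitions are simplified assumptions (a shortened LLM pattern and a Referral rule that just checks the medium), not GA4's actual channel definitions, and `assign_channel` is a hypothetical helper.

```python
import re
from typing import Callable, List, Tuple

# Shortened, illustrative pattern; not the full regex from Method 1.
LLM_RE = re.compile(r"chatgpt|perplexity|gemini|copilot|claude\.ai")

Rule = Tuple[str, Callable[[str], bool]]


def is_llm(source_medium: str) -> bool:
    """True if the value looks like an LLM source (illustrative rule)."""
    return LLM_RE.search(source_medium) is not None


def is_referral(source_medium: str) -> bool:
    """True if the medium is 'referral' (simplified stand-in for GA4's rule)."""
    return source_medium.endswith("/ referral")


def assign_channel(source_medium: str, rules: List[Rule]) -> str:
    """Return the name of the first rule that matches, in list order."""
    for name, predicate in rules:
        if predicate(source_medium):
            return name
    return "Unassigned"


llm_first = [("AI / LLM Referrals", is_llm), ("Referral", is_referral)]
referral_first = [("Referral", is_referral), ("AI / LLM Referrals", is_llm)]

value = "perplexity.ai / referral"
print(assign_channel(value, llm_first))       # LLM rule is checked first
print(assign_channel(value, referral_first))  # generic Referral rule wins
```

With the LLM channel listed first, the Perplexity session lands in "AI / LLM Referrals". With Referral listed first, the same session is absorbed by the generic rule and the LLM channel never sees it.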

Step-by-Step Tracking Summary

Method | Key Steps | GA4 Access Needed | Pros | Cons
GA4 Explore Report | Create Explore > Add Dims/Metrics > Create Session Segment (matches regex) > Apply Segment > Visualize | Viewer+ | Quick analysis within GA4, easy setup | Less flexible for dashboards, segment needs manual application
Looker Studio (Lightweight) | Create Chart > Add Filter > Configure Filter (matches regex on Session source/medium) > Save & Apply Filter to Chart | Viewer+ | Flexible dashboards, no Admin needed, apply filter per chart | Filter logic repeated per chart, relies solely on regex accuracy
Looker Studio (In-depth via GA4) | GA4 Admin: Create Channel Group > Add LLM Channel (matches regex) > Reorder > Save Group. Then Looker Studio: Use new Channel Group dimension in charts. | Admin | Cleaner segmentation in reports, consistent channel definition in GA4 | Requires Admin access, initial setup in GA4, processing delay (up to 48h)

Continuous Monitoring is Key

The AI landscape is fluid. New tools will emerge, and how platforms identify their traffic might change. The regex filter provided is a starting point and must be reviewed and updated periodically to remain effective. Regularly check your referral sources and refine the pattern to ensure you're accurately capturing traffic from relevant LLM tools.
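
One way to make that periodic review less manual is to list referral sources the pattern does not yet capture, ranked by sessions, and scan them for new AI tools. The following is a rough Python sketch under stated assumptions: the rows are hypothetical (source / medium, sessions) pairs such as you might export from a report, the pattern is a shortened stand-in for the full regex, and `unmatched_referrals` is an illustrative helper name.

```python
import re
from typing import List, Tuple

# Shortened stand-in for the full LLM regex; swap in your current pattern.
CURRENT_PATTERN = r".*chatgpt.*|.*perplexity.*|.*gemini.*"

# Hypothetical exported rows: (session source / medium, sessions).
ROWS = [
    ("chatgpt.com / referral", 120),
    ("newtool.example / referral", 45),
    ("news.example / referral", 300),
    ("google / organic", 900),
]


def unmatched_referrals(
    rows: List[Tuple[str, int]], pattern: str, min_sessions: int = 1
) -> List[Tuple[str, int]]:
    """Return referral rows the pattern misses, highest sessions first."""
    regex = re.compile(pattern)
    candidates = [
        (source_medium, sessions)
        for source_medium, sessions in rows
        if source_medium.endswith("/ referral")
        and sessions >= min_sessions
        and regex.fullmatch(source_medium) is None
    ]
    return sorted(candidates, key=lambda row: row[1], reverse=True)


for source_medium, sessions in unmatched_referrals(ROWS, CURRENT_PATTERN):
    print(f"{sessions:6d}  {source_medium}")
```

The output is only a review list. Someone still has to decide which of the unmatched sources are actually AI tools before adding them to the regex.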

Understanding this growing segment of traffic is no longer optional for businesses serious about data-driven digital strategy. Setting up this tracking now provides the insights needed to adapt, optimize, and maintain visibility as AI continues to reshape how users discover information online. At Mercury Technology Solutions, we integrate this level of analysis into our client strategies, ensuring decisions are grounded in the realities of the evolving digital ecosystem.

LLM Traffic Tracking FAQ

Q1: Why is my tracked LLM traffic low? It could be due to several factors: the LLM platforms you're interested in might not be sending significant referral traffic yet, your content might not be frequently surfaced by those LLMs, or the regex filter might be missing relevant source identifiers. Ensure your regex is up-to-date.

Q2: How often should I update the regex filter? Review it quarterly, or more often if you hear about significant new AI tools gaining traction or suspect changes in how existing tools identify themselves (e.g., changes in subdomains or parameters).

Q3: Can I track traffic from specific AI tools separately using this method? Yes. You can modify the regex or create separate segments/filters using more specific patterns for individual tools (e.g., .*perplexity.* for Perplexity, or .*openai\.com.* for certain OpenAI referrers, though specifics can vary).
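
As a rough illustration of per-tool tracking, the Python sketch below labels each source / medium value with a tool name and totals sessions per tool. The tool names, patterns, and sample rows are all assumptions chosen for the example. Adjust them to the sources you actually see in your reports.

```python
import re
from collections import Counter
from typing import List, Optional, Tuple

# Illustrative per-tool patterns; checked in insertion order.
TOOL_PATTERNS = {
    "Perplexity": r".*perplexity.*",
    "ChatGPT / OpenAI": r".*chatgpt.*|.*openai.*",
    "Gemini": r".*gemini.*",
    "Claude": r".*claude\.ai.*",
}


def label_tool(source_medium: str) -> Optional[str]:
    """Return the first tool whose pattern matches the whole value, else None."""
    for tool, pattern in TOOL_PATTERNS.items():
        if re.fullmatch(pattern, source_medium):
            return tool
    return None


def sessions_by_tool(rows: List[Tuple[str, int]]) -> Counter:
    """Sum sessions per tool, ignoring rows that match no tool."""
    totals: Counter = Counter()
    for source_medium, sessions in rows:
        tool = label_tool(source_medium)
        if tool is not None:
            totals[tool] += sessions
    return totals


# Hypothetical rows: (session source / medium, sessions).
rows = [
    ("perplexity.ai / referral", 40),
    ("chatgpt.com / referral", 75),
    ("chat.openai.com / referral", 25),
    ("google / organic", 500),
]
print(sessions_by_tool(rows))
```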

Q4: Does this track all interactions users have with my content via AI? No. This method primarily tracks referral traffic – instances where a user clicks a link within an AI tool's response that leads to your website. It doesn't track instances where the AI summarizes your content without linking, or where a user reads the AI's summary and then searches for your brand separately.

Q5: Is the traffic from Google's AI Overviews included in this? Currently, traffic from Google's AI Overviews is generally blended with regular Google organic traffic data in GA4 and Search Console. The regex method focuses on traffic explicitly referred from other identifiable AI platforms and chatbots. Tracking AI Overviews directly requires different approaches, often involving analyzing Search Console performance data changes.

Q6: Can Mercury Technology Solutions help set up and analyze this tracking? Yes. As part of our analytics, SEO, and digital strategy services, we help clients implement robust tracking, interpret the data, and integrate insights from sources like LLM referrals into their overall strategy for improved decision-making and performance.

James Huang June 1, 2025