A Deep Dive into AWS Billing: CUR, FOCUS, and the Dataset Most Teams Underuse

Your AWS billing export is not an invoice. It is a timestamped, service-attributed, resource-attributed event log: the same dataset that detects cost anomalies and security breaches, if you know which fields to query.

Most teams set up Cost Explorer, check the monthly total, and call it visibility. Cost Explorer is useful for trend charts and high-level breakdowns. It is not useful for answering the questions that actually matter: why did this specific resource start costing more on Tuesday, what is the cross-service billing signature of this AI workload, and does this data transfer pattern indicate exfiltration or legitimate use?

Those questions require the raw billing dataset, and knowing how to structure it for analysis.

Two Export Formats: CUR 2.0 and FOCUS

AWS provides two billing export formats through AWS Data Exports. They are not interchangeable: they serve different use cases and have meaningfully different schemas.

| Dimension | CUR 2.0 (Legacy path) | FOCUS 1.2 (Recommended path) |
|---|---|---|
| Schema standard | AWS-proprietary, AWS-only | FinOps Open Cost and Usage Specification: same schema across AWS, Azure, GCP |
| Column count | 300+ columns, many service-specific | 48 columns (43 FOCUS spec + 5 AWS x_ columns), compact and queryable |
| ServiceName format | Inconsistent casing and naming across services | Standardized (Amazon OpenSearch Service, Amazon Bedrock), reliable for pattern matching |
| AI billing fields | Usage type in lineItem/UsageType; format varies per service | x_UsageType: SearchOCU, InvokeModelInference; queryable with LIKE patterns |
| Multi-cloud use | AWS only; requires schema normalization to join with Azure or GCP data | Join directly with Azure and GCP FOCUS exports: same column names, same value formats |
| Split cost allocation (ECS/EKS) | Supported | Not yet supported in FOCUS; use CUR 2.0 if this is required |
| Recommended for new implementations | Only if you need split cost allocation | Yes, for all other use cases |

If you are building a new billing pipeline today, start with FOCUS 1.2. If you have an existing CUR-based pipeline, run FOCUS exports in parallel before migrating; the field mapping requires attention and some queries need to be rewritten. CUR 2.0 is not being deprecated imminently, but FOCUS is the direction AWS and every major cloud provider are converging on.
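One way to run the parallel validation is to reconcile per-service totals from the two exports before cutting over. The following is a minimal Python sketch, not a definitive method. It assumes you have already aggregated each export into a mapping of service name to total cost for the same billing period and normalized the CUR service names to the FOCUS labels. The function name `reconcile_totals`, the 1% tolerance, and the sample figures are all hypothetical.

```python
from typing import Dict, List, Tuple


def reconcile_totals(
    cur_totals: Dict[str, float],
    focus_totals: Dict[str, float],
    tolerance: float = 0.01,
) -> List[Tuple[str, float, float]]:
    """Return (service, cur_cost, focus_cost) for every service whose totals
    differ by more than `tolerance` (relative), including services that
    appear in only one of the two exports."""
    mismatches = []
    for service in sorted(set(cur_totals) | set(focus_totals)):
        cur_cost = cur_totals.get(service, 0.0)
        focus_cost = focus_totals.get(service, 0.0)
        largest = max(abs(cur_cost), abs(focus_cost))
        if largest == 0:
            continue  # zero in both exports: nothing to compare
        if abs(cur_cost - focus_cost) / largest > tolerance:
            mismatches.append((service, cur_cost, focus_cost))
    return mismatches


# Hypothetical per-service totals for one billing period.
cur = {"Amazon Bedrock": 120.00, "Amazon EC2": 950.40, "Amazon S3": 31.10}
focus = {"Amazon Bedrock": 120.00, "Amazon EC2": 951.00,
         "Amazon OpenSearch Service": 700.00}

for service, cur_cost, focus_cost in reconcile_totals(cur, focus):
    print(f"{service}: CUR={cur_cost:.2f} FOCUS={focus_cost:.2f}")
```

Services that show up on only one side are usually a naming or mapping difference rather than a missing charge, which is why they are reported rather than silently dropped.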

Setting Up the Pipeline: AWS Data Exports → S3 → Athena

The standard architecture for billing analysis is a three-layer stack: raw export files in S3, an Athena table pointing at those files, and SQL queries for analysis. The entire setup takes under an hour the first time.

Step 1: Enable AWS Data Exports

AWS Console → AWS Cost Management → Data Exports → Create export. Configuration choices that matter: FOCUS 1.2 with AWS columns as the export type, hourly time granularity, and Parquet as the file format.

Step 2: Create the Athena table (FOCUS schema)

CREATE EXTERNAL TABLE focus_billing (
  billingaccountid        STRING,
  billingaccountname      STRING,
  billingcurrency         STRING,
  billingperiodend        TIMESTAMP,
  billingperiodstart      TIMESTAMP,
  chargeclass             STRING,
  chargecategory          STRING,
  chargedescription       STRING,
  chargefrequency         STRING,
  chargeperiodend         TIMESTAMP,
  chargeperiodstart       TIMESTAMP,
  commitmentdiscountcategory STRING,
  commitmentdiscountid    STRING,
  commitmentdiscountname  STRING,
  commitmentdiscounttype  STRING,
  consumedquantity        DOUBLE,
  consumedunit            STRING,
  contractedcost          DOUBLE,
  contractedunitprice     DOUBLE,
  effectivecost           DOUBLE,
  invoiceissuer           STRING,
  listcost                DOUBLE,
  listunitprice           DOUBLE,
  pricingcategory         STRING,
  pricingquantity         DOUBLE,
  pricingunit             STRING,
  provider                STRING,
  publisher               STRING,
  regionid                STRING,
  regionname              STRING,
  resourceid              STRING,
  resourcename            STRING,
  resourcetype            STRING,
  servicecategory         STRING,
  servicename             STRING,
  skuid                   STRING,
  skuname                 STRING,
  subaccountid            STRING,
  subaccountname          STRING,
  tags                    STRING,
  billedcost              DOUBLE,
  -- AWS x_ columns
  x_costcategories        STRING,
  x_discounts             STRING,
  x_operation             STRING,
  x_servicecode           STRING,
  x_usagetype             STRING
)
STORED AS PARQUET
LOCATION 's3://YOUR-BILLING-BUCKET/focus-export/'
TBLPROPERTIES ('parquet.compress'='SNAPPY');

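If you declare the table with a PARTITIONED BY clause, run MSCK REPAIR TABLE focus_billing after creating it to load existing partitions; the command only applies to partitioned tables, and the DDL above is unpartitioned. For ongoing updates, set up an AWS Glue crawler on the same S3 path; it will pick up new export files automatically without manual table updates.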

Step 3: Verify the data with a baseline query

-- Sanity check: total billed cost by service, last 30 days
SELECT
  servicename,
  SUM(billedcost) AS total_cost,
  COUNT(DISTINCT subaccountid) AS account_count,
  COUNT(DISTINCT resourceid) AS resource_count
FROM focus_billing
WHERE chargeperiodstart >= CAST(DATE_ADD('day', -30, CURRENT_DATE) AS TIMESTAMP)
  AND chargecategory = 'Usage'
  AND chargeclass = 'Regular'
GROUP BY servicename
ORDER BY total_cost DESC
LIMIT 20;

If this returns results, the pipeline is working. If the table is empty, check S3 bucket permissions, the Glue crawler state, and whether the export has delivered its first file (new exports can take up to 24 hours for the first delivery).

The Fields That Actually Matter

A FOCUS export has 48 columns. In practice, 10–12 drive almost all useful analysis. Understanding what each actually contains, and what it does not, prevents the most common query mistakes.

For cost attribution and anomaly detection

| Field | What it actually contains | Common mistake |
|---|---|---|
| ServiceName | The AWS service responsible for the charge: Amazon OpenSearch Service, Amazon Bedrock, Amazon EC2. Standardized in FOCUS, so reliable for LIKE pattern matching across accounts. | Treating ServiceName as sufficient attribution for AI workloads. A Bedrock Knowledge Base charge appears as Amazon OpenSearch Service, so ServiceName alone misleads. |
| x_UsageType | The specific billing dimension within a service: SearchOCU, IndexingOCU, InvokeModelInference, BoxUsage:ml.g4dn.xlarge, NatGateway-Bytes. This is the field that distinguishes what is actually being billed. | Filtering only on ServiceName and missing the usage type breakdown. Two services can share a ServiceName but have entirely different cost drivers in x_UsageType. |
| ResourceId | The ARN or ID of the specific resource being billed. For OpenSearch collections: bedrock-knowledge-base-<uuid>. For EC2: instance ID. Required for identifying the specific resource to remediate. | Aggregating by ServiceName and assuming all resources in a service behave the same. Anomalies are almost always resource-specific. |
| ConsumedQuantity vs BilledCost | ConsumedQuantity is how much was used (OCU-hours, GB, tokens). BilledCost is what was charged. For OCU billing, ConsumedQuantity is always 1.0; the billing floor is in BilledCost, not in usage quantity. | Using ConsumedQuantity to detect idle resources in OCU-billed services. A perfectly idle OpenSearch collection still shows ConsumedQuantity = 1.0 every hour. |
| SubAccountId | The AWS account ID that incurred the charge. For organizations with many accounts, this is the primary grouping key for cross-account analysis. | Analyzing only at the payer account level and missing account-specific anomalies that average out in aggregation. |
| RegionId | The AWS region where the resource ran: us-east-1, eu-west-1. For security detection: any region that has never appeared in an account's billing history is a zero-threshold signal. | Not tracking region history. New-region charges are the earliest billing signal of credential compromise, but only detectable if you know what regions are "normal" for each account. |
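The ServiceName and x_UsageType rows describe a pattern that is easier to see in code: OpenSearch OCU charges in an account that shows no Bedrock inference at all. The sketch below is one hedged way to check for it in Python over rows already pulled from the export. The dict keys mirror the lowercase Athena column names used in this article. The function name, the substring matching on usage type, and the sample rows are assumptions for illustration, and real usage type strings may carry additional prefixes.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Usage-type fragments taken from the article; matched as substrings,
# in the same spirit as a SQL LIKE '%SearchOCU%' filter.
OCU_MARKERS = ("SearchOCU", "IndexingOCU")
INFERENCE_MARKER = "InvokeModelInference"


def ocu_without_inference(rows: Iterable[Dict]) -> List[Dict]:
    """Return accounts that are billed for OpenSearch OCUs but have no
    Bedrock inference usage in the same set of rows."""
    ocu_cost = defaultdict(float)
    accounts_with_inference = set()
    for row in rows:
        usage_type = row.get("x_usagetype") or ""
        account = row["subaccountid"]
        if any(marker in usage_type for marker in OCU_MARKERS):
            ocu_cost[account] += row["billedcost"]
        elif INFERENCE_MARKER in usage_type:
            accounts_with_inference.add(account)
    return [
        {"subaccountid": account, "ocu_cost": round(cost, 2)}
        for account, cost in sorted(ocu_cost.items())
        if account not in accounts_with_inference
    ]


# Hypothetical rows: account ...3333 pays for OCUs with no inference traffic.
sample_rows = [
    {"subaccountid": "111122223333", "x_usagetype": "SearchOCU", "billedcost": 5.76},
    {"subaccountid": "111122223333", "x_usagetype": "IndexingOCU", "billedcost": 5.76},
    {"subaccountid": "444455556666", "x_usagetype": "SearchOCU", "billedcost": 5.76},
    {"subaccountid": "444455556666", "x_usagetype": "InvokeModelInference", "billedcost": 0.42},
]
print(ocu_without_inference(sample_rows))
```

The check keys on BilledCost rather than ConsumedQuantity, for the reason given in the ConsumedQuantity row: the quantity does not drop when the collection is idle.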

The Security Signal Layer

AWS explicitly documents the use of billing data for security risk identification. The AWS Security Maturity Model includes billing alarms as a Quick Win: the first tier of security controls, before SIEM, before GuardDuty, before any advanced tooling. The rationale: when an attacker compromises AWS credentials, the first observable evidence is almost always a cost event, not a log event.

Three billing signals that have a direct security interpretation:

New-region zero-threshold rule

Any RegionId that has never appeared in an account's billing history is an immediate escalation: not an alert threshold, a threshold of zero. A legitimate new region deployment goes through a change management process; it does not appear silently in the billing data. Cryptomining campaigns, credential compromise, and shadow IT deployments all share this signature: new region, new service, first charge appearing with no corresponding internal authorization.

-- New region detection: find region/account combinations with no prior 90-day history
WITH historical_regions AS (
  SELECT DISTINCT subaccountid, regionid
  FROM focus_billing
  WHERE chargeperiodstart < CAST(DATE_ADD('day', -7, CURRENT_DATE) AS TIMESTAMP)
    AND chargeperiodstart >= CAST(DATE_ADD('day', -90, CURRENT_DATE) AS TIMESTAMP)
    AND chargecategory = 'Usage'
),
recent_regions AS (
  -- GROUP BY already yields one row per account/region, so DISTINCT is not needed
  SELECT subaccountid, regionid, MIN(chargeperiodstart) AS first_seen
  FROM focus_billing
  WHERE chargeperiodstart >= CAST(DATE_ADD('day', -7, CURRENT_DATE) AS TIMESTAMP)
    AND chargecategory = 'Usage'
  GROUP BY subaccountid, regionid
)
SELECT
  r.subaccountid,
  r.regionid,
  r.first_seen,
  'NEW_REGION' AS signal_type
FROM recent_regions r
LEFT JOIN historical_regions h
  ON r.subaccountid = h.subaccountid AND r.regionid = h.regionid
WHERE h.regionid IS NULL
ORDER BY r.first_seen DESC;

Data transfer egress spike

Exfiltration generates DataTransfer-Out-Bytes charges. The signal is not absolute volume; it is a ratio change: egress increasing faster than compute or API call volume in the same account and period. A legitimate application that grows has both compute and egress growing together. An exfiltration event has egress growing against flat or declining compute.

-- Egress-to-compute ratio anomaly: egress growing faster than API/compute spend
WITH daily_costs AS (
  SELECT
    subaccountid,
    DATE(chargeperiodstart) AS charge_date,
    SUM(CASE WHEN x_usagetype LIKE '%DataTransfer-Out%' THEN billedcost ELSE 0 END) AS egress_cost,
    SUM(CASE WHEN x_usagetype NOT LIKE '%DataTransfer%' THEN billedcost ELSE 0 END) AS compute_cost
  FROM focus_billing
  WHERE chargeperiodstart >= CAST(DATE_ADD('day', -14, CURRENT_DATE) AS TIMESTAMP)
    AND chargecategory = 'Usage'
  GROUP BY subaccountid, DATE(chargeperiodstart)
)
SELECT
  subaccountid,
  charge_date,
  egress_cost,
  compute_cost,
  CASE WHEN compute_cost > 0 THEN egress_cost / compute_cost ELSE NULL END AS egress_ratio
FROM daily_costs
WHERE egress_cost > 5.0
ORDER BY egress_ratio DESC NULLS LAST
LIMIT 50;
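The query above ranks days by their egress ratio, but the signal described is a change in that ratio. The following is a minimal Python sketch of that last step, assuming the query's rows for a single account are passed in oldest first. The function name, the 3x factor, and the three-day minimum history are hypothetical tuning choices, not values from AWS or from this article.

```python
from statistics import mean
from typing import List, Tuple


def ratio_spikes(
    daily: List[Tuple[str, float, float]],
    factor: float = 3.0,
    min_history: int = 3,
) -> List[Tuple[str, float]]:
    """daily holds (charge_date, egress_cost, compute_cost) for one account,
    oldest first. A day is flagged when its egress/compute ratio exceeds
    `factor` times the mean ratio of the unflagged days before it."""
    history: List[float] = []
    flagged: List[Tuple[str, float]] = []
    for charge_date, egress_cost, compute_cost in daily:
        if compute_cost <= 0:
            continue  # ratio undefined, mirroring the NULL branch in the query
        ratio = egress_cost / compute_cost
        if len(history) >= min_history and ratio > factor * mean(history):
            flagged.append((charge_date, round(ratio, 4)))
        else:
            # Only normal days feed the baseline, so a sustained spike
            # cannot gradually raise its own threshold.
            history.append(ratio)
    return flagged


# Hypothetical daily rows for one account.
sample = [
    ("2025-01-01", 10.0, 1000.0),
    ("2025-01-02", 11.0, 1100.0),
    ("2025-01-03", 9.0, 900.0),
    ("2025-01-04", 12.0, 1000.0),
    ("2025-01-05", 80.0, 1000.0),  # egress jumps while compute stays flat
]
print(ratio_spikes(sample))
```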

New service in an account

GPU-class compute appearing in an account with no prior GPU spend is a cryptomining signal. A new managed AI service appearing with no prior activity is either shadow IT or credential compromise. The zero-threshold rule applies: any ServiceName + x_UsageType combination not seen in the prior 30 days is worth reviewing within 24 hours.
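The same zero-threshold comparison can be sketched in Python over rows exported from the table. This is an illustration under stated assumptions rather than a reference implementation: each row is a dict with `subaccountid`, `servicename`, `x_usagetype`, and a `charge_date` already truncated to a `date`. The function name, the one-day "recent" window, and the sample rows are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, Iterable, List, Tuple

Combo = Tuple[str, str, str]  # (subaccountid, servicename, x_usagetype)


def new_combinations(
    rows: Iterable[Dict],
    today: date,
    lookback_days: int = 30,
    recent_days: int = 1,
) -> List[Combo]:
    """Return account/service/usage-type combinations charged in the recent
    window that never appeared in the `lookback_days` before that window."""
    recent_start = today - timedelta(days=recent_days)
    history_start = recent_start - timedelta(days=lookback_days)
    seen_before = set()
    seen_recently = set()
    for row in rows:
        combo = (row["subaccountid"], row["servicename"], row["x_usagetype"])
        day = row["charge_date"]
        if history_start <= day < recent_start:
            seen_before.add(combo)
        elif day >= recent_start:
            seen_recently.add(combo)
    return sorted(seen_recently - seen_before)


# Hypothetical rows: a GPU instance type appears with no prior history.
sample_rows = [
    {"subaccountid": "111122223333", "servicename": "Amazon EC2",
     "x_usagetype": "BoxUsage:t3.micro", "charge_date": date(2025, 1, 10)},
    {"subaccountid": "111122223333", "servicename": "Amazon EC2",
     "x_usagetype": "BoxUsage:t3.micro", "charge_date": date(2025, 1, 31)},
    {"subaccountid": "111122223333", "servicename": "Amazon EC2",
     "x_usagetype": "BoxUsage:g4dn.xlarge", "charge_date": date(2025, 1, 31)},
]
for combo in new_combinations(sample_rows, today=date(2025, 2, 1)):
    print("NEW_SERVICE_USAGE", combo)
```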

The Latency Problem, and How to Work Around It

AWS Cost Anomaly Detection, the native AWS service for billing-based alerting, has a published detection latency of up to 24 hours, because it operates on CUR data, which delivers with that lag. For a cryptomining campaign that generates hundreds of dollars per hour, 24-hour detection latency means potentially $2,400+ in damage before the first alert fires.

The workaround is to run behavioral queries directly against fresh FOCUS export files. AWS delivers updated billing exports up to three times per day. A Lambda function triggered by an S3 event on new export delivery can run the new-region query within minutes of a new charge appearing, reducing detection latency from 24 hours to under 30 minutes for the most critical security patterns.

The architecture: S3 event notification → Lambda trigger → Athena query → SNS alert. The entire pipeline runs in under 60 seconds per export delivery. The Lambda function needs only a small set of permissions: S3 read on the billing bucket, Athena query execution (which in turn requires read access to the Glue Data Catalog and write access to the Athena query results location), and SNS publish. Cost: negligible. Athena charges $5 per TB scanned, and a behavioral query against a filtered partition of billing data scans megabytes, not terabytes.
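A minimal sketch of that Lambda function in Python with boto3 might look like the following. It is an outline under assumptions, not a production handler. The environment variable names (`ATHENA_DATABASE`, `ATHENA_OUTPUT`, `ALERT_TOPIC_ARN`), the results bucket placeholder, and the timeout values are hypothetical. The SQL is a condensed copy of the new-region query from earlier in this article. Only the first page of Athena results is read, which is enough for an alert but not for large result sets.

```python
import os
import time

# Condensed version of the new-region detection query shown earlier.
NEW_REGION_SQL = """
WITH historical AS (
  SELECT DISTINCT subaccountid, regionid FROM focus_billing
  WHERE chargeperiodstart < CAST(DATE_ADD('day', -7, CURRENT_DATE) AS TIMESTAMP)
    AND chargeperiodstart >= CAST(DATE_ADD('day', -90, CURRENT_DATE) AS TIMESTAMP)
    AND chargecategory = 'Usage'
),
recent AS (
  SELECT subaccountid, regionid, MIN(chargeperiodstart) AS first_seen
  FROM focus_billing
  WHERE chargeperiodstart >= CAST(DATE_ADD('day', -7, CURRENT_DATE) AS TIMESTAMP)
    AND chargecategory = 'Usage'
  GROUP BY subaccountid, regionid
)
SELECT r.subaccountid, r.regionid, r.first_seen
FROM recent r LEFT JOIN historical h
  ON r.subaccountid = h.subaccountid AND r.regionid = h.regionid
WHERE h.regionid IS NULL
"""


def run_query(athena, sql, database, output_location,
              poll_seconds=2.0, timeout_seconds=50.0):
    """Start an Athena query, poll until it finishes, and return the data
    rows (header row removed) as lists of strings."""
    execution_id = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = athena.get_query_execution(QueryExecutionId=execution_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state == "SUCCEEDED":
            break
        if state in ("FAILED", "CANCELLED") or time.monotonic() > deadline:
            raise RuntimeError(f"Athena query {execution_id} did not succeed (state={state})")
        time.sleep(poll_seconds)
    rows = athena.get_query_results(QueryExecutionId=execution_id)["ResultSet"]["Rows"]
    # For SELECT queries the first returned row holds the column headers.
    return [[cell.get("VarCharValue", "") for cell in row["Data"]] for row in rows[1:]]


def handler(event, context, athena=None, sns=None):
    """Entry point for the S3 notification fired when a new export file lands.
    The clients are injectable so the logic can be exercised without AWS."""
    if athena is None or sns is None:
        import boto3  # imported lazily; available by default in the Lambda runtime
        athena = athena or boto3.client("athena")
        sns = sns or boto3.client("sns")
    findings = run_query(
        athena,
        NEW_REGION_SQL,
        database=os.environ.get("ATHENA_DATABASE", "default"),
        output_location=os.environ.get("ATHENA_OUTPUT", "s3://YOUR-ATHENA-RESULTS-BUCKET/"),
    )
    if findings:
        lines = [f"account={a} region={r} first_seen={t}" for a, r, t in findings]
        sns.publish(
            TopicArn=os.environ.get("ALERT_TOPIC_ARN", ""),
            Subject="NEW_REGION billing signal",
            Message="\n".join(lines),
        )
    return {"findings": len(findings)}
```

Passing the clients as optional arguments is a design choice for testability: a unit test can supply stand-in objects with the same four methods and never touch an AWS account.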

What to Build First

  1. Enable FOCUS 1.2 exports today. AWS Data Exports → Create export → FOCUS 1.2 with AWS columns → Hourly → Parquet. Run in parallel with any existing CUR export until you have validated the FOCUS data matches.
  2. Create the Athena table. Use the DDL above. Run the baseline query to confirm data is flowing. This is your billing source of truth; everything else depends on it.
  3. Add the new-region detection query. Schedule it as a daily Athena Named Query or wrap it in a Lambda triggered by new export delivery. Route results to an SNS topic that hits your security channel. This is the highest-ROI security control you can add in under an hour.
  4. Build service-specific cost baselines. For each major service in your account, compute a 30-day average daily cost by service and region. Any day more than 2× baseline is an anomaly worth investigating. This catches cost spikes before they become month-end surprises.
  5. Add AI-specific behavioral queries. The cross-service patterns (OpenSearch OCU with no Bedrock inference, SageMaker endpoint hours with no invocations) are not detectable in Cost Explorer; they require joining across service rows in the billing dataset. These are the patterns that standard FinOps tooling misses.
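The baseline rule in step 4 can be sketched in a few lines of Python. This is a hedged illustration, not a tuned detector. It assumes one row per day per service and region, with zero-cost days included, so that the previous 30 rows correspond to the previous 30 days. The function name and the sample series are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean
from typing import Iterable, List, Tuple

DailyCost = Tuple[date, str, str, float]  # (charge_date, servicename, regionid, cost)


def baseline_anomalies(
    daily_costs: Iterable[DailyCost],
    multiplier: float = 2.0,
    window: int = 30,
) -> List[Tuple[date, str, str, float, float]]:
    """Flag days whose cost exceeds `multiplier` times the average of the
    preceding `window` days for the same service and region. Days without a
    full window of history are skipped rather than guessed at."""
    series = defaultdict(list)
    for charge_date, service, region, cost in daily_costs:
        series[(service, region)].append((charge_date, cost))
    flagged = []
    for (service, region), points in series.items():
        points.sort()
        for i, (charge_date, cost) in enumerate(points):
            prior = [c for _, c in points[max(0, i - window):i]]
            if len(prior) < window:
                continue
            baseline = mean(prior)
            if baseline > 0 and cost > multiplier * baseline:
                flagged.append((charge_date, service, region, cost, round(baseline, 2)))
    return sorted(flagged)


# Hypothetical series: 30 flat days at 10.00, then one day at 25.00.
start = date(2025, 1, 1)
sample = [(start + timedelta(days=i), "Amazon EC2", "us-east-1", 10.0) for i in range(30)]
sample.append((start + timedelta(days=30), "Amazon EC2", "us-east-1", 25.0))

for row in baseline_anomalies(sample):
    print(row)
```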

See which billing patterns your current setup can detect

The DropInFinOps free assessment maps your billing export configuration against the behavioral query library, showing which cost and security patterns are detectable today and which require a FOCUS migration or additional instrumentation.