March 31, 2026 · 32 min read · 7 providers

AI Spending Tax Risks: R&D, Capitalization & Compliance

Practitioner analysis of AI tax exposure (2025–2026): classifying §174A R&E versus capitalizable software, cloud GPU cost allocation, R&D credit risks, transfer pricing, and audit playbooks

Key Finding

The IRS now operates 129 AI use cases for audit selection and enforcement (up from 54 in 2024); no AI-specific LB&I campaign has been announced, but general R&D credit campaigns and §174 conformity remain high-priority; examinations of 2025 returns will begin in 2027–2028

High confidence · Supported by Anthropic, OpenAI, Perplexity, Grok, OpenAI-Mini, Gemini-Lite, Grok-Premium
Justin Furniss

@Parallect.ai and @SecureCoders. Founder. Hacker. Father. Seeker of all things AI


The Hidden Tax Exposure in AI Spending: R&D Credit Qualification, Software Capitalization, and Compliance Risks (2025–2026)

A Definitive Cross-Provider Analysis for Senior Tax Professionals


Executive Summary

  • The OBBBA's §174A restoration is the most consequential AI tax development of 2025, but it creates as many traps as benefits. All seven providers confirm that the One Big Beautiful Bill Act (enacted July 4, 2025) [3] permanently restored immediate expensing for domestic R&E under new §174A, effective for tax years beginning after December 31, 2024. However, the foreign/domestic bifurcation (foreign R&E remains on 15-year amortization), the §280C coordination with §41 credits, and the patchwork of state nonconformity mean that companies applying federal treatment uniformly across all jurisdictions and all cost categories are creating material audit exposure — potentially $10M+ per $100M of AI spend.

  • The single highest-risk position in AI tax today is the misclassification of production inference compute as §174A R&E. Every provider independently identified the training-vs.-inference allocation as the most common and most consequential error. Cloud GPU costs for model training during development qualify as R&E [3]; costs for serving production inference to customers are ordinary §162 expenses. Companies without robust cloud cost tagging and allocation methodologies are overclaiming §174A deductions and §41 credits simultaneously, creating compounding exposure.

  • There is a complete absence of AI-specific IRS guidance — no TAMs, no PLRs, no revenue rulings, no court cases — and this regulatory vacuum cuts both ways. All providers confirm [3] that companies are operating on analogical reasoning from pre-AI software and R&D frameworks. The most aggressive positions (treating all training data costs as §174A, claiming §41 credits on AI-generated code workflows, treating fine-tuning as fully experimental) are defensible but untested. The IRS's deployment of AI-powered audit selection [3] means these positions will be tested at scale within 2–3 years.

  • Transfer pricing of AI IP is approaching a crisis point. The combination of AM 2025-001's reassertion of periodic adjustment authority [33], the absence of comparable transactions for proprietary LLMs [3], and the IRS's commitment to tripling large-corporation audit rates creates a perfect storm for companies that have migrated AI IP to low-tax jurisdictions without rigorous contemporaneous documentation. A $1B LLM undervalued by 30% creates a $300M adjustment exposure with 20–40% penalties.

  • Form 6765 Section G — the business-component-level reporting requirement — is the most important near-term documentation imperative. While Section G remains optional for tax year 2025 [2], it becomes mandatory for 2026. Companies that have not built project-level documentation infrastructure now will face a retroactive documentation crisis when the mandate takes effect, precisely as IRS examinations of 2025 returns begin.


Cross-Provider Consensus

The following findings were independently confirmed by multiple providers and represent the highest-reliability conclusions in this analysis.


1. OBBBA enacted §174A on July 4, 2025, permanently restoring immediate domestic R&E expensing

  • Confirmed by: Anthropic, OpenAI, Perplexity, Grok, OpenAI-Mini, Gemini-Lite, Grok-Premium (all seven providers)
  • Confidence: HIGH
  • §174A applies to taxable years beginning after December 31, 2024 [4]. Software development costs are explicitly included [132]. Foreign R&E remains subject to 15-year amortization under legacy §174 [2]. Taxpayers may elect to capitalize and amortize domestically rather than expense immediately [2].

2. Production inference compute is an ordinary §162 expense; training compute tied to qualified research is §174A/§41-eligible

  • Confirmed by: Anthropic, OpenAI, Perplexity, Grok, OpenAI-Mini, Gemini-Lite, Grok-Premium
  • Confidence: HIGH
  • This is the most universally agreed-upon classification principle across all providers [4]. The allocation methodology — not the principle — is where audit risk concentrates. Flat percentage allocations are increasingly viewed as insufficient by examiners [Gemini-Lite].

3. No AI-specific TAMs, PLRs, revenue rulings, or court cases exist as of early 2026

  • Confirmed by: Anthropic, OpenAI, Perplexity, Grok, OpenAI-Mini, Grok-Premium
  • Confidence: HIGH
  • All providers independently noted the complete absence of AI-specific IRS guidance [4]. Companies are operating on analogical reasoning from Rev. Proc. 2000-50 [133], general §174 regulations, and the §41 Audit Techniques Guide [2]. This creates both risk and opportunity.

4. The §41 four-part test applies to AI activities; strongest qualification is novel architecture/algorithm development; weakest is API usage and routine fine-tuning

  • Confirmed by: Anthropic, OpenAI, Perplexity, Grok, OpenAI-Mini, Gemini-Lite, Grok-Premium
  • Confidence: HIGH
  • Universal agreement on the four-part test framework [3]. Strong qualification: novel model architectures, custom training pipelines, hyperparameter experimentation with documented uncertainty [3]. Weak/no qualification: commercial API usage without modification, prompt engineering, routine fine-tuning, business-process automation using off-the-shelf models [2].

5. EU AI Act compliance costs are ordinary §162 business expenses, not R&E

  • Confirmed by: Anthropic, OpenAI, Perplexity, Gemini-Lite, Grok-Premium
  • Confidence: HIGH
  • Pure regulatory compliance activities (documentation, legal review, conformity assessment) are §162 expenses [4]. Where compliance activities involve developing novel technical solutions (bias detection systems, explainability tools), those specific costs may qualify as §174A R&E [Anthropic]. EU AI Act fines are nondeductible penalties [Grok].

6. On-premise GPU clusters (NVIDIA DGX, H100/B200) are depreciable capital assets under §168, eligible for 100% bonus depreciation under §168(k) as restored by OBBBA

  • Confirmed by: Anthropic, OpenAI, Perplexity, Grok, OpenAI-Mini, Gemini-Lite, Grok-Premium
  • Confidence: HIGH
  • Hardware is explicitly excluded from §174 R&E expensing [2]. §168(k) bonus depreciation restored to 100% for qualifying property [2]. The §174A (software/R&E costs) and §168(k) (hardware) regimes are distinct and non-conflicting [Perplexity]. Critical warning: claiming the same hardware cost under both regimes is a double-dip that creates significant audit exposure [Gemini-Lite].

7. Foundation model API costs (OpenAI, Anthropic, Google) are generally §162 ordinary expenses, not §174A R&E or licensed IP

  • Confirmed by: Anthropic, OpenAI, Perplexity, Grok, OpenAI-Mini, Gemini-Lite, Grok-Premium
  • Confidence: HIGH
  • Universal agreement that flat subscription or per-use API fees are operating expenses [3]. Exception: API costs incurred during the development/prototyping phase of building a proprietary model may have a partial §174A argument [Anthropic, Perplexity], but this is a gray-zone position with high audit risk [Gemini-Lite].

8. California does not conform to federal §174A and requires separate state analysis

  • Confirmed by: Anthropic, OpenAI, Grok, OpenAI-Mini, Grok-Premium
  • Confidence: HIGH
  • California did not adopt TCJA §174 amortization and does not fully adopt §174A [3]. California passed SB 711 in 2025, replacing the Alternative Incremental Credit with an Alternative Simplified Credit methodology [3]. The federal catch-up deduction for 2022–2024 domestic R&D will likely require a subtraction modification on California returns [OpenAI].

9. Transfer pricing of AI IP faces acute comparability challenges; no arm's-length transactions exist for proprietary LLM transfers

  • Confirmed by: Anthropic, OpenAI, Perplexity, Grok, OpenAI-Mini, Gemini-Lite, Grok-Premium
  • Confidence: HIGH
  • All providers independently identified the absence of comparable transactions as the central transfer pricing challenge [4]. Companies default to profit split, income method, or cost-based approaches [2]. AM 2025-001's reassertion of periodic adjustment authority [33] significantly elevates the risk of ex post IRS challenges to AI IP valuations.

10. Form 6765 Section G (business-component-level reporting) is optional for 2025 but mandatory for 2026

  • Confirmed by: Anthropic, OpenAI, Grok
  • Confidence: HIGH
  • Section G deferred to tax year 2026 [3]. Companies should build business-component documentation infrastructure now [2]. The IRS has signaled it will seek this information in examinations even before the mandate [Anthropic].

11. IRS is using AI-powered audit selection and has significantly increased examination rates for large corporations

  • Confirmed by: Anthropic, OpenAI, Perplexity, Gemini-Lite
  • Confidence: HIGH
  • IRS operates 129 AI use cases as of 2026, up from 54 in 2024 [2]. IRS predicted to triple audit rates for large corporations by 2026 [35]. AI-powered selection specifically targets high R&D-to-revenue ratios and significant intercompany IP transfers [3].

Unique Insights by Provider

Anthropic

  • The "vibe coding" / "agentic coding" problem and the §41 substantially all test. Anthropic uniquely developed the argument that when AI generates 80%+ of initial code and human developers spend most time reviewing, testing, and debugging, the question becomes whether human review/testing activities themselves constitute the "process of experimentation." The reframing of debugging AI-generated code as "documenting the failure of a hypothesis" is a novel and intellectually compelling argument for maintaining §41 qualification — but it is entirely untested before the IRS or courts. This is the single most important unresolved question in AI-related R&D credit claims and deserves immediate attention from practitioners advising companies with heavy AI-assisted development workflows.

  • The §163(j) interaction with §174A deductions. Anthropic uniquely flagged that because R&E deductions reduce taxable income, they can also reduce the amount of the interest expense deduction allowed under §163(j). This second-order interaction is almost entirely absent from current practitioner literature and could be material for highly leveraged AI companies.

  • The onshoring incentive created by the foreign/domestic §174A bifurcation. Anthropic specifically noted that the 15-year foreign amortization vs. immediate domestic expensing creates a massive incentive to onshore AI training workloads — and a compliance trap for companies with distributed engineering teams where model training occurs on US-based cloud infrastructure but engineering direction comes from offshore teams. The question of whether "domestic" R&E is determined by where the compute runs or where the engineers are located is unresolved.

OpenAI

  • The Microsoft $28.9B transfer pricing dispute as a calibration benchmark. OpenAI uniquely cited [104] the Microsoft IRS dispute as a concrete illustration of the scale of transfer pricing exposure for technology IP. This provides practitioners with a real-world anchor for client conversations about AI IP migration risk.

  • The §367(d) termination opportunity for AI IP repatriation. OpenAI uniquely highlighted [141] that 2024 Treasury regulations allow termination of §367(d) annual inclusions when IP is repatriated to a qualifying U.S. person under strict reporting rules. For companies that previously migrated AI IP offshore and are now reconsidering that structure in light of Pillar Two and AM 2025-001, this creates a potential repatriation pathway worth modeling.

  • Proposed AI skills tax credit legislation. OpenAI uniquely cited [2] House lawmakers pushing a tax credit for AI skills training and Senator Schatz's AI jobs bills (February 2026). While not yet enacted, these proposals signal a potential new category of AI-related tax incentives that practitioners should monitor for client planning.

Perplexity

  • Granular dollar-impact modeling with specific cost breakdowns. Perplexity provided the most detailed quantitative analysis of any provider, including specific training run cost estimates (GPT-4 scale: $8M–$14M per run; 25,000–40,000 A100 GPU-days), training data cost estimates for major models (GPT-3: $4.6M–$6.5M; GPT-4: $9M–$12M; LLaMA: $7M–$10M), and a granular five-category training data classification framework distinguishing public dataset licensing (§174-qualifying), proprietary data cleaning (50/50 split), annotation/labeling (§174-qualifying if for model discovery), rights acquisition for copyrighted content (ordinary expense), and synthetic data generation (§174-qualifying only if novel methodology). This level of specificity is immediately useful for R&D credit studies.

  • The evaluation compute "gray zone" with a three-tier allocation framework. Perplexity uniquely proposed a specific allocation methodology for evaluation compute (running inference on test sets to evaluate model performance): first 20% is §174-qualifying, middle 50% is 50/50, final 30% is ordinary expense. While this framework lacks IRS authority, it provides a defensible starting point for practitioners.

  • The AWS Savings Plan capitalization analysis. Perplexity uniquely analyzed the treatment of multi-year cloud commitment prepayments under Treas. Reg. §1.263(a)-4, recommending capitalization as an intangible asset and ratable amortization, with the R&D-allocated portion treated as §174A R&E. This is a specific, actionable position that most practitioners have not addressed.

Grok

  • FASB ASU 2025-06 and its impact on AI software capitalization. Grok uniquely identified [119] that FASB ASU 2025-06 (effective 2026) scraps the preliminary/application stage framework for agile/ML development, adopting a principles-based recognition threshold. This is a significant GAAP development that will affect book-tax differences for AI software costs and deserves immediate attention from practitioners advising public company clients.

  • The "CA FTB Notice 2025-01 provides no §174A retroactivity" claim. Grok specifically cited a California FTB Notice 2025-01 [23] indicating no retroactive application of §174A for California purposes. If accurate, this is a critical state conformity data point that practitioners must verify and communicate to California-based clients.

  • The synthetic data generation challenge. Grok uniquely flagged that synthetic data generation via prior models may be challenged by the IRS as non-experimental if deterministic — a subtle but important distinction for companies using their own models to generate training data for next-generation models.

OpenAI-Mini

  • Rev. Proc. 2000-50 as the surviving analogical framework. OpenAI-Mini uniquely emphasized [133] that Rev. Proc. 2000-50 (which instructed that software development costs so closely resemble R&E costs that the IRS wouldn't disturb treating them similarly) and its associated TAMs remain the primary analogical authority for AI development costs. The key principle: when a taxpayer bears the risk of developing new software, coding costs are R&E deductible; when a third party bears design risk and delivers a finished model, the cost is closer to a purchase of software and may be capitalizable.

  • The internal-use software five-criteria test and its application to AI. OpenAI-Mini uniquely emphasized [26] that §41(d)(4)(E) excludes internal-use software from the R&D credit unless it meets special innovation tests, and that many internal AI projects (fraud detection, customer service automation, internal productivity tools) ultimately fail the IUS test. The three additional requirements — innovation, significant economic risk, and commercial unavailability — create a materially higher bar for internal AI deployments than for customer-facing AI products.

  • Tennessee add-back requirement. OpenAI-Mini uniquely cited [48] that Tennessee guidance requires adding back accelerated R&E expensing on the income tax return — a specific state nonconformity trap that practitioners serving Tennessee-based clients must address.

Gemini-Lite

  • The double-dip warning for on-premise GPU clusters. Gemini-Lite uniquely and explicitly flagged [132] that taxpayers must ensure the depreciation of on-premise hardware is not also being claimed as a §174 cost — a double-dip that creates significant audit exposure. While logically obvious, this error appears to be occurring in practice and deserves explicit client communication.

  • The "failed model runs repository" as the gold standard of §41 documentation. Gemini-Lite uniquely articulated [2] that a repository of failed model runs is the gold standard of proof for the process of experimentation requirement. This is a specific, actionable documentation recommendation that practitioners can immediately communicate to AI company clients.

Grok-Premium

  • The §280C coordination complexity post-OBBBA. Grok-Premium uniquely emphasized that the change in §174A combined with continued availability of the §41 research credit creates significant cash-tax benefits but also introduces coordination risks [2]. The §280C(c) framework requires either reducing the §174A deduction by the credit amount or taking a reduced credit — and companies that fail to model this interaction are creating both overclaim and underclaim exposure simultaneously.

  • Conservative disclosure via Form 8275 and method change requests as risk mitigation tools. Grok-Premium uniquely recommended [2] that in high-exposure situations (particularly around training data characterization, AI-generated code impact on experimentation, and valuation of trained models), conservative disclosure via Form 8275 or filing method change requests may be prudent. This is a specific procedural recommendation that most providers did not address.


Contradictions and Disagreements

Contradiction 1: State Conformity to §174A — California and New York

Position A (Grok): California, New York, New Jersey, and Massachusetts require 5-year amortization post-OBBBA [23]. Texas and Washington conform to federal immediate expensing.

Position B (OpenAI-Mini): California generally conforms to the old expensing treatment and allows full R&E deductions in the year incurred [2]. Texas similarly allows R&D to be expensed like old §174. Massachusetts, Washington, and other conforming states link to federal income, so post-2024 expensing applies.

Position C (Anthropic): California does not adopt federal capitalization and amortization rules for R&E expenditures under §174 or §174A. Both U.S. and non-U.S. R&E costs remain fully deductible for California purposes (citing SB 711) [2].

Analysis: This is a genuine and material contradiction. The confusion appears to stem from conflating California's pre-OBBBA treatment (which was more favorable than federal TCJA, allowing immediate expensing when federal required amortization) with California's post-OBBBA treatment (where the question is whether California conforms to the new §174A). The Anthropic and OpenAI-Mini positions appear more consistent with the cited sources [2], which indicate California maintained its own favorable treatment independent of federal changes. However, the specific question of whether California's conformity date update under SB 711 picks up §174A requires verification with the California FTB. Practitioners must independently verify current California FTB guidance before advising clients.


Contradiction 2: Whether §174A Is Effective for Tax Years Beginning After December 31, 2024 or December 31, 2025

Position A (Most providers, including Anthropic, OpenAI, Grok, Grok-Premium): §174A applies to taxable years beginning after December 31, 2024 [3].

Position B (Perplexity): §174A is effective for tax years beginning after December 31, 2025 [132].

Analysis: The weight of authority strongly supports the December 31, 2024 effective date [4]. The Perplexity claim of a December 31, 2025 effective date appears to be an error in that provider's analysis. The statutory text of the OBBBA and multiple Big 4 analyses confirm the 2024 effective date. This is a high-stakes error — practitioners relying on the 2025 effective date would miss a full year of immediate expensing for calendar-year taxpayers.


Contradiction 3: Whether Reserved Cloud Instances Are Capitalizable Intangible Assets or Prepaid Operating Expenses

Position A (Perplexity): By analogy to software licensing, 1-year and 3-year cloud commitment prepayments are likely capitalizable intangible assets under Treas. Reg. §1.263(a)-4, to be amortized over the reservation period.

Position B (Anthropic, Grok, OpenAI-Mini): Reserved instances are generally treated as prepaid operating expenses (prepaid service contracts), deducted ratably over the reservation period [3].

Position C (Gemini-Lite): Reserved instances and long-term capacity commitments are generally operating expenses (§162), though they may be tied to a dedicated R&D project.

Analysis: The distinction between "capitalizable intangible asset" and "prepaid operating expense" has different implications for balance sheet presentation and method of accounting, but both positions result in ratable deduction over the reservation period for tax purposes. The practical difference is primarily in GAAP presentation. The IRS has not issued specific guidance on cloud reserved instances [2]. The Perplexity position (capitalizable intangible) and the majority position (prepaid operating expense) converge on the same tax result but diverge on GAAP treatment — practitioners should address both dimensions.


Contradiction 4: Whether Training Data Costs Qualify as §174A R&E

Position A (Grok, Grok-Premium): Training data acquisition, annotation, and curation can qualify as §174A R&E if the data pipeline is developed experimentally, with 85% of a $50M dataset curation cost potentially qualifying [8].

Position B (OpenAI-Mini): A data or information base, like proprietary datasets, isn't treated as software development under IRS precedent [133]. If data is purchased or licensed as an asset, it might not be R&E.

Position C (Perplexity): A nuanced five-category framework: public dataset licensing qualifies; proprietary data cleaning is 50/50; annotation/labeling qualifies if for model discovery; rights acquisition for copyrighted content is ordinary expense; synthetic data generation qualifies only if novel methodology is used.

Analysis: This is a genuine gray zone with no IRS guidance. The disagreement reflects legitimate uncertainty in the law. The Perplexity framework is the most granular and defensible, but all positions acknowledge the absence of specific authority [134]. The key distinction appears to be whether the data cost is (a) acquiring an existing asset (ordinary expense or capitalized asset) versus (b) developing a novel data pipeline or methodology (§174A R&E). Practitioners should document the specific nature of each data cost category.


Contradiction 5: The "Substantially All" Test — Whether It Applies to §174 or Only §41

Position A (Gemini-Lite): There is no §41 "substantially all" (80%) rule for §174 [134]. The substantially all test is a §41 concept only.

Position B (OpenAI-Mini): IRS regs require that 80%+ of a project's cost be experimental for it to qualify under the substantially all test [134]. (Appears to apply this to both §174 and §41.)

Position C (Grok): Data acquisition may qualify if it is "substantially all" (80%+) integral to R&E under §41(d)(4)(C) cross-applied via Reg. §1.174-2(a)(8).

Analysis: Gemini-Lite is technically correct that the "substantially all" 80% threshold is a §41 concept under Treas. Reg. §1.41-4(a)(6), not a §174 concept. For §174, the standard is whether costs are "incident to" the development activity under Treas. Reg. §1.174-2(a)(1). However, the practical effect is similar — costs that are not substantially related to R&E activity will not qualify under either provision. Practitioners should be precise about which test applies to which provision, as the standards are technically distinct even if practically convergent.


Detailed Synthesis

Part I: The §174A Landscape — Opportunity and Trap

The OBBBA's enactment on July 4, 2025 [2] represents the most significant shift in AI-related tax treatment since the TCJA's 2022 capitalization mandate. New §174A permanently restores immediate expensing for domestic R&E expenditures, including software development costs, for taxable years beginning after December 31, 2024 [2]. For a mid-size AI company with $50M in annual domestic R&D spend, this translates to approximately $10.5M in annual cash tax benefit versus the prior amortization regime [Anthropic].
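
The arithmetic behind that figure is worth making explicit: $10.5M is the 21% federal value of deducting the full $50M immediately, while the year-one delta against the old amortization regime is somewhat smaller. A minimal sketch, assuming a 21% rate and the TCJA-era half-year convention as the baseline:

```python
# First-year cash-tax arithmetic for $50M of domestic R&E (illustrative;
# assumes a 21% federal rate and, for the baseline, TCJA-era 5-year
# amortization with the half-year convention, i.e. 10% deductible in year 1).

spend = 50_000_000
rate = 0.21

benefit_174a = spend * rate                      # $10.5M value of immediate expensing
benefit_amortized_y1 = (spend / 5 / 2) * rate    # $1.05M under the old regime

print(f"§174A year-1 value:      ${benefit_174a:,.0f}")
print(f"Old-regime year-1 value: ${benefit_amortized_y1:,.0f}")
print(f"Incremental benefit:     ${benefit_174a - benefit_amortized_y1:,.0f}")
```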

But the restoration creates a new set of planning decisions that most companies are not adequately modeling. [Grok-Premium] identifies three distinct compliance risks introduced by §174A: (1) classification risk — determining which costs are domestic R&E versus foreign R&E versus ordinary expense; (2) documentation risk — the IRS will now scrutinize the substance of claimed R&E rather than the timing of deductions; and (3) coordination risk — the §280C(c) interaction between §174A deductions and §41 credits requires careful modeling to avoid either double-counting or inadvertent waiver of benefits.
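
On the third risk, a minimal sketch of the two §280C(c) paths (amounts hypothetical) shows why the election must be modeled rather than assumed:

```python
# Minimal sketch of the two §280C(c) paths (illustrative amounts, flat 21%
# federal rate, no state or NOL effects).

federal_rate = 0.21
re_deduction = 50_000_000   # hypothetical §174A-eligible domestic R&E
full_credit = 5_000_000     # hypothetical gross §41 credit

# Path A: claim the full credit, reduce the §174A deduction by the credit
tax_value_a = (re_deduction - full_credit) * federal_rate + full_credit

# Path B: elect the reduced credit under §280C(c)(2), keep the full deduction
reduced_credit = full_credit * (1 - federal_rate)
tax_value_b = re_deduction * federal_rate + reduced_credit

print(f"Path A: ${tax_value_a:,.0f}")   # $14,450,000
print(f"Path B: ${tax_value_b:,.0f}")   # $14,450,000
# The paths converge at a flat 21% rate; they diverge once state conformity,
# NOL positions, or rate differences enter, which is why failing to model
# the interaction creates overclaim and underclaim exposure simultaneously.
```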

The foreign/domestic bifurcation deserves particular attention. Foreign R&E remains subject to 15-year amortization under legacy §174 [2]. [Anthropic] uniquely identifies the compliance trap for companies with distributed engineering teams: if model training occurs on US-based cloud infrastructure but engineering direction comes from offshore teams, the question of whether the R&E is "domestic" is unresolved. The IRS has not issued guidance on whether "domestic" is determined by the location of the compute, the location of the engineers, or some functional analysis of where the research activity occurs. Companies with significant offshore engineering talent should seek specific guidance before claiming §174A treatment on costs directed by non-US personnel.

The transition rules for 2022–2024 capitalized costs add another layer of complexity. [Anthropic] notes that taxpayers can deduct any remaining unamortized domestic R&E costs entirely in 2025, or spread the deduction over 2025 and 2026. Eligible small businesses may elect retroactive application by amending 2022–2024 returns [2]. [Grok-Premium] recommends that companies model the §280C coordination before making this election, as accelerating the deduction may reduce the available §41 credit in the catch-up year.
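
A compact sketch of the two catch-up options, with a hypothetical unamortized balance, frames the modeling decision:

```python
# Sketch of the §174A transition options for unamortized 2022-2024 domestic
# R&E (hypothetical balance; eligible small businesses may instead amend
# their 2022-2024 returns).

unamortized_2022_2024 = 30_000_000

option_all_2025 = {"2025": unamortized_2022_2024}
option_spread = {"2025": unamortized_2022_2024 / 2,
                 "2026": unamortized_2022_2024 / 2}

# Per Grok-Premium's caution: model §280C before electing, since a large
# 2025 catch-up deduction can shrink the usable §41 credit in that year.
```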

Part II: AI/ML Training Cost Classification — The Hierarchy of Certainty

[Perplexity] provides the most granular framework for classifying AI training costs, which [Grok-Premium] and [Anthropic] broadly confirm. The hierarchy of certainty runs as follows:

High-confidence §174A qualification: Developing a proprietary foundation model from scratch — including training infrastructure costs, hyperparameter experimentation, model architecture searches, and novel data preprocessing methodology — clearly constitutes R&E under Treas. Reg. §1.174-2(a)(1) [2]. [Perplexity] estimates that a $12.1M proprietary LLM development project (comprising $8.2M training infrastructure, $1.8M data acquisition/cleaning/curation, and $2.1M hyperparameter experimentation) is fully §174A-deductible under the most defensible position.

Moderate-confidence §174A qualification: Fine-tuning a third-party foundation model on proprietary data for a domain-specific task [2]. [Grok-Premium] notes that material customization involving hyperparameter search, novel datasets, or architectural modifications can qualify, while routine prompt engineering and parameter-efficient fine-tuning on commercial models is often treated as production or non-experimental. [Perplexity] estimates that of a $2.22M fine-tuning project, between $1.54M (conservative) and $2.22M (aggressive) qualifies as §174A R&E, with the training data curation component ($680K) being the primary point of contention.

High-risk/weak §174A position: Using AutoML services or applying a pre-built ML pipeline to proprietary data without material architectural innovation [Perplexity]. [OpenAI-Mini] notes that if a company merely uses off-the-shelf AI tools or adapts them without facing technical uncertainty, the work may not pass the four-part §41 test — and by extension, the §174A "experimental" requirement.

The training data question is the most significant unresolved classification issue. [Grok] argues that data acquisition, annotation, and curation costs can qualify as §174A R&E if they are "substantially all" integral to the R&E activity [8]. [OpenAI-Mini] cautions that a data or information base isn't treated as software development under IRS precedent [133]. [Perplexity] offers the most nuanced framework: public dataset licensing qualifies; proprietary data cleaning is 50/50; annotation/labeling qualifies if for model discovery (not routine QA); rights acquisition for copyrighted content is ordinary expense; synthetic data generation qualifies only if novel methodology is used. Treas. Reg. §1.174-2(b) lists examples of R&E costs but does not specifically address data acquisition [132], and no IRS guidance specifically addresses training data costs [134].
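
For practitioners building a cost model, Perplexity's framework reduces to a simple lookup; the sketch below encodes it with the caveats inline (the category shares are the report's positions, not IRS authority):

```python
# Perplexity's five-category training data framework as a lookup table
# (a sketch only; no IRS authority specifically addresses training data).

TRAINING_DATA_TREATMENT = {
    "public_dataset_licensing":     1.00,  # §174-qualifying per the framework
    "proprietary_data_cleaning":    0.50,  # 50/50 split
    "annotation_labeling":          1.00,  # only if for model discovery, not routine QA
    "copyright_rights_acquisition": 0.00,  # ordinary expense / asset acquisition
    "synthetic_data_generation":    1.00,  # only if novel methodology is used
}

def split_data_cost(category: str, amount: float) -> tuple[float, float]:
    """Return (§174A-qualifying, ordinary) portions under the framework."""
    share = TRAINING_DATA_TREATMENT[category]
    return amount * share, amount * (1 - share)

# Example: the $680K data curation component of the fine-tuning project,
# treated as proprietary cleaning, splits $340K / $340K.
qualifying, ordinary = split_data_cost("proprietary_data_cleaning", 680_000)
```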

The AI-generated code problem is the most novel and unresolved issue in the entire AI tax landscape. [Anthropic] uniquely develops the argument that when developers use "vibe coding" or "agentic coding" approaches — generating and refining code using natural language — the iterative "prompt-select-refine-test-debug" loop may itself constitute a process of experimentation. The reframing of debugging AI-generated code as "documenting the failure of a hypothesis" is intellectually compelling. [OpenAI-Mini] notes that if 30% of coding is auto-generated from a known model without engineer input, an examiner could argue those portions were routine and not qualified R&D. [Grok-Premium] identifies heavy reliance on generative AI for code without documented human experimentation as a specific audit risk factor. There is no IRS guidance, no TAM, and no court case addressing this question [2].

Part III: Cloud GPU Infrastructure — The Allocation Imperative

[Perplexity] estimates that leading AI companies are spending $500M–$5B+ annually on compute for LLM training, with a single frontier model training run consuming 25,000–40,000 A100 GPU-days at a cost of $8M–$14M. The tax treatment of this spend depends entirely on allocation methodology.

The statutory basis for treating cloud compute as a QRE under §41 is §41(b)(2)(A)(iii), which allows QRE treatment for "any amount paid or incurred to another person for the right to use computers in the conduct of qualified research" [11]. [Anthropic] notes that this provision was designed in the pre-cloud era for mainframe time-sharing, but cloud GPU instances satisfy the three-part test: the computer is owned and operated by someone other than the taxpayer; the computer is located off the taxpayer's premises; and the taxpayer is not the primary user of the computer [3].

The critical allocation challenge is separating training compute (R&D) from inference compute (ordinary expense). [Gemini-Lite] notes that a flat percentage allocation is increasingly viewed as insufficient by examiners. [Anthropic] recommends establishing allocation methodologies supported by cloud billing system tagging (AWS Cost Explorer tags, Azure resource groups, GCP labels). [Perplexity] proposes a three-tier framework for evaluation compute: first 20% is §174-qualifying, middle 50% is 50/50, final 30% is ordinary expense — reflecting the reality that early evaluation is experimental while late-stage evaluation is production validation.
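
A minimal sketch of what a tag-driven allocation might look like, combining billing-tag data with the three-tier evaluation split (tag names and amounts are hypothetical):

```python
# Tag-driven compute allocation with Perplexity's three-tier treatment of
# evaluation compute. Tag names and amounts are hypothetical; production
# implementations would pull from AWS Cost Explorer tags, Azure resource
# groups, or GCP labels rather than a hard-coded dict.

monthly_costs_by_tag = {
    "training":   4_200_000,  # experimental training runs
    "evaluation":   900_000,  # test-set inference during development
    "inference":  6_500_000,  # production serving
}

def allocate_evaluation(cost: float) -> tuple[float, float]:
    """First 20% R&E, middle 50% split 50/50, final 30% ordinary."""
    re_portion = cost * 0.20 + cost * 0.50 * 0.50
    return re_portion, cost - re_portion

eval_re, eval_ordinary = allocate_evaluation(monthly_costs_by_tag["evaluation"])
re_total = monthly_costs_by_tag["training"] + eval_re               # $4.605M
ordinary_total = monthly_costs_by_tag["inference"] + eval_ordinary  # $6.995M
```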

For reserved instances and capacity commitments, [Perplexity] recommends capitalizing multi-year prepayments as intangible assets under Treas. Reg. §1.263(a)-4 and amortizing ratably, with the R&D-allocated portion treated as §174A R&E. [Anthropic] and [Grok] treat these as prepaid operating expenses with ratable deduction. Both approaches converge on the same tax result — ratable deduction over the reservation period — but differ in GAAP presentation. The IRS has not issued specific guidance [2].
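
A short sketch of the converged tax result, assuming a hypothetical three-year prepayment and a 60% R&D usage share:

```python
# Ratable deduction of a multi-year cloud commitment. Both the capitalized
# intangible view (Perplexity) and the prepaid expense view (majority) land
# on this tax result; amounts and R&D share are hypothetical.

prepayment = 36_000_000   # 3-year savings plan paid up front
term_months = 36
rd_share = 0.60           # portion of reserved capacity used for R&E

monthly_deduction = prepayment / term_months      # $1.0M per month
monthly_174a = monthly_deduction * rd_share       # $600K treated as §174A R&E
monthly_162 = monthly_deduction - monthly_174a    # $400K ordinary expense
```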

For on-premise GPU clusters, the interaction of §168(k) bonus depreciation and §174A is straightforward but frequently misapplied. [Gemini-Lite] explicitly warns against the double-dip: hardware costs go through §168(k) (100% bonus depreciation for qualifying property placed in service after January 19, 2025 [2]); software/R&E costs go through §174A. The two regimes are distinct [Perplexity]. [Anthropic] notes the trade-off: buying hardware moves it into §168(k) territory (ineligible for §41 credit), while renting cloud compute preserves §41 credit eligibility. Companies with large on-premise GPU investments should model both paths.
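
A routing sketch makes the separation concrete (categories and amounts illustrative):

```python
# Routing sketch to avoid the §168(k)/§174A double-dip: each cost flows
# through exactly one regime. Categories and amounts are illustrative.

COST_REGIME = {
    "gpu_hardware":       "§168(k) bonus depreciation (100%, qualifying property)",
    "domestic_re_labor":  "§174A immediate expensing (domestic R&E)",
    "production_serving": "§162 ordinary expense",
}

build_out = [
    ("B200 cluster purchase",         "gpu_hardware",       40_000_000),
    ("training pipeline engineering", "domestic_re_labor",  12_000_000),
    ("post-launch inference serving", "production_serving",  6_000_000),
]

for desc, kind, amount in build_out:
    print(f"{desc:32s} ${amount:>12,}  ->  {COST_REGIME[kind]}")
```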

Part IV: AI Software Capitalization — The GAAP/Tax Divergence

[Grok] uniquely identifies FASB ASU 2025-06 (effective 2026) as a significant development: the ASU scraps the preliminary/application stage framework for agile/ML development and adopts a principles-based recognition threshold [119]. [Gemini-Lite] confirms that the traditional waterfall approach to software capitalization is obsolete and that FASB has acknowledged this [2]. The transition will likely result in more expenses being recognized currently rather than capitalized under GAAP.

The ML development lifecycle creates a fundamental mismatch with traditional ASC 350-40 stages. [Anthropic] notes that in ML, "training" IS the development — a model that hasn't been trained has no functionality. This creates a conceptual problem: activities following the completion of development (such as training, maintenance, and upgrades) are expensed under ASC 350-40-25-13 through 25-15, but in ML, training is not post-development — it is development.

[Perplexity] proposes a practical framework for A/B testing and model versioning: if an A/B test requires 30%+ code or model rewrite, capitalize it; if it requires less than 15% modification, expense it. [Grok] notes that iterative cycles such as A/B tests and versioning capitalize enhancements if probable future benefits exist, while training and inference prototypes are expensed.
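
A tiny decision sketch of those thresholds, treating the 15–30% band as an explicit judgment zone:

```python
# Decision sketch for the A/B-test capitalization thresholds. The cutoffs
# are Perplexity's proposal, not authoritative guidance; the 15-30% band
# is left as an explicit judgment zone.

def ab_test_treatment(rewrite_fraction: float) -> str:
    if rewrite_fraction >= 0.30:
        return "capitalize"      # substantial new functionality
    if rewrite_fraction < 0.15:
        return "expense"         # routine iteration
    return "judgment zone"       # document the analysis either way

assert ab_test_treatment(0.40) == "capitalize"
assert ab_test_treatment(0.10) == "expense"
```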

For foundation model API costs, all providers agree on the primary classification: SaaS subscription / ordinary §162 expense [3]. [Perplexity] provides the most granular breakdown of an $8.4M annual API spend: $1.2M for research/prototype development (§174A-deductible), $2.1M for internal productivity tools (ordinary expense), $3.8M for customer-facing applications (50/50 split between development/testing and production/inference), and $1.3M for data analysis and business intelligence (ordinary expense). The revised total is $2.0M §174A-qualifying and $6.4M ordinary expense — a significant reclassification from the common practice of treating all API costs as ordinary expense.
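
Note that the stated totals imply only about $0.8M of the customer-facing spend qualifies (a literal 50/50 on the full $3.8M would yield $3.1M qualifying). The sketch below backs the fractions into Perplexity's totals; the fractions themselves are judgment calls with no authority behind them:

```python
# Classification pass over the $8.4M API spend. Category labels follow
# Perplexity; the qualifying fractions are backed into the stated $2.0M /
# $6.4M totals and are judgment calls with no IRS authority behind them.

api_spend = {
    # category: (amount, fraction treated as development-phase §174A)
    "research_prototyping":  (1_200_000, 1.00),
    "internal_productivity": (2_100_000, 0.00),
    "customer_facing_apps":  (3_800_000, 0.21),  # development/testing slice only
    "data_analysis_bi":      (1_300_000, 0.00),
}

qualifying = sum(amt * frac for amt, frac in api_spend.values())
ordinary = sum(amt for amt, _ in api_spend.values()) - qualifying
print(f"§174A-qualifying: ${qualifying:,.0f}")   # ~$2.0M
print(f"Ordinary §162:    ${ordinary:,.0f}")     # ~$6.4M
```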

Part V: §41 R&D Credit Qualification — The Documentation Crisis

The §41 four-part test is well-established, but its application to AI activities is creating a documentation crisis. [Grok-Premium] identifies the strongest qualification arguments: developing new model architectures, developing novel training algorithms, developing proprietary infrastructure to overcome scaling bottlenecks, developing custom datasets requiring experimental curation, and non-routine hyperparameter optimization with documented uncertainty [3]. The weakest arguments: using commercial APIs without material modification, prompt engineering, routine fine-tuning, business-process automation using off-the-shelf models, and purely data-labeling without technological advancement [2].

The internal-use software exclusion under §41(d)(4)(E) is a significant trap for enterprise AI deployments. [OpenAI-Mini] emphasizes that internal AI systems (fraud detection, customer service automation, internal productivity tools) face the three-part high-threshold test: innovation, significant economic risk, and commercial unavailability [2]. Many internal AI projects fail this test because commercially available alternatives exist (even if the company chose to build rather than buy). [Grok-Premium] notes that the additional three-part high threshold of innovation test often applies for internal-use software [26].

The documentation standard is evolving rapidly. [Gemini-Lite] identifies a repository of failed model runs as the gold standard of proof [2]. [Grok] recommends Git, Slack, and Jira exports as an emerging standard [19]. [Anthropic] notes that the IRS will accept Git commit logs, MLflow or equivalent experiment tracking, and cloud compute logs as evidence of systematic experimentation. [OpenAI-Mini] emphasizes that generic AI-generated summaries won't satisfy examiners — auditors want to see initial problem statements and how each test result narrowed uncertainty [139].

Form 6765 Section G is the most important near-term documentation imperative. While optional for 2025, it becomes mandatory for 2026 [2]. The form requires business-component-level reporting linking each project to specific technical uncertainties, experimentation processes, and outcomes [2]. [Anthropic] notes that the IRS has signaled it will seek this information in examinations even before the mandate. Companies that have not built this infrastructure will face a retroactive documentation crisis.

[Perplexity] provides a specific wage allocation example: detailed time tracking supports a qualifying wage base of $9.75M when $15M of engineer salaries are multiplied by 65%. IRS examiners typically accept 55–70% wage allocations, while many companies have claimed 85–95% [2]. The resulting overclaimed wage base of $3M produces a credit disallowance of approximately $420K — a concrete illustration of the documentation risk.
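
The arithmetic, laid out as a sketch with the example's figures:

```python
# Arithmetic behind the wage-allocation example (illustrative; the real ASC
# computation applies 14% only to QREs above a prior-three-year base).

total_wages = 15_000_000
supported_allocation = 0.65   # backed by detailed time tracking
claimed_allocation = 0.85     # aggressive common practice
asc_rate = 0.14               # simplified marginal credit rate

supported_base = total_wages * supported_allocation                           # $9.75M
overclaimed_base = total_wages * (claimed_allocation - supported_allocation)  # $3.0M
disallowance = overclaimed_base * asc_rate                                    # ~$420K
```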

Part VI: Transfer Pricing — The AI IP Valuation Crisis

The transfer pricing of AI IP is approaching a crisis point driven by three converging forces: the absence of comparable transactions, AM 2025-001's reassertion of periodic adjustment authority, and the IRS's commitment to significantly increased large-corporation audit rates.

[OpenAI] and [Grok] both cite the Microsoft $28.9B IRS transfer pricing dispute [104] as a calibration benchmark for the scale of exposure. For AI companies, the exposure is potentially larger because AI IP appreciates faster and more unpredictably than traditional software IP, making ex ante valuations particularly vulnerable to ex post challenge under AM 2025-001 [33].
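
The order-of-magnitude math, as a sketch (note that §6662 valuation-misstatement penalties technically attach to the resulting tax underpayment rather than the adjustment itself):

```python
# Order-of-magnitude exposure math for an AI IP valuation challenge.
# §6662(e)/(h) valuation-misstatement penalties attach to the resulting
# tax underpayment, so the sketch separates that step out.

ip_value = 1_000_000_000
undervaluation = 0.30
federal_rate = 0.21

adjustment = ip_value * undervaluation        # $300M income adjustment
underpayment = adjustment * federal_rate      # ~$63M rough federal tax effect
penalty_range = (underpayment * 0.20, underpayment * 0.40)  # $12.6M-$25.2M
```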

The comparability problem is acute. [Anthropic] notes that there are essentially no arm's-length transactions involving the transfer of a trained foundation model between unrelated parties. The few observable transactions (licensing deals between AI labs and enterprise customers) involve inference access, not model weight transfers, making them poor comparables for cost-sharing or buy-in purposes. [OpenAI-Mini] and [Gemini-Lite] confirm that companies default to profit split, income method, or cost-based approaches [3].

[OpenAI] uniquely highlights the §367(d) termination opportunity [141]: 2024 Treasury regulations allow termination of §367(d) annual inclusions when IP is repatriated to a qualifying U.S. person under strict reporting rules. For companies that previously migrated AI IP offshore and are now reconsidering that structure in light of Pillar Two and AM 2025-001, this creates a potential repatriation pathway. [OpenAI-Mini] notes that IRS still expects consistent valuation under both §367(d) and §482 [141].

The OECD Pillar Two interaction is significant but nuanced. [Anthropic] notes that in January 2026, the US Treasury announced that US-headquartered companies would be exempt from Pillar Two's requirements via the Side-by-Side (SbS) system [59]. However, [Grok-Premium] correctly notes that companies with AI IP in non-US jurisdictions must still model the Pillar Two ETR impact of their transfer pricing structures, particularly where AI IP generates significant profits in low-tax jurisdictions that are subject to Pillar Two top-up taxes from other implementing jurisdictions [3].

Part VII: State and International Divergence

The state conformity landscape is a patchwork that creates significant compliance complexity. [Grok] notes that post-OBBBA, more than 20 states decouple from federal §174 treatment [23]. [OpenAI] and [Anthropic] confirm that California and Texas have divergent treatments, with California maintaining its own favorable R&E deduction rules independent of federal changes [2].

California's SB 711 (2025) is the most significant state development [3]. The legislation replaced the Alternative Incremental Credit with an Alternative Simplified Credit methodology at lower rates: 3% for most taxpayers and 1.3% for those without QREs in each of the prior three years, compared to federal rates of 14% and 6% [2]. [Anthropic] notes that the new calculation mirrors the federal ASC methodology at these lower rates.
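
To make the rate gap concrete, a side-by-side on a hypothetical incremental QRE base (the actual base computations, federal and California, are more involved than this sketch):

```python
# Side-by-side of the SB 711 California ASC rates against the federal ASC
# rates quoted in the text, on a hypothetical incremental QRE base.

incremental_qres = 10_000_000

federal_asc = incremental_qres * 0.14       # $1.40M
federal_startup = incremental_qres * 0.06   # $600K (no QREs in prior 3 years)
ca_asc = incremental_qres * 0.03            # $300K
ca_startup = incremental_qres * 0.013       # $130K
```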

[OpenAI-Mini] uniquely identifies Tennessee's add-back requirement [48] and [Grok-Premium] notes that Texas enhanced its credit while ending certain sales tax exemptions [8]. [Perplexity] notes that Michigan introduced an entirely new refundable R&D credit for tax years beginning on or after January 1, 2025 — a completely new opportunity for businesses conducting research in Michigan [44].

For EU AI Act compliance costs, all providers agree on the primary classification: ordinary §162 business expenses [4]. [OpenAI-Mini] provides specific cost benchmarks: third-party conformity assessments run €15,000–€50,000 per high-risk system; initial QMS implementation costs €40,000–€100,000 [143]. [Anthropic] notes that where compliance activities involve developing novel technical solutions (bias detection systems, explainability tools), those specific costs may qualify as §174A R&E — a gray-zone position that most providers did not address. EU AI Act fines (up to €35M or 7% of global annual turnover) are nondeductible penalties [2].

Part VIII: The Emerging Audit Landscape

The IRS enforcement environment for AI-related tax positions is intensifying on multiple fronts. [Anthropic] notes that the IRS now operates 129 AI use cases, up from 54 in 2024 [2], and uses AI-powered audit selection to identify high-risk returns. [Gemini-Lite] confirms that the IRS is targeting companies with high R&D-to-revenue ratios and significant intercompany IP transfers [3].

[Grok] and [Grok-Premium] confirm that no AI-specific LB&I campaign has been announced as of January 2026 [3]. However, general R&D credit campaigns and §174 conformity remain high-priority LB&I focus areas [2]. [Perplexity] estimates examination rates of 8–12% for large tech companies with more than $50M AI spend on software development and §174 expensing, and 15–20% for those with more than $5M claimed AI R&D credits.

The most common errors identified across all providers:

  1. Failing to bifurcate cloud compute between R&D and production (all providers)
  2. Treating all AI API costs as ordinary expenses when development-phase API usage may qualify as §174A R&E (Anthropic, Perplexity)
  3. Not coordinating §174A deductions with §41 credits under the §280C(c) framework (Grok-Premium, Anthropic)
  4. Ignoring state nonconformity and applying federal §174A treatment uniformly (Anthropic, Grok, OpenAI)
  5. Inadequate documentation of technical uncertainty for AI activities (all providers)
  6. Misclassifying foreign AI research costs as domestic (Anthropic)
  7. Overclaiming wage allocations — claiming 85–95% when examiners accept 55–70% (Perplexity)
  8. Double-dipping on hardware — claiming both §168(k) bonus depreciation and §174A R&E for the same GPU purchase (Gemini-Lite)
  9. Claiming §41 credits on internal-use AI without meeting the three-part high-threshold test (OpenAI-Mini)
  10. Treating all fine-tuning as experimental when routine parameter-efficient fine-tuning is production activity (Grok-Premium)

[Grok-Premium] recommends that in high-exposure situations, conservative disclosure via Form 8275 or filing method change requests may be prudent [2]. [OpenAI-Mini] recommends engaging IRS advance programs (PFA or APA) for ambiguous cases [58].



Go Deeper

Follow-up questions based on where providers disagreed or confidence was low.

What is the IRS's position on the "domestic" vs. "foreign" characterization of R&E expenditures when US-based cloud infrastructure is used by offshore engineering teams — specifically, does the location of compute, the location of engineers, or a functional analysis of where research direction originates determine §174A eligibility?

All providers identified the foreign/domestic bifurcation as a critical compliance issue, but none could identify IRS guidance on this specific question. With distributed AI engineering teams being the norm at major technology companies, the dollar impact of this unresolved question is enormous — potentially affecting hundreds of millions in §174A deductions for companies with significant offshore engineering talent.

Low Confidence · L tier

How are IRS examiners actually applying the §41 four-part test to AI-assisted development workflows — specifically, what documentation standards are emerging in active LB&I examinations for companies using GitHub Copilot, Cursor, and similar AI coding tools, and has the IRS issued any internal guidance (IDRs, examination guidelines, or training materials) on the "substantially all" test when AI generates a significant percentage of code?

Multiple providers identified this as the single most important unresolved question in AI-related R&D credit claims, but all confirmed the complete absence of formal guidance. The answer will determine the defensibility of §41 credit claims for a significant portion of the technology sector. Practitioners with active examination experience should be surveyed, and any available LB&I examination materials should be obtained via FOIA.

Disagreement · L tier

What is the correct tax treatment of multi-year cloud reserved instance prepayments (AWS Savings Plans, Azure Reserved Instances, GCP Committed Use Discounts) — specifically, are these capitalizable intangible assets under Treas. Reg. §1.263(a)-4 (Perplexity's position) or prepaid operating expenses (majority position), and does the answer change when the reserved capacity is predominantly used for §174A R&E activities?

Providers disagreed on this classification, and the IRS has not issued specific guidance. With cloud reserved instance commitments representing billions of dollars in annual spend for major AI companies, the correct treatment has material cash tax implications. A technical analysis grounded in Treas. Reg. §1.263(a)-4 and the 12-month rule is needed.

Disagreement · M tier

How should companies structure contemporaneous documentation for AI training data costs to survive §174A and §41 examination — specifically, what project-level records, data pipeline documentation, and cost allocation methodologies are sufficient to distinguish §174A-qualifying experimental data development from ordinary data asset acquisition, and what documentation standards are emerging in the absence of IRS guidance?

Training data costs can exceed compute costs for frontier models (Perplexity estimates GPT-4 training data at $9M–$12M), yet no IRS guidance addresses this cost category. All providers identified training data classification as a high-risk area, but none provided a complete documentation framework. A practitioner-level documentation guide would be immediately useful for client advisory.

Low Confidence · M tier

What are the second-order effects of §174A's immediate expensing on §163(j) interest expense limitations, §382 NOL limitations, and state apportionment factors for AI-intensive companies — specifically, how does the acceleration of R&E deductions interact with these other tax provisions to create unexpected tax costs that offset the §174A benefit?

Anthropic uniquely identified the §163(j) interaction as a potential second-order cost of §174A expensing, but no provider developed this analysis in depth. For highly leveraged AI companies (common in the current funding environment), the §163(j) interaction could materially reduce the net benefit of §174A. The state apportionment implications of large R&E deductions (which reduce taxable income and may affect sales factor calculations) are also unexplored.

Implication · S tier

Key Claims

Cross-provider analysis with confidence ratings and agreement tracking.

11 claims · sorted by confidence
1. The OBBBA (enacted July 4, 2025) permanently restored immediate expensing for domestic R&E under new §174A, effective for taxable years beginning after December 31, 2024, while foreign R&E remains subject to 15-year amortization

High · Anthropic, OpenAI, Perplexity, Grok, OpenAI-Mini, Gemini-Lite, Grok-Premium · Disagreement: Perplexity (on the effective date only; claims December 31, 2025, likely in error)

2. Production inference compute is an ordinary §162 expense; cloud GPU costs for model training during qualified research qualify as §174A R&E and §41 QREs; companies without robust allocation methodologies are creating compounding exposure

High · Anthropic, OpenAI, Perplexity, Grok, OpenAI-Mini, Gemini-Lite, Grok-Premium

3. There are no comparable arm's-length transactions for proprietary LLM transfers; companies default to profit split, income method, or cost-based approaches; a $1B LLM undervalued by 30% creates a $300M adjustment exposure with 20–40% penalties

High · Anthropic, OpenAI, Perplexity, Grok, OpenAI-Mini, Gemini-Lite, Grok-Premium

4. On-premise GPU clusters are depreciable capital assets under §168 eligible for 100% §168(k) bonus depreciation; claiming the same hardware cost under both §168(k) and §174A R&E expensing is a double-dip creating significant audit exposure

High · Anthropic, OpenAI, Perplexity, Grok, OpenAI-Mini, Gemini-Lite, Grok-Premium

5. The IRS now operates 129 AI use cases for audit selection and enforcement (up from 54 in 2024); no AI-specific LB&I campaign has been announced, but general R&D credit campaigns and §174 conformity remain high-priority; examinations of 2025 returns will begin in 2027–2028

High · Anthropic, OpenAI, Perplexity, Grok, OpenAI-Mini, Gemini-Lite, Grok-Premium

6. No AI-specific TAMs, PLRs, revenue rulings, or court cases exist as of early 2026; companies are operating on analogical reasoning from pre-AI software and R&D frameworks

High · Anthropic, OpenAI, Perplexity, Grok, OpenAI-Mini, Grok-Premium

7. California does not conform to federal §174A and requires separate state analysis; California passed SB 711 in 2025 replacing the Alternative Incremental Credit with an Alternative Simplified Credit at rates of 3% (most taxpayers) and 1.3% (no prior-year QREs)

High · Anthropic, OpenAI, Grok, OpenAI-Mini, Grok-Premium · Disagreement: Grok (partially; claims CA requires 5-year amortization post-OBBBA, inconsistent with other sources)

8. AM 2025-001 reasserted the IRS's authority to make ex post periodic adjustments to AI IP transfer pricing, significantly elevating the risk of retrospective challenges to cost-sharing arrangements and buy-in payments priced on ex ante projections

High · Anthropic, OpenAI, Gemini-Lite, Grok-Premium

9. The impact of AI-generated code on §41 qualification is the single most important unresolved question in AI-related R&D credit claims; whether the "prompt-select-refine-test-debug" loop constitutes a process of experimentation is untested before the IRS or courts

High · Anthropic, OpenAI, Grok-Premium

10. Form 6765 Section G (business-component-level reporting) is optional for tax year 2025 but mandatory for 2026; companies that have not built project-level documentation infrastructure now will face a retroactive documentation crisis

High · Anthropic, OpenAI, Grok

11. Training data acquisition, annotation, and curation costs present the most significant §174A classification challenge; no IRS guidance specifically addresses training data costs; the most defensible position distinguishes between acquiring existing data assets (ordinary expense) and developing novel data pipelines/methodologies (§174A R&E)

Medium · Anthropic, Perplexity, Grok, OpenAI-Mini, Grok-Premium · Disagreement: OpenAI-Mini (partial; notes data/information bases aren't treated as software development under IRS precedent)

Topics

AI tax exposure · R&D tax credit (§41) · §174 software capitalization · cloud GPU tax allocation · AI transfer pricing · EU AI Act compliance tax · R&D documentation and audit preparation


Research synthesized by Parallect AI

Multi-provider deep research — every angle, synthesized.
