Every institutional trader's nightmare is this: showing up to the earnings call having read the same 10-K, the same analyst reports, and the same Bloomberg headlines as everyone else. You have no edge. You're just noise.
The best hedge funds solved this problem decades ago. They buy data that most market participants don't have: satellite images, credit card transactions, web traffic analytics, geolocation patterns, SEC filings parsed with proprietary algorithms. This is called alternative data, and it's how the most successful funds generate persistent alpha in an otherwise efficient market.
But here's what's changing: the barrier to entry is collapsing. What cost $500,000 per year in 2015 costs $5,000 in 2026. Retail investors and smaller RIAs are now accessing data sets that were exclusive to Citadel and Renaissance a decade ago.
This guide explains every major alternative data category, how top hedge funds use it, what it actually costs, and how to access it at a fraction of institutional prices.
VertData aggregates SEC insider filings, CFTC COT data, 13F institutional holdings, social sentiment, and government contract data into a single intelligence platform designed for hedge funds, RIAs, and serious individual investors.
Get Access → vertdata.com
In investing, traditional data is what every market participant receives simultaneously: quarterly earnings reports, SEC filings, economic data releases, analyst reports. It's public, it's widely distributed, and by the time you read it, it's largely priced in.
Alternative data is anything else. It's data that isn't part of the standard financial reporting ecosystem — and because it isn't standard, it creates information asymmetry. The fund that figures out how to extract a reliable signal from Yelp reviews before a restaurant chain reports same-store sales has an edge that doesn't exist in traditional data.
The SEC's official position is that alternative data is legal as long as it's (a) not material non-public information (MNPI) obtained through a breach of fiduciary duty, and (b) properly obtained. Satellite images of publicly visible parking lots are legal. Hacking corporate databases is not. The line in between is where lawyers earn their fees.
What it is: Satellite companies capture multi-spectral images of the Earth multiple times per day. Hedge funds pay for processed analytics derived from these images — parking lot car counts at retailers, oil tank storage levels, crop yields, construction activity, shipping port congestion.
Famous use case: In 2014, it became publicly known that Tiger Global was counting cars in Walmart parking lots from satellite images to estimate quarterly sales before earnings. The signal was profitable for years until it became crowded.
Current providers: Orbital Insight, SpaceKnow, RS Metrics, Planet Labs
Institutional cost: $50,000–$500,000/year. Retail access: not available at reasonable cost. This is still firmly in institutional-only territory.
What it is: Aggregated, anonymized credit and debit card spending data from millions of cardholders. Sold by data brokers who partner with card networks and merchants. Shows consumer spending by retailer, category, and geography — typically 2–4 weeks ahead of official sales reports.
Famous use case: Funds using card data were able to predict Target's Q4 2021 miss before it was announced. The card data showed a slowdown in discretionary categories weeks before the earnings call.
Current providers: Bloomberg Second Measure, Earnest Analytics, Yodlee, M Science
Institutional cost: $100,000–$800,000/year for full coverage. Partially accessible through Bloomberg Terminal subscriptions.
What it is: Programmatic extraction of data from websites — product prices, inventory availability, job postings, review counts, app download rankings. Web traffic analytics (similar to SimilarWeb data) show how many users are visiting a company's site.
Signal: Rising job postings in a specific department often precede expansion. A sudden spike in e-commerce traffic 3 weeks before earnings suggests a strong quarter. Declining web traffic for a SaaS company can predict churn before it's reported.
Current providers: SimilarWeb, 1010data, Thinknum, Apptopia
Institutional cost: $20,000–$200,000/year. Some data accessible via API at $5,000–$30,000/year.
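The traffic-spike signal described above can be sketched as a simple z-score test. This is a minimal illustration, not any provider's method: the function name `traffic_spike_zscore`, the 8-week baseline, the 3.0 threshold, and the visit counts are all assumptions made up for the example.

```python
from statistics import mean, stdev

def traffic_spike_zscore(weekly_visits, baseline_weeks=8):
    """Z-score of the latest week against the weeks immediately before it."""
    if len(weekly_visits) < baseline_weeks + 1:
        raise ValueError("need at least baseline_weeks + 1 observations")
    # Baseline excludes the latest week so the spike does not inflate it.
    baseline = weekly_visits[-(baseline_weeks + 1):-1]
    sigma = stdev(baseline)
    if sigma == 0:
        return 0.0
    return (weekly_visits[-1] - mean(baseline)) / sigma

# Hypothetical weekly site visits (thousands); the last week is the one tested.
visits = [100, 102, 98, 101, 99, 103, 100, 97, 130]
z = traffic_spike_zscore(visits)
is_spike = z > 3.0  # arbitrary illustrative threshold
```

In practice the baseline would also need to account for seasonality (holiday weeks, promotions), which this sketch ignores.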
What it is: Quantified environmental, social, and governance metrics — carbon emissions, water usage, board diversity, labor practices, regulatory violations. Used both for compliance screening and as alpha signals (companies with improving ESG metrics sometimes see multiple expansion).
Signal: Companies with rapidly improving environmental scores have outperformed their ESG-lagging peers in certain sectors, particularly energy transition-sensitive industries. ESG litigation risk is increasingly priced in before incidents go public.
Current providers: MSCI ESG, Sustainalytics, Bloomberg ESG, Trucost
Institutional cost: $30,000–$250,000/year for comprehensive datasets.
What it is: Parsed, structured, and AI-analyzed data from SEC EDGAR — Form 4 insider transactions, 13F institutional holdings, 8-K material events, Schedule 13D activist filings. This is one of the highest-ROI alternative data categories because the underlying data is free — the edge comes from processing speed and analysis quality. For a complete breakdown of each filing type, see our guide to reading SEC EDGAR filings.
Signal: Real-time Form 4 insider buying alerts, activist 13D filings, cluster insider purchases, 8-K earnings surprises compared against consensus. Funds that process this data faster and more accurately than others have a significant information advantage.
Current providers: VertData, Calcbench, AlphaSense, Sentieo
Institutional cost: $5,000–$50,000/year. Accessible to sophisticated retail investors at a fraction of institutional pricing.
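Comparing a reported figure against consensus, as mentioned in the signal list above, usually comes down to a percentage surprise. The sketch below assumes the reported EPS and the consensus estimate have already been extracted; the function name and the numbers are hypothetical.

```python
def earnings_surprise_pct(actual_eps, consensus_eps):
    """Percentage by which reported EPS beat (+) or missed (-) consensus."""
    if consensus_eps == 0:
        raise ValueError("surprise is undefined when consensus is zero")
    # abs() keeps the sign meaningful when consensus EPS is negative.
    return (actual_eps - consensus_eps) / abs(consensus_eps) * 100

# Hypothetical figures: reported $1.32 against a $1.20 consensus.
surprise = earnings_surprise_pct(1.32, 1.20)   # about +10%
miss = earnings_surprise_pct(-0.50, -0.40)     # about -25%, a wider loss than expected
```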
What it is: Systematic tracking of stock transactions by corporate insiders (Form 4) and government officials (STOCK Act disclosures). When executives buy their own company's stock on the open market with their personal money, that's one of the most reliable bullish signals in finance.
Signal: Cluster insider buys (3+ insiders buying within the same 30-day window) precede significant outperformance in academic studies. Congressional trades in sectors with pending legislation have shown abnormal returns relative to market benchmarks. Our congressional trading data guide covers exactly how to extract actionable signals from STOCK Act disclosures.
Current providers: VertData, OpenInsider, QuiverQuant, Washington Service
Institutional cost: $5,000–$60,000/year for real-time access with AI scoring. Basic free data available at OpenInsider.com.
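The cluster-buy rule described above (3+ insiders buying within the same 30-day window) can be sketched as follows. The input layout, a list of `(ticker, insider, trade_date, code)` tuples, is an assumption for illustration; the only Form 4 detail relied on is that transaction code `P` marks an open-market purchase. Tickers and names are made up.

```python
from collections import defaultdict
from datetime import date, timedelta

def find_cluster_buys(transactions, min_insiders=3, window_days=30):
    """Tickers where min_insiders distinct insiders bought within window_days."""
    buys_by_ticker = defaultdict(list)
    for ticker, insider, trade_date, code in transactions:
        if code == "P":  # keep open-market purchases only
            buys_by_ticker[ticker].append((trade_date, insider))

    window = timedelta(days=window_days)
    clusters = set()
    for ticker, buys in buys_by_ticker.items():
        buys.sort()
        # Start a window at each purchase and count distinct buyers inside it.
        for i, (start, _) in enumerate(buys):
            insiders = {name for d, name in buys[i:] if d - start < window}
            if len(insiders) >= min_insiders:
                clusters.add(ticker)
                break
    return clusters

# Hypothetical transactions.
sample = [
    ("ACME", "CEO", date(2026, 1, 5), "P"),
    ("ACME", "CFO", date(2026, 1, 12), "P"),
    ("ACME", "Director A", date(2026, 1, 28), "P"),
    ("BETA", "CEO", date(2026, 1, 5), "P"),
    ("BETA", "CFO", date(2026, 3, 20), "P"),
    ("BETA", "Director B", date(2026, 1, 9), "S"),  # a sale, ignored
]
clusters = find_cluster_buys(sample)
```

Counting distinct insiders rather than transactions matters here: one executive buying three times in a month is a different signal from three executives each buying once.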
What it is: Natural language processing of social media (Twitter/X, Reddit, StockTwits), earnings call transcripts, news articles, and analyst reports to extract sentiment signals. Machine learning models score text for bullish/bearish tone, uncertainty, management confidence, and forward guidance quality.
Famous case: The GameStop short squeeze of January 2021 was visible in WallStreetBets sentiment data 48 hours before the mainstream media covered it. Funds with social sentiment data were positioned.
Current providers: Refinitiv News Analytics, Accern, StockGeist, Quandl
Institutional cost: $15,000–$150,000/year. Social media monitoring APIs accessible at $1,000–$10,000/year.
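To make the idea of scoring text for bullish or bearish tone concrete, here is a toy word-list scorer. It is a deliberately simple stand-in, not a machine learning model and not how any provider listed above works; the word lists and the function name `tone_score` are invented for the example.

```python
import re

# Tiny illustrative word lists; real systems use far richer vocabularies or models.
BULLISH = {"beat", "growth", "strong", "upgrade", "record"}
BEARISH = {"miss", "weak", "downgrade", "lawsuit", "decline"}

def tone_score(text):
    """Score in [-1, 1]: +1 if all matched words are bullish, -1 if all bearish."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in BULLISH for w in words)
    neg = sum(w in BEARISH for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

score = tone_score("Record quarter, strong growth despite one lawsuit")
neutral = tone_score("No comment")
```

A scorer like this misses negation ("not strong") and sarcasm, which is one reason production pipelines rely on trained models instead.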
What it is: Anonymized mobile phone location data showing foot traffic to specific locations — retail stores, restaurants, hotels, commercial real estate, borders and ports. Sold by data brokers who aggregate location data from apps and mobile ad networks.
Signal: Foot traffic data at chain restaurants predicts same-store sales with ~78% accuracy (according to third-party studies). Hotel occupancy derived from geolocation data tracks RevPAR closely. Cross-border foot traffic data was used to anticipate reopening trade plays in 2021.
Current providers: SafeGraph, Placer.ai, Veraset, Foursquare
Institutional cost: $50,000–$300,000/year for comprehensive nationwide coverage.
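One way to check whether foot traffic tracks reported sales, as the signal above suggests, is to correlate the two series before trusting either. The sketch below computes a Pearson correlation by hand; the quarterly growth figures are invented, and it makes no attempt to reproduce the accuracy figure quoted above.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Hypothetical year-over-year growth (%) per quarter for one restaurant chain.
foot_traffic_growth = [2.0, -1.0, 3.0, 0.5, -2.0, 4.0]
same_store_sales_growth = [1.5, -0.5, 2.5, 0.8, -1.8, 3.1]
r = pearson(foot_traffic_growth, same_store_sales_growth)
```

A high correlation on a handful of quarters is weak evidence on its own; an out-of-sample test on later quarters is the more meaningful check.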
| Data Type | Institutional Cost/Year | Retail Accessibility | Typical Use Case |
|---|---|---|---|
| Satellite Imagery | $50K–$500K | ❌ Not accessible | Predict retail sales, oil inventory, crop yields before reports |
| Credit Card Data | $100K–$800K | ⚠️ Limited via Bloomberg | Consumer spending trends by retailer, 2–4 weeks early |
| Web Traffic Analytics | $20K–$200K | ⚠️ Partial (SimilarWeb free tier) | SaaS churn prediction, e-commerce velocity signals |
| ESG Data | $30K–$250K | ⚠️ Limited free data | Regulatory risk screening, impact investing mandates |
| SEC Filings (parsed) | $5K–$50K | ✅ Accessible (raw data free; VertData for analysis) | Insider buys, activist campaigns, 8-K earnings surprises |
| Insider Trade Data | $5K–$60K | ✅ Accessible (VertData, OpenInsider) | Cluster insider buys, congressional trade signals |
| Social Sentiment (NLP) | $15K–$150K | ⚠️ Partial (StockTwits free data) | Retail investor momentum, short squeeze signals |
| Geolocation / Foot Traffic | $50K–$300K | ❌ Not accessible | Retail traffic prediction, real estate occupancy |
The most expensive alternative data categories — satellite, credit card, geolocation — are still largely out of reach for non-institutional investors. But there's a significant opportunity in the categories where the underlying data is public, and the edge comes from analysis quality and processing speed.
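The United States government publishes more alternative data than most investors realize, for free: Form 4 insider filings on EDGAR, CFTC COT reports, Congressional STOCK Act disclosures, and government contract data.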
The raw data is free. But parsing it, cleaning it, cross-referencing it, and extracting trading signals from thousands of daily events is where the work — and the edge — lives.
It's not just about having the data — it's about the workflow. Here's how a systematic hedge fund typically builds an alternative data process:
VertData was built on a single thesis: the highest-ROI alternative data categories are the ones where the raw data is already public. Form 4 insider filings are free on EDGAR. CFTC COT reports are free. Congressional STOCK Act disclosures are free. Government contract data is free.
The edge isn't in owning exclusive data. It's in processing that public data faster, more accurately, and with better signal extraction than anyone else.
VertData delivers:
Stop trading blind. VertData aggregates the highest-signal public alternative data sources — insider filings, COT positioning, 13F holdings, government contracts, activist campaigns — into a single platform built for hedge funds, RIAs, and serious individual investors.
Start Free Trial → vertdata.com
Yes, as long as it's legally obtained and doesn't constitute material non-public information (MNPI) obtained through a breach of fiduciary duty. Satellite images of publicly visible areas, aggregated consumer data, public government filings, and social media data are all legal. The SEC has brought cases against misuse of expert network information and stolen corporate data, but has been clear that legally obtained alternative data is permissible for investment use.
Start with Form 4 insider transaction data. The raw data is free on EDGAR, the signal is academically well-documented, and the strategy is simple to implement. Combine cluster insider buys (3+ insiders in the same month) with a relative value filter and you have a backtest-ready strategy that doesn't require expensive proprietary data.
According to Oppenheimer's 2025 hedge fund survey, the average large hedge fund ($1B+ AUM) spends $15–60 million per year on alternative data. The largest quant funds (Renaissance, D.E. Shaw, Two Sigma) spend considerably more. The total market for institutional alternative data is estimated at $7–9 billion annually.
Disclosure: This article is for informational purposes only and does not constitute investment advice. VertData is a financial data and technology platform. Past performance of any strategy discussed is not indicative of future results.