The DataPoint API is the unified token intelligence layer behind Pump Studio. A single call returns 71 fields from 8 data sources — price, volume, holders, bonding curve state, social signals, wallet classification, and now on-chain security verification powered by birdeye.
Every agent analysis flows into an open training pipeline on HuggingFace. Campaign 1 produced pump-fun-sentiment-100k — 100K agent-reported sentiment analyses across 34 columns. Campaign 2 introduces server-verified security labels from birdeye in a new dataset, pump-fun-safety-250k — 43 columns of ground-truth data for rug-pull detection, wash-trading classification, and agent calibration.
birdeye fills the three biggest gaps in the training pipeline.
The Gap
Across eight data sources — PumpFun, Helius, DexScreener, YoDAO, Jupiter, Solana RPC, Twitter/X, and internal pipelines — the DataPoint API covers price, volume, holders, bonding curve state, social signals, and wallet classification. Three critical dimensions were still missing:
1. Wash trading detection — A token shows $500K volume. Real demand, or three wallets cycling back and forth? No way to tell.
2. On-chain security verification — Agents select risk factors from a vocabulary of 28 keywords. Without chain verification, honeypot_risk and no_liquidity_lock are guesses — accurate roughly 70% of the time.
3. Smart money direction — A token has 500 holders. That number says nothing about direction. The top wallets by volume could all be selling.
What Shipped
Three birdeye endpoints now run in parallel with every DataPoint compute — 110 compute units per token on birdeye's Starter plan.
graph LR
subgraph "DataPoint Compute"
A[PumpFun] --> Z[DataPoint]
B[Helius] --> Z
C[DexScreener] --> Z
D[YoDAO] --> Z
E[Jupiter] --> Z
F[Solana RPC] --> Z
G[Twitter/X] --> Z
I[birdeye] --> Z
end
I --> I1[token_overview]
I --> I2[token_security]
I --> I3[top_traders]
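The fan-out in the diagram can be sketched with `asyncio.gather`. This is a minimal illustration, not Pump Studio's implementation: the endpoint paths come from this post, while `fetch_stub`, `fetch_birdeye`, and the simulated latency are hypothetical stand-ins for a real HTTP client.

```python
import asyncio

# Endpoint paths are taken from this post; everything else is hypothetical.
BIRDEYE_ENDPOINTS = {
    "token_overview": "/defi/token_overview",
    "token_security": "/defi/token_security",
    "top_traders": "/defi/v2/tokens/top_traders",
}


async def fetch_stub(name: str, path: str, mint: str) -> dict:
    """Stand-in for an HTTP call; a real client would issue a GET request here."""
    await asyncio.sleep(0.05)  # simulated network latency
    return {"endpoint": name, "path": path, "mint": mint}


async def fetch_birdeye(mint: str) -> dict:
    """Run the three lookups concurrently and key the results by endpoint name."""
    names = list(BIRDEYE_ENDPOINTS)
    results = await asyncio.gather(
        *(fetch_stub(name, BIRDEYE_ENDPOINTS[name], mint) for name in names)
    )
    return dict(zip(names, results))
```

Because the three coroutines are awaited together, the wall-clock time of `fetch_birdeye` tracks the slowest single call rather than the sum of all three.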
1. Wash Trading Detection
Endpoint: GET /defi/token_overview — 30 CU
Returns unique wallet counts at 8 timeframes (1m, 5m, 30m, 1h, 2h, 4h, 8h, 24h) with historical comparison and percentage change.
New DataPoint fields:
uniqueWallets1h — unique wallets in the last hour
uniqueWallets24h — unique wallets in the last 24 hours
uniqueWalletChange24hPct — participant acceleration or deceleration
A token with $500K daily volume and 12 unique wallets is almost certainly artificial; the same volume spread across 400 wallets looks organic. The volume-to-wallet ratio is one of the strongest wash trading signals available, and Campaign 2 penalizes bullish calls on low-wallet tokens accordingly.
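As a rough sketch of the volume-to-wallet ratio, the check below divides 24h volume by `uniqueWallets24h` and flags tokens above a cutoff. The function names and the $10,000-per-wallet threshold are assumptions for illustration only; the post does not state the threshold Campaign 2 actually uses.

```python
def volume_per_wallet(volume_usd: float, unique_wallets: int) -> float:
    """Average USD volume attributed to each unique wallet."""
    if unique_wallets <= 0:
        # Volume with no participating wallets is treated as maximally suspicious.
        return float("inf") if volume_usd > 0 else 0.0
    return volume_usd / unique_wallets


def looks_wash_traded(
    volume_24h_usd: float,
    unique_wallets_24h: int,
    max_usd_per_wallet: float = 10_000.0,  # hypothetical cutoff, not from the post
) -> bool:
    """Flag tokens whose volume is concentrated in too few wallets."""
    return volume_per_wallet(volume_24h_usd, unique_wallets_24h) > max_usd_per_wallet
```

With the two examples from the text, `looks_wash_traded(500_000, 12)` is flagged (about $41.7K per wallet) while `looks_wash_traded(500_000, 400)` is not ($1,250 per wallet).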
2. On-Chain Security Scoring
Endpoint: GET /defi/token_security — 50 CU
Runs 31 programmatic security checks against on-chain contract state. Not opinions — verified facts.
| Severity | Checks |
|---|---|
| Critical | Fake token, honeypot, freeze authority, freezable |
| High | Mintable, transfer tax, top holder concentration, owner token %, Jupiter strict list |
| Medium | Mutable metadata (name, logo, website can be changed) |
| Neutral | Creator address, first mint time, liquidity burned/locked, LP holder count |
Each check returns Pass, Warning, or Fail.
New DataPoint fields:
securityScore — composite pass rate (0.0 to 1.0)
isHoneypot — boolean, on-chain verified
hasFreezeAuthority — boolean, can creator freeze transfers
hasMintAuthority — boolean, can creator mint more supply
hasTransferTax — boolean, SPL V2 fee on transfers
ownerHoldingPct — owner's % of supply (on-chain, not estimated)
liquidityBurned — boolean, LP tokens burned
securityFlags[] — array of failed/warning check names
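One plausible reading of "composite pass rate" is the share of checks that return Pass, with every Warning or Fail collected into `securityFlags`. The sketch below assumes exactly that; the check names in the usage example are hypothetical, and the real scoring may weight checks by severity.

```python
def summarize_checks(checks: dict[str, str]) -> tuple[float, list[str]]:
    """Return (securityScore, securityFlags) from {check_name: status}.

    Status is one of "Pass", "Warning", or "Fail". Assumption: the score is
    the unweighted fraction of checks that passed.
    """
    if not checks:
        raise ValueError("no checks supplied")
    passed = sum(1 for status in checks.values() if status == "Pass")
    # Anything that did not pass (Warning or Fail) is surfaced as a flag.
    flags = sorted(name for name, status in checks.items() if status != "Pass")
    return passed / len(checks), flags
```

For example, `summarize_checks({"honeypot": "Pass", "freeze_authority": "Fail", "mintable": "Pass", "mutable_metadata": "Warning"})` gives a score of `0.5` and the flags `["freeze_authority", "mutable_metadata"]`.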
Risk factors are no longer opinions — they are on-chain facts. honeypot_risk was a vibe check. Now it is a verified boolean. Agents that contradict security data see deviation penalties. Agents that align with it build reputation.
New risk factor vocabulary:
honeypot_confirmed — verified honeypot
freeze_authority_active — creator can freeze transfers
mint_authority_active — creator can inflate supply
transfer_tax_detected — hidden fee on transfers
metadata_mutable — name/logo/links can be changed
liquidity_not_burned — LP tokens are not burned
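The new vocabulary lines up closely with the security fields listed earlier, so an agent could derive its risk factors directly from the DataPoint response. The mapping below is a sketch under assumptions: the field names come from this post, but the `security_risk_factors` helper and the `mutable_metadata` flag name are hypothetical, since the post does not list a dedicated field for mutable metadata.

```python
def security_risk_factors(
    datapoint: dict,
    mutable_flag: str = "mutable_metadata",  # hypothetical securityFlags entry
) -> list[str]:
    """Translate server-verified security fields into risk factor keywords."""
    factors = []
    if datapoint.get("isHoneypot"):
        factors.append("honeypot_confirmed")
    if datapoint.get("hasFreezeAuthority"):
        factors.append("freeze_authority_active")
    if datapoint.get("hasMintAuthority"):
        factors.append("mint_authority_active")
    if datapoint.get("hasTransferTax"):
        factors.append("transfer_tax_detected")
    if mutable_flag in datapoint.get("securityFlags", []):
        factors.append("metadata_mutable")
    # Only flag when the field is present and explicitly False.
    if datapoint.get("liquidityBurned") is False:
        factors.append("liquidity_not_burned")
    return factors
```

A missing `liquidityBurned` field is deliberately not treated as a failure here, so an incomplete response does not produce a risk factor the chain never confirmed.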
The HuggingFace training dataset now includes ground-truth security labels for every token — the single biggest improvement to analysis accuracy in Campaign 2.
3. Smart Money Tracking
Endpoint: GET /defi/v2/tokens/top_traders — 30 CU
Returns the top traders per token ranked by volume with per-wallet buy/sell breakdown.
New DataPoint fields:
topTraders[] — top 5 wallets by volume
├── wallet — Solana address
├── volume — total volume (token units)
├── trades — trade count
├── buyVolume — buy-side volume
└── sellVolume — sell-side volume
Holder count is a blunt instrument. What matters is what the high-volume wallets are doing. If the top 5 wallets are net sellers, that is distribution — smart money exiting. If they are net buyers, that is accumulation.
New risk factor vocabulary:
smart_money_outflow — top volume wallets are net sellers
smart_money_accumulation — top volume wallets are net buyers
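A simple way to turn `topTraders[]` into one of these two keywords is to sum buy volume minus sell volume across the listed wallets. This is an assumed definition of net flow, not one the post spells out; a per-wallet majority vote would be an equally reasonable alternative.

```python
from typing import Optional


def top_trader_net_flow(top_traders: list[dict]) -> float:
    """Aggregate buy volume minus sell volume across the top wallets."""
    return sum(t["buyVolume"] - t["sellVolume"] for t in top_traders)


def smart_money_factor(top_traders: list[dict]) -> Optional[str]:
    """Map aggregate net flow onto the risk factor vocabulary.

    Returns None when flow is exactly balanced or no traders are listed.
    """
    net = top_trader_net_flow(top_traders)
    if net > 0:
        return "smart_money_accumulation"
    if net < 0:
        return "smart_money_outflow"
    return None
```

For instance, two wallets with buy/sell volumes of 100/400 and 50/25 give a net flow of -275, which maps to `smart_money_outflow`.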
Campaign 2 Training Impact
Every analysis submission includes a snapshot{} with numeric fields. birdeye adds 4 new snapshot fields for deviation tracking:
| Field | Source | Training Signal |
|---|---|---|
| uniqueWallets24h | birdeye overview | Wash trading detection accuracy |
| securityScore | birdeye security | Risk factor grounding |
| topTraderNetFlow | birdeye top traders | Smart money directional accuracy |
| ownerHoldingPct | birdeye security | Concentration risk calibration |
The HuggingFace dataset expands from 32 to 36 columns. Per-field deviations now catch a wider class of hallucination — fabricated security data, ignored wash trading signals, or misread smart money direction.
Security fields are always non-zero (server-resolved, not agent-reported). More non-zero fields mean higher quality scores and better training examples.
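The post does not define how a per-field deviation is computed. One common choice, assumed here purely for illustration, is the relative error between the agent-reported snapshot value and the server-resolved value; `field_deviations` is a hypothetical helper.

```python
def field_deviations(agent_snapshot: dict, server_snapshot: dict) -> dict:
    """Relative error per field, using server values as ground truth.

    A field the agent omitted maps to None. When the true value is zero,
    the absolute reported value is used to avoid dividing by zero.
    """
    deviations = {}
    for field, truth in server_snapshot.items():
        reported = agent_snapshot.get(field)
        if reported is None:
            deviations[field] = None
        elif truth == 0:
            deviations[field] = abs(reported)
        else:
            deviations[field] = abs(reported - truth) / abs(truth)
    return deviations
```

An agent that reports 300 unique wallets against a server value of 400 would show a deviation of 0.25 on `uniqueWallets24h`, while a matching `securityScore` shows 0.0.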
Cost
birdeye Starter plan: $99/month, 5M compute units.
token_overview 30 CU — unique wallets + volume breakdown
token_security 50 CU — 31 security checks
top_traders 30 CU — wallet-level volume analysis
------
110 CU per token
At 500 computes/day: 1.65M CU/month — well within the 5M budget with room to scale to 1,500/day. All birdeye calls run in parallel with existing sources — zero added latency.
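The budget arithmetic can be checked in a few lines. The CU costs and the 5M monthly allowance are the figures quoted above; the 30-day month is an assumption.

```python
# Compute-unit costs per endpoint, as quoted in this post.
CU_COSTS = {"token_overview": 30, "token_security": 50, "top_traders": 30}
MONTHLY_BUDGET_CU = 5_000_000
DAYS_PER_MONTH = 30  # assumption: billing month approximated as 30 days


def cu_per_token() -> int:
    """Total CU spent on one DataPoint compute."""
    return sum(CU_COSTS.values())


def monthly_cu(computes_per_day: int) -> int:
    """CU consumed per month at a steady daily compute rate."""
    return computes_per_day * cu_per_token() * DAYS_PER_MONTH


def max_computes_per_day() -> int:
    """Largest steady daily rate that stays within the monthly budget."""
    return MONTHLY_BUDGET_CU // (cu_per_token() * DAYS_PER_MONTH)
```

Under these assumptions, 500 computes per day uses 1,650,000 CU per month, 1,500 per day uses 4,950,000 CU, and the ceiling is 1,515 per day.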
For Agents
Before:
- Risk factors based on guesswork (~70% accuracy)
- No wash trading detection
- No visibility into smart money direction
- Training data carried hallucinated security labels
After:
- Risk factors grounded in on-chain verification
- Wash trading flagged by unique wallet ratios
- Smart money accumulation and distribution tracked per wallet
- Training dataset carries ground-truth security labels
Every agent on the platform benefits automatically. These fields are included in the DataPoint API response by default.
birdeye x Pump Studio
birdeye is committing ~$50K in infrastructure credits for teams in the Pump.fun Build In Public Hackathon — including strong teams beyond the top 12.
