Trade surveillance: Why data quality determines success

Trade surveillance programs succeed or fail based on data quality, not surveillance logic or alert sophistication.

Start with exchange data, not your risk system

Your firm needs the same data that regulators have at their disposal. When the CFTC or FCA calls, they are looking at exchange data. If you built your program on risk system data, you are looking at something different, and you will spend the conversation explaining why your numbers do not match theirs. That conversation never goes well.

Someone from your trading technology team will tell you this is unnecessary, pointing out that all the trades are already in the risk system that cost millions and already feeds several other applications, and that pulling data directly from exchanges is redundant and expensive. They are wrong.

Risk systems change how trades look in ways that serve their purpose but undermine surveillance. A spread trade comes in as one transaction, but the risk system splits it into two legs because that makes the P&L calculation easier.

An options strategy enters as four separate orders, but the system combines them into one position because that simplifies the Greeks calculation. This approach serves risk management perfectly but makes surveillance impossible.

Try to reverse-engineer the original trades from risk system data and you will spend months building mapping tables, discovering edge cases that do not fit your model or trades that got split or combined in ways nobody documented. You eventually give up and get the exchange data anyway after wasting several months.

At BroadPeak we have seen this pattern repeat at dozens of firms: management decides exchange data costs too much or requires too much effort, the compliance team spends months trying to work around it, the project stalls and nothing gets delivered, then a year later someone finally approves the exchange feeds and the project finishes in weeks. The moral of the story: just start with exchange data.

Trade versus order data

Major exchanges provide two feeds: the trade feed, which shows executed transactions, and the order feed, which shows every electronic order that led to those transactions. The trade feed tells you what happened; the order feed tells you why. When a regulator asks about potential spoofing, they want to see the order pattern and know how many times you placed and cancelled orders. This data comes from the order feed and exists nowhere else: not in your risk system, not in your trade database, only at the exchange.
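As a minimal sketch of the kind of question only the order feed can answer, the snippet below counts placed and cancelled orders per trader and instrument. The event layout (`trader`, `instrument`, `action`) and the action values `NEW`, `CANCEL`, and `FILL` are assumptions made for illustration; real exchange order feeds define their own schemas.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderEvent:
    trader: str
    instrument: str
    action: str  # hypothetical values: "NEW", "CANCEL", "FILL"


def cancel_stats(events):
    """Return {(trader, instrument): (placed, cancelled, cancel_ratio)}."""
    placed = defaultdict(int)
    cancelled = defaultdict(int)
    for e in events:
        key = (e.trader, e.instrument)
        if e.action == "NEW":
            placed[key] += 1
        elif e.action == "CANCEL":
            cancelled[key] += 1
    # Only keys with at least one placed order appear, so n is never zero.
    return {
        key: (n, cancelled[key], cancelled[key] / n)
        for key, n in placed.items()
    }


events = [
    OrderEvent("T1", "HH-FUT", "NEW"),
    OrderEvent("T1", "HH-FUT", "CANCEL"),
    OrderEvent("T1", "HH-FUT", "NEW"),
    OrderEvent("T1", "HH-FUT", "FILL"),
]
stats = cancel_stats(events)
```

A high cancel ratio is not proof of spoofing on its own. It is one input an analyst would look at alongside order size, timing, and the side of the book.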

The good news is that exchanges want you to have this data because they have compliance obligations too and would rather you catch problems before they become regulatory issues. Most exchanges make this data available at reasonable cost, with some even providing compliance-specific packages. Record-keeping rules often require you to keep this data anyway, so you might as well use it.

Bilateral and physical trades need different handling

Direct exchange data solves listed products, but bilateral trades and physical transactions require a different approach since these trades live in your internal systems with no exchange feed to tap, forcing you to work with whatever data structure your trading system uses.

With listed products, you look for market manipulation, spoofing, and wash trades, violations that show up in individual transactions. With bilateral trades, you look for connections to your listed activity to make sure listed trades are not being used to manipulate bilateral positions.

As an example, a firm trades physical natural gas and also trades Henry Hub futures, with the physical contracts settling based on where the futures contract settles at month-end. If someone at the firm pushes the futures price higher just before settlement, the physical positions suddenly become more valuable and that trader just made money on their bilateral positions. This is the pattern you want to catch, and you need both data sets to see it.
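The cross-market pattern described above can be sketched as a simple join between the two data sets. The function below flags traders whose net futures flow just before settlement points the same way as their physical exposure to the settlement price. The tuple layout, the 30-minute window, and the sign conventions are all assumptions for illustration, not a regulatory definition of manipulation.

```python
from datetime import datetime, timedelta


def settlement_window_flags(futures_trades, physical_exposure, settle_time,
                            window=timedelta(minutes=30)):
    """Flag traders whose pre-settlement futures flow favors their physical book.

    futures_trades: iterable of (trader, timestamp, signed_qty), buys positive.
    physical_exposure: {trader: signed exposure}, positive when the trader's
        physical book gains if the futures contract settles higher.
    """
    net_flow = {}
    for trader, ts, qty in futures_trades:
        # Keep only trades inside the window that ends at settlement.
        if settle_time - window <= ts <= settle_time:
            net_flow[trader] = net_flow.get(trader, 0) + qty
    # Same sign means the futures flow pushes in the direction that helps
    # the physical position.
    return sorted(
        trader for trader, flow in net_flow.items()
        if flow * physical_exposure.get(trader, 0) > 0
    )


settle = datetime(2024, 1, 29, 14, 30)  # hypothetical settlement time
trades = [
    ("A", settle - timedelta(minutes=10), 50),
    ("A", settle - timedelta(hours=2), -50),   # outside the window, ignored
    ("B", settle - timedelta(minutes=5), 20),
    ("C", settle - timedelta(minutes=5), -10),
]
exposure = {"A": 1000, "B": -500, "C": -200}
flags = settlement_window_flags(trades, exposure, settle)
```

A flag here only says the two data sets line up in a way worth reviewing. Legitimate hedging can produce the same signature, which is why the analyst needs both the listed and the bilateral view.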

Physical delivery data matters too. A firm might trade power futures and have generation assets, raising questions about whether the listed positions are hedging the physical business or whether someone is using physical positions as cover for speculative listed trading that creates false liquidity. These are questions you cannot answer with exchange data alone.

The integration challenge nobody talks about

You now have multiple data sources, each formatting data differently, using different field names, and updating on different schedules. Exchange A calls it “instrument_id” while Exchange B calls it “contract_code” and your bilateral system calls it “product_reference”. They all mean the same thing, but your surveillance system needs to know that. This is the unglamorous part of surveillance and also the part that takes the most time.
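A minimal sketch of that mapping step, assuming three sources keyed as `exchange_a`, `exchange_b`, and `bilateral` and a canonical field named `instrument` (the source keys and the canonical name are hypothetical; the three field names come from the example above):

```python
# Per-source aliases that map each feed's field name onto one canonical name.
FIELD_ALIASES = {
    "exchange_a": {"instrument_id": "instrument"},
    "exchange_b": {"contract_code": "instrument"},
    "bilateral": {"product_reference": "instrument"},
}


def normalize(source, record):
    """Rename source-specific fields to canonical names and tag the source."""
    aliases = FIELD_ALIASES[source]  # raises KeyError for an unknown source
    out = {"source": source}
    for field, value in record.items():
        out[aliases.get(field, field)] = value
    return out


row = normalize("exchange_b", {"contract_code": "NGF6", "qty": 5})
```

Renaming fields is only the first layer. The values themselves (instrument identifiers, units, timestamps) usually need their own mapping tables, which is where most of the time goes.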

Some trades will not fit your data model. A new product type launches, or a trader finds a creative way to structure a transaction, and your mapping logic breaks. You then need to fix it fast because trades are still flowing. This is not a one-time problem but an ongoing challenge.

Trading technology changes constantly. Exchanges upgrade their systems, add fields, change formats, and deprecate old protocols. At the same time, your trading platforms upgrade, reorganize databases, and change how they store certain trade types. Every change breaks something in your surveillance pipeline.

Most firms handle this by assigning a junior developer to fix data issues. That developer has twelve other projects, so data fixes take weeks. Surveillance alerts stop working. Nobody notices for months until an auditor finds the gap, and the fix-it cycle starts again. This approach guarantees failure.

You need dedicated technical support for surveillance data—someone who works with compliance permanently. Their job is to keep data flowing and handle changes before they break anything, monitoring data quality, catching format changes, and testing updates before they hit production. Building the system is just the first step. Maintaining it is the majority of the work.
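One way to catch format changes before they silently break alerts is to check every incoming record against the fields the pipeline expects and set aside anything that does not match. The sketch below assumes a flat record and a hypothetical expected field set; it illustrates the idea rather than a complete data quality framework.

```python
# Hypothetical canonical fields the surveillance pipeline depends on.
EXPECTED_FIELDS = {"instrument", "trader", "price", "qty", "timestamp"}


def triage(records, expected=EXPECTED_FIELDS):
    """Split records into (accepted, quarantined) based on their field names."""
    accepted, quarantined = [], []
    for rec in records:
        fields = set(rec)
        missing = sorted(expected - fields)
        unexpected = sorted(fields - expected)
        if missing or unexpected:
            # Keep the record and the reason so someone can fix the mapping.
            quarantined.append(
                {"record": rec, "missing": missing, "unexpected": unexpected}
            )
        else:
            accepted.append(rec)
    return accepted, quarantined


good = {"instrument": "NGF6", "trader": "T1", "price": 2.5, "qty": 10,
        "timestamp": "2024-01-29T14:00:00Z"}
drifted = {"instrument": "NGF6", "trader": "T1", "px": 2.5, "qty": 10,
           "timestamp": "2024-01-29T14:00:00Z"}
accepted, quarantined = triage([good, drifted])
```

Quarantining keeps the bad records visible. A growing quarantine count is the signal that an upstream system changed, long before an auditor would notice missing alerts.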

Most firms treat surveillance data infrastructure the way they treat buildings: build it once, use it for decades, and do minimal maintenance. Trading systems do not work that way.

Your data pipeline needs to handle real-time updates so that when a trade happens, your surveillance system sees it within seconds rather than overnight, and when an order gets placed and cancelled, that flows through immediately. Batch processing, where you load yesterday’s trades every morning, creates blind spots because a trader could run an entire spoofing pattern before lunch and your system will not see it until tomorrow, which is too late.

You need streaming ETL to extract, transform, and load data in real time as it happens. This requires different technology than most compliance teams are used to, different thinking, and budgets that compliance departments typically lack.
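The difference from batch loading can be sketched with a generator that handles each order event as it arrives and raises an alert the moment a sliding-window threshold is crossed. The event tuple layout, the 60-second window, and the threshold of three cancels are assumptions for illustration. A production pipeline would read from a message queue or exchange gateway rather than an in-memory list.

```python
from collections import defaultdict, deque


def stream_cancel_alerts(events, window_seconds=60, threshold=3):
    """Yield (timestamp, trader, cancel_count) as soon as a trader's cancels
    inside the sliding window reach the threshold.

    events: iterable of (timestamp_seconds, trader, action) in time order.
    """
    recent = defaultdict(deque)  # trader -> timestamps of recent cancels
    for ts, trader, action in events:
        if action != "CANCEL":
            continue
        q = recent[trader]
        q.append(ts)
        # Drop cancels that have aged out of the window.
        while q and ts - q[0] > window_seconds:
            q.popleft()
        if len(q) >= threshold:
            yield ts, trader, len(q)


events = [
    (0, "T1", "CANCEL"),
    (10, "T1", "NEW"),
    (20, "T1", "CANCEL"),
    (30, "T1", "CANCEL"),
    (200, "T1", "CANCEL"),
]
alerts = list(stream_cancel_alerts(events))
```

Because the function consumes any iterable lazily, the alert for the third cancel is produced when that event arrives, not after the whole day has been loaded.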

The good news is that the technology has gotten much better in recent years. Tools that required a team of engineers now work with minimal technical support, cloud infrastructure makes scaling easier, and costs have dropped. The bad news is that many firms are still using systems built ten years ago, trying to run modern surveillance on infrastructure designed for batch processing, and it does not work.

The hidden benefit of getting this right

Fix your data infrastructure and surveillance gets easier, which is obvious, but the less obvious benefit is that other teams want to use your data. Your trading desk wants to see its order patterns: how often orders get filled, at what price levels, and how they affect the market. Your risk team wants to correlate listed positions with physical exposure in real time, not end of day or overnight. Your quant team wants to analyze execution quality, see how different order types perform, and test new trading strategies against historical order data.

All of these teams need the same data you just spent months integrating for surveillance, so you can give it to them and suddenly your compliance cost center becomes a business asset. This only works if you built it right in the first place. If you cobbled together a fragile system that barely serves compliance, you cannot extend it to other teams, but if you build robust data infrastructure, everyone benefits.

Once your data is integrated, your pipelines are working, and trades flow in real time, compliance teams need to see this data and use it directly rather than through IT or by submitting tickets and waiting days.

Every surveillance question starts with looking at data, which raises new questions that require more analysis in a cycle that repeats throughout investigations. If each step requires a data analyst or developer, you move too slowly because trading happens in milliseconds while surveillance that takes days cannot keep pace.

Modern tools, like BroadPeak, let business users work directly with data without coding: just point, click, filter, and analyze.

These tools work best with clean, well-structured data, which is exactly what you just built. The analytics part should be the easy part, and if it is not, you have a tools problem rather than a data problem.

A regulator calls about trading activity from three months ago and your compliance analyst pulls up the data in under two minutes, with numbers that match what the regulator sees. The analyst can filter by time, trader, and product, seeing the order pattern, the trade executions, and the bilateral positions that might be affected, then answers the regulator’s questions on the call with no follow-up needed, no scrambling to reconstruct data, and no mismatches to explain away. That is success.

Success is also when your trading desk asks for order analytics and you can deliver it next week instead of next quarter, when a new product launches and your surveillance data pipeline handles it automatically, or when an exchange changes their data format and your monitoring catches it before it breaks anything.

Trade surveillance fails when firms use risk system data instead of direct exchange feeds. You need real-time data from exchanges showing both trades and orders, combined with bilateral trade data from internal systems. This requires streaming infrastructure rather than batch processing, plus dedicated technical support to handle constant changes. Get this right and your surveillance actually works when regulators call, but get it wrong and you spend years building a system that cannot answer basic questions. The firms that succeed treat surveillance data as critical infrastructure requiring ongoing investment rather than a one-time project.

Trade surveillance: See the whole picture

Trade surveillance teams need faster, more accurate ways to detect market abuse, especially across physical and financial trades. BroadPeak’s Trade Surveillance solution brings OTC, E/CTRM, and exchange data together so energy and commodity trading firms can monitor both physical and financial trades in one place. With connectivity to major exchanges and execution platforms, we centralize and standardize data to streamline alerts and accelerate investigations.

With BroadPeak Trade Surveillance solutions, you eliminate false positives, unify data across your enterprise, and gain confident oversight amid shifting conditions. 
