Building Server Empathy – Broadpeak’s Guide to WebSockets

While building an application, I recently encountered a situation where I needed to use WebSockets. Among other things, the application fits a row from the database (43 columns) into a row in the Facebook FixedDataTable (FDT), which is a JavaScript library for displaying tabular data.

A problem occurred when the client requested too many records and the server could not hold them all in memory (RAM). Depending on the server heap size, at about 75,000 records the server ran out of memory and became unresponsive. Given our large data sets and the possibility of hundreds of connections, the memory issue had to be rectified. I had to implement data streaming to allow the server to run in constant memory. I decided to use WebSockets because I wanted to implement a cooperative streaming protocol that uses bidirectional communication. This was the start of my journey towards Server Empathy.

The original implementation failed because of a memory overload, but for posterity it is outlined in these three steps:

1) Client sends HTTP request to server asking for all trades fitting certain criteria from the database
2) Server opens a database connection and reads all relevant rows into memory
3) Server does some work and passes data as response back to client

In order to implement a solution I modified the way the database returns records, and created a communication protocol over WebSockets for client/server communication.

First let’s discuss the database connection. In Java JDBC, when a query is executed, an object of class ResultSet is returned. The ResultSet is a cursor into the query results prepared by the database server. The original implementation fetched all of the result rows, brought them into memory and sent them off to the client.

In the new implementation the server uses the ResultSet cursor to manage the flow of data from the database to the server. In a series of iterations, the server fetches only a small batch of records from the ResultSet. Next the server enriches the data and drops it on the WebSocket for the client, before proceeding to the next batch. The previous batch is no longer referenced and gets garbage collected, allowing the server to run in constant memory. The next section outlines how I built a client-server communication protocol to minimize the memory burden on the server.
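To make the batching idea concrete, here is a minimal sketch. It is not the application's code (that is Clojure on top of JDBC); it uses Python's built-in sqlite3 cursor and its fetchmany call as a stand-in for the JDBC ResultSet. The table, the batch size of 500 and the enrich step are assumptions made up for illustration.

```python
import sqlite3
from typing import Dict, Iterator, List, Tuple

BATCH_SIZE = 500  # hypothetical batch size; tune it to the server's heap


def stream_batches(conn: sqlite3.Connection, query: str,
                   batch_size: int = BATCH_SIZE) -> Iterator[List[Tuple]]:
    """Yield query results in fixed-size batches instead of all at once."""
    cursor = conn.cursor()
    cursor.execute(query)
    try:
        while True:
            batch = cursor.fetchmany(batch_size)  # at most batch_size rows
            if not batch:
                break
            yield batch  # the caller drops its reference before the next fetch
    finally:
        cursor.close()


def enrich(row: Tuple) -> Dict[str, float]:
    """Placeholder for the per-row work the server does before sending."""
    trade_id, price = row
    return {"id": trade_id, "price": price}


def demo() -> List[int]:
    """Stream 1,200 made-up trades and report the size of each batch."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trades (id INTEGER PRIMARY KEY, price REAL)")
    conn.executemany("INSERT INTO trades (price) VALUES (?)",
                     [(100.0 + i,) for i in range(1200)])
    sizes = []
    for batch in stream_batches(conn, "SELECT id, price FROM trades ORDER BY id"):
        payload = [enrich(row) for row in batch]
        sizes.append(len(payload))  # stand-in for writing to the WebSocket
    conn.close()
    return sizes
```

Only one batch is referenced at any moment, so memory use depends on the batch size rather than on the size of the result set.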

When the client gets some data, it renders the data into the FDT and then sends a message to the server over the WebSocket. The message tells the server whether it should stop or continue sending records. The client has the option to request the termination of the connection at any time for any reason. The server will terminate the connection either when it has received a stop signal from the client, or when the query has completed.
The interaction is detailed in the diagram below (bi-directional arrows represent a WebSocket):

WebSockets communication model

In this model the server only needs to keep (at maximum) a predetermined number n records in memory at a given time. This alleviates the server load and allows the client to terminate a connection if the application becomes slow or the user navigates away from the page. The chatty relationship between client and server ensures that both sides are always responsive, and any problems will be rectified immediately.
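The back-and-forth can be sketched roughly as follows. This is a simulation under stated assumptions, not the production protocol: two asyncio queues stand in for the two directions of the WebSocket, and the message names "continue", "stop" and "done" are invented for the example.

```python
import asyncio
from typing import Iterable, List, Tuple

CONTINUE, STOP, DONE = "continue", "stop", "done"  # hypothetical message names


async def server(batches: Iterable[List[int]], to_client: asyncio.Queue,
                 from_client: asyncio.Queue) -> int:
    """Send one batch, then wait for the client's signal before the next."""
    sent = 0
    for batch in batches:
        await to_client.put(batch)
        sent += 1
        if await from_client.get() == STOP:
            return sent  # client asked us to stop: terminate early
    await to_client.put(DONE)  # query completed: tell the client we are done
    return sent


async def client(to_client: asyncio.Queue, from_client: asyncio.Queue,
                 max_rows: int) -> int:
    """Render each batch, then tell the server to continue or stop."""
    rendered = 0
    while True:
        message = await to_client.get()
        if message == DONE:
            return rendered
        rendered += len(message)  # stand-in for rendering rows into the table
        if rendered >= max_rows:
            await from_client.put(STOP)
            return rendered
        await from_client.put(CONTINUE)


async def run(total_batches: int, batch_size: int,
              max_rows: int) -> Tuple[int, int]:
    """Wire a server and a client together and return (batches sent, rows rendered)."""
    # A generator keeps the batches lazy, as a database cursor would.
    batches = (list(range(batch_size)) for _ in range(total_batches))
    to_client: asyncio.Queue = asyncio.Queue(maxsize=1)
    from_client: asyncio.Queue = asyncio.Queue(maxsize=1)
    sent, rendered = await asyncio.gather(
        server(batches, to_client, from_client),
        client(to_client, from_client, max_rows))
    return sent, rendered
```

Calling asyncio.run(run(10, 100, 250)) has the client stop the stream after the third batch, so the server never produces the remaining seven.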

WebSockets may be more difficult to implement than a standard HTTP request, but they provide a lot of flexibility in the client/server interaction. They are particularly effective for communicating large amounts of data, because WebSockets allow for batch streaming while lightening memory loads on both client and server. So if you “care” about server memory, speed, or network bandwidth, there is a strong possibility WebSockets may be useful!

Technical Information
If you are interested, here is some information on the technology stack used for this application:

Core.async – The asynchronous library we use on both client and server. Processes messages entering and leaving the web socket.

Re-Frame – A small ClojureScript framework built on top of Reagent, which wraps ReactJS. It helps to manage the state complexity, while providing a way to program functionally.

Clojure/ClojureScript – Respectively, server and client programming languages we use.

Chord – WebSocket library, built on top of http-kit.

Fixed Data Table – JavaScript library used to render the trade data in a tabular form.

For People Who Love Bad News…

It’s been all good news these days. Makes me think back to high school when I learned about the Stoics. Something to the effect of …destructive emotions are the result of an error in judgement. I suppose it could also be said that errors in judgement are the result of destructive emotions. Either way these are the days to set aside overhyped hyperbole and keep one’s head.

Dislocation and Disruption

Everyone (especially in tech) loves to talk about disruption. Mostly these companies are not disruptive…they are dislocative. There is a big difference, and that difference is “the big plan.” True disruption has a “plan of being” after the dislocative event. Remember “Occupy Wall Street?” Lots of attention, plenty of media, a voice heard ‘round the world. But nary a path forward that people can get behind. Dislocation gets a lot of attention but eventually goes quietly into the night. Compare that to, I dunno, Alexander the Great (keeping with high school classics). War and pillage had been happening since the beginning of time. Alexander was different. He had a plan to instill laws, commerce and governance that forever changed the face of the world. That’s Disruption. So I’ll ask what you think about Brexit. Major Dislocation or Disruption? It’s a whole lot of hype until the UK files for an Article 50 exit. My guess is that could take a long, long time.

The Trouble With the Spot Month

The CFTC may or may not roll out with Dodd-Frank Limits this year.  At least that’s the opinion of everyone we talk to.  Signs point to yes, but experience says no.

The exciting news is that this month we are rolling out K3 ATLAS as a standalone product. ATLAS is a data server that assembles everything you need to correctly calculate limits across the world.

The Problem ATLAS Overcomes

Correct reference data… When we started building limits we found limits ref data to be dead wrong or entirely non-existent. Based on the commercially available data out there, we quickly realized there is a good chance that just about everyone’s calculation is wrong.

What Kind of Data You Need

Current and updated limits from the exchange.

Correct exchange deltas for your options

Aggregation ratios to roll product together

Current maturity dates of all your products

And more than anything… a correctly calculated spot start date.

That’s what ATLAS provides. We are now downloading over 1 million pieces of data a day and are the first service to offer the spot start date for all products. You can tell your IT team it’s all delivered via a nice REST API. They will be happy as clams because they can fold it into your existing limits process in no time.
Seriously, if you are looking to clean up your limits process, give us a call. You can get real time updates to all of these data points for less than you are paying some provider to only give you limits in a CSV.  Always happy to give you a free trial to try it out.  info@broadpeakpartners.com  

EU-MAR – More Insight

A really good article here from Baringa.   The gist is focus on Policy and Plan.  There’s a reason for that.  Sure, you want all your order information from all exchanges and market-places.  The bad news is that not all of them are remotely ready to go.  There’s a little bit of a dark art to getting your order data.  To make a long story short it pays to plan for the long haul!

Corporate Tuberculosis

So, I’m sitting in this meeting at an enormous Fortune 50 company. The general theme of the meeting is “digital transformation” and creating better insight into data for the business. Operations wants something really straightforward: direct access to data for faster turnaround. We’re reviewing the most powerful data technologies in the market and how they all fit together to deliver exactly what the business wants.

And then a guy on the phone starts ranting. He’s got a fiefdom of 30 people running a 20 year old ETL tool that serves as the single gateway to the business. They’ve made enormous investment and he was having none of this meeting.

Furrowing my brow, I realized: “Aarrgh! – They’ve got a Data Hoarder.”

Data Hoarders are the corporate equivalent of Tuberculosis. It won’t kill you right away, but they didn’t call it consumption for nothing. Data Hoarders are a wasting disease from which, without significant intervention, you will eventually die.

Let’s set some context here. Worldwide corporate data use is cranking away at something like 8 Zettabytes per year. That’s up from about 0.5 Zettabytes in 2009. What’s a Zettabyte? Well, if a Gigabyte were one second, one Zettabyte would equal 31,688 years.
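The Gigabyte-to-second analogy can be checked with a few lines of arithmetic. This sketch assumes decimal units (a Gigabyte is 10^9 bytes, a Zettabyte is 10^21 bytes) and a 365.25-day year; the function name is made up for the example.

```python
GIGABYTE = 10**9    # bytes, decimal (SI) definition
ZETTABYTE = 10**21  # bytes, decimal (SI) definition
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumes a 365.25-day year


def zettabyte_in_years() -> float:
    """If one Gigabyte were one second, how many years is one Zettabyte?"""
    gigabytes_per_zettabyte = ZETTABYTE // GIGABYTE  # 10**12 "seconds"
    return gigabytes_per_zettabyte / SECONDS_PER_YEAR
```

One Zettabyte is a trillion Gigabytes, and a trillion seconds works out to roughly 31,688 years.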

Here is the thing: in any given company somewhere between a quarter and half of the data falls into what we consider “critical corporate data.” This is the kind of transactional and metadata that has a clear and material impact on the company. Exposing this data is critical toward building meaningful insight.

There is this saying: “Do what you’ve always done | Get what you’ve always got”. It’s a peppy motivational quote. But in the corporate world doing what you’ve always done has a feeling of genuine safety. The institutionalization of this is: “Better the devil you know (than the devil you don’t).” But when it comes to data there is something afoot that upends the apple cart: Data is an ever changing deluge. Old technology scales miserably. It requires larger and larger fiefdoms just to reach the starting line. In other words, if you do what you’ve always done, you will get far, far less than what you’ve always got.

So, You’ve Got Corporate Tuberculosis-Now What?

There is good news. Remember that 20 year old ETL? It cost a fortune and requires dozens of skilled personnel to run. There are plenty of amazing tools out there that completely change the game. K3 just happens to be one of them. At so many of our clients we have liberated data from monolith applications so the business can get at that data and do things with it. We are talking about delivering streaming and cross functional data in weeks not years, and at a fraction of the cost.

I know.  Data Hoarders are a tough nut to crack.  Smart CEOs and CIOs have an open door policy when it comes to dismantling this type of fiefdom.  But even more important is getting Data Hoarders to let go of the fear leading them to hold on to job security for dear life.  If you really want a safe job, be the person that enables democratization of enterprise data.  Trust me, your career will be far more rewarding when you become a key stakeholder in delivering spectacular insight.

How To MAR- A Commodities Primer

We’ve recently had some very interesting conversations with companies about MAR. Despite MAR going live in July 2016, it’s clear that companies are falling into two camps: those that are doing their best to ignore it, and those that are really grappling with some of the finer points of surveilling manipulative behavior. Things like:

  • When does a Spoof become a Spoof, instead of just plain old bidding behavior?
  • How would we detect manipulation in the spot market via the futures market and vice versa?
  • Is there a scenario where MAR might reach into REMIT’s territory?
  • What constitutes a “reasonable suspicion” such that we have to submit a STOR under MAR?

These are all really important questions.  But even as a software vendor we are really encouraging companies to focus efforts on starting their POLICY & PROCEDURE (P&P).  Just for context, I am a guy who sells data connections & surveillance software and I’m suggesting that you don’t buy a single thing… until P&P is in place.

Why?  Well, when it comes to surveillance it pays for companies to take the long view.  The potential surveillance universe is large and the data required is complicated. There is no possible way to take this down in a big bang.

Food for Thought

Your MAR P&P boils down to three things:

  1. How will we surveil for known manipulative practices?
  2. How do we close gaps in our surveillance to cover known about but un-monitored activities?
  3. How do we stay current with manipulative activities that we hadn’t thought about?

For example, most compliance officers know about wash trading as a manipulative behavior.  We can capture data and generate reports that detect this behavior.  But rogue traders are a clever bunch and will certainly come up with manipulative ideas that no one ever thought of.  Likewise when they come up with something it’s going to take some time to close this surveillance gap.  

A surveillance program at any given point in time will have 3 universes.

  • Behaviors that we don’t know about yet (bottom sphere);
  • Behaviors that we know about but don’t monitor yet (big sphere);
  • Behaviors that we know about and monitor (widening sphere)


Now here is the rub. It may take any given company a long while to move into a known and monitored state.

Why?  Well, the biggest risk is that the data is just not available, and may not be for a long time.  In other cases, we can get the data but implementation of surveillance over that behavior will take “x” months.   

So the objective of your MAR P&P is to create a control framework that tracks, plans and expands the scope of known and monitored activity.  This directly leads to an actionable scope for buying software, data connections and the like. This will relieve a lot of pain in the process and prevent your surveillance program from spiraling out of control.

If you would like to kickstart this process, we have a Template Surveillance Policy that covers this and more. Contact Tom Eisner directly at +1-646-461-3820 or tje@broadpeakpartners to get a copy. And as always if you have any questions or comments please don’t hesitate to contact us.

Commodities Surveillance | You’ve Been “MAR’d”

You know what?  Change is like poetry…

….and most people absolutely hate poetry*.

 

This is going to be one of those “Difficult Conversations”.  You know, the kind they teach you about in Business School. We’ve got to talk about some difficult things and find common ground toward a way forward.  So here’s the situation:

MAR (Market Abuse Regulation) is coming in July and there are some things about it that are going to be really, really difficult. It’s a sizable reg but I’m going to jump right to the section which is getting the most attention: MAR requires that EU commodity firms “establish and maintain effective arrangements, systems and procedures to be able to detect suspicious orders and transactions.”

It sounds innocuous enough, but before I get to all the ins and outs of actually executing this mandate, I need to put forth the centerpiece of this difficult conversation:

Some traders will be forced to change how they execute trades.  

I’m also going to say that the search for the ultimate surveillance tool is foolhardy until you’ve got your data strategy thoroughly thought through.

The Devil of Surveillance is in the Details

Let’s talk about what we are up against.

  • Most European companies have begun to look for some type of software solution based on the complexity of their business.  These range from simple to complicated pieces of software and process.  Some solutions have very cool artificial intelligence and other big data features that are a huge leap forward, technologically speaking, for both firm and industry alike.
  • But there are known challenges with surveillance systems. Take equities surveillance. It’s been around for years, and if you ask anyone in it, their complaint is that surveillance solutions take a long time to configure and calibrate. What worries market officers is not actual market manipulation. It’s false positives. Every time your compliance system beeps and tells you it sees suspicious activity, compliance officers are obligated to investigate and potentially report. It is the proverbial boy who cried wolf.
  • Can firms realistically get something up and running by July?
  • But there is one issue that crowns all: Where are commodity companies going to get the data to run any surveillance solution?  In other words, where exactly, do we get all of our order and trade execution data?

I’m not going to kid you.  The answer to where we get all this data is nuanced and nasty enough to drive any compliance officer through the roof.  So here we go:

Take this project we just completed with a pretty typical trading firm. About 1000 physical and financial trades per day across three exchanges and a smattering of OTC trades.

How Do We Get Our Order & Execution Data?

Getting orders is like running a gauntlet with traps on all sides. The primary reason is that the major exchanges and FCMs are pretty new at it and there are a lot of bumps in the road. Let me give you an idea of what it looks like:

Let’s start with ICE. The ICE Private Order Feed will get you all of our traders’ ICE order data. It’s pretty straightforward, but only so long as the traders are using WebICE. Yeesh, looks like our traders are using a bunch of other tools.

Got some traders using TT?

Yep, we’re going to need a connection into TT to get those orders and fold them together with the ICE POF. Totally doable. But it’s not quite as simple as that. You know the TT instance the traders are using? It’s hosted by the FCM. We’re going to have to wrestle with them for a while to allow us access to our own order book. This is not as easy as it sounds.

Done yet?  

Nope, not even close. The orders are flowing fine from ICE, but for CME we are only seeing some from TT. Must be our CME Direct trades. Let’s get them through the drop file. Anyone know a really good VPN guy? Because it’s taking forever and ever to set up what was supposed to be a simple connection to basically get a file. And what do you mean I have to pay for a 500KB connection?

Whoops, our trading operations never set up segregated session IDs. Going to have to shell out for those and wrestle a bit more with the FCM.  As always, this is not as straightforward as it seems.

Cross your fingers: we think we have a full footprint of CME and ICE orders.

How about OTC trades?

OK, yes, over to OTC trades. It’s all physical now and we can aggregate all of the ACER XML order data from our OMPs. Wait! What do you mean the XML is different between these brokers?? OK, we will do some mapping as the data comes in and we should be good to go.

Wait, one more thing! Let’s bring in the full order book so we can get tick values and compare our orders to the market. Phew, at least that’s a well established integration. But it costs how much? WOW. Wait just one second. Tick data is coming in …and there’s an ABSOLUTE TON of data. We estimate we need …..whoa, that’s a lot of storage! How soon can we start procuring servers? Geez, our server team is going to gouge our eyes out on this one.

With that done we’ve made it pretty far… except….

Except that:

  • We still have 3 traders using some trade execution tool that just has no way to capture orders at all.
  • We still have 5 traders using a third exchange that offers absolutely no orders at all.
  • It’s March. We go live in July. We haven’t even started configuring our surveillance solution yet.
  • The above orders & execution process only scratches the surface …For an article I had to leave so much out.

As an integration software company this is our world. We secretly nerd out on it. But I don’t think regulators really understand that as far as market surveillance data goes, this is the kind of cobble, beg, borrow and plead process we have to do to just get orders.

And what does a compliance officer do with the exceptions? Are we really ready to tell traders they can’t use their favorite execution platform because it is entirely “un-surveillable?” Can we shut off trading exchanges that don’t offer order data at all?

The crux of this difficult conversation is exactly that. Off reservation traders are going to have to change how they execute and it’s just not going to be popular. But until these execution tools and exchanges improve their order flow we’re going to have to curtail.

There is also a lot of sales pressure to “get into a system fast.” I’m sorry, but without a really well thought out data strategy there is a risk of buying a system that has nothing but partial and half broken data to analyze. That is going to result in a crushing blow of possibly reportable nonsense.

If you are trying to get your data for MAR we’d be happy to talk. It’s all doable and we’ve learned many tricks along the way. For a July go-live date, including getting a surveillance system up and running, we’re going to need to get moving.

OK, I’m glad we had this talk. 

*Adapted from a Michael Lewis quote in “The Big Short.”

REMIT Table 1 and Table 2

As we approach the deadline for the next phase of REMIT, K3 stands ready to help you with your tough trades.  Having trouble with Table 2?  Need questions answered? Don’t worry just give Tom Eisner a call.

We can help you get your REMIT Table 1 and Table 2 trades converted to a conforming ACER XML no problem.  Here’s a video of the process.

Call Tom Eisner +1-646-461-3820 or email for info.

 

ICE Releases New Wash Trades FAQ

This month the Intercontinental Exchange (aka ICE) released a revised Wash Trade FAQ document. This isn’t simply arbitrary guidance. It comes after considerable review with the CFTC, who list it along with the previous edition on their site. The last version was issued a little over 6 years ago and the new edition addresses washing in the context of the take-over of electronic trading. The old version referred to the “trading floor”, language entirely absent from the new version. Lastly, it moves away from the more securities oriented term “wash sale” and fully adopts the term “wash trade”.

So what is a wash trade?  From the FAQ: “A wash trade is a transaction or a series of transactions executed in the same Commodity Contract and delivery month or Option series at the same, or a similar, price or premium for accounts of the same Principal.”  “A wash trade occurs when there is an act of entering into, or purporting to enter into, transactions with no intent to obtain a bona fide market position…”

In simple terms, if you make offsetting trades in a given product/contract month, you are giving the false appearance of activity in the market.  But this isn’t for a single trader…the “you” in this case is the corporate parent.  Say you have two traders across the pond from each other and trading under different entities.  One buys 100 lots of WTI at the market open and the other sells 100 lots at the market open.  The parent entity will have no net position and this could be considered a wash trade.  Two things stick out…

  1. This says nothing of the intent of the two traders. What it comes down to is that any open/close at the same price is automatically suspect. It’s the compliance officer’s job to establish whether there was proper intent to obtain a bona fide market position.
  2. The updated guidance refers to trades done at a “similar” price. This is much broader than the previous version which defined washing as trades at the “same” price.  It makes sense in context of electronic markets which move fast, but does broaden the scope of what could be a wash. Compliance officers’ job just got harder. Now we have to track everything within a penny? Two?
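A first-pass screen for the pattern described above could look something like this. It is a rough sketch, not the exchange's test or K3's implementation: the Trade fields, the two-cent tolerance and the sample values are assumptions for illustration, and it deliberately ignores quantity, timing and intent, which a compliance officer still has to review.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class Trade:
    trade_id: str
    principal: str       # the corporate parent, not the individual trader
    contract: str
    delivery_month: str
    side: str            # "BUY" or "SELL"
    lots: int
    price: float


def wash_candidates(trades: List[Trade],
                    tolerance: float = 0.02) -> List[Tuple[str, str]]:
    """Flag pairs of opposite-side trades for the same principal, contract
    and delivery month whose prices are within `tolerance` of each other.

    `tolerance` is a hypothetical reading of a "similar" price. A flagged
    pair is only a candidate for review, not a finding.
    """
    flagged = []
    for a, b in combinations(trades, 2):
        same_book = ((a.principal, a.contract, a.delivery_month)
                     == (b.principal, b.contract, b.delivery_month))
        opposite_sides = a.side != b.side
        similar_price = abs(a.price - b.price) <= tolerance
        if same_book and opposite_sides and similar_price:
            flagged.append((a.trade_id, b.trade_id))
    return flagged
```

Matching on the principal rather than the trader is what catches the two-traders-across-the-pond case, and widening `tolerance` is the code-level equivalent of the move from "same" to "similar" price.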

What does this mean for me or my firm?

It means you should have basic capability to monitor every suspect wash trade.  I would go further by saying this requires a substantive command of the data and the analytics to support it.  When the exchange or regulators inquire about possible suspicious trades, quickly producing data to substantiate activity along with a narrative around intent is your best foot forward.

Here is an example: Anytime we trade in and out at or near the same price…it shows up on this chart.

 

Euro Insanity

There’s an old saying that insanity is defined by doing the same thing again and again and expecting different results. There is no shortage of compliance pain under ESMA, REMIT, MiFID…etc:

ESMA recently released an updated EMIR reporting specification which increases the number of reportable fields by 50% to 129. Everyone has been a bit heads-down sorting out ACER II, amidst a growing chorus of complaint about the existing complexity, cost and difficulty of their EMIR solution.

But there is hope for doing something different.

Trade Repositories – It is a buyer’s market for TRs. Some of the original choices have proved to be…well let’s just say “really bad.” And by “really bad” I mean an absolute black hole in terms of service, support and cost. But there is good news. There are other TRs that will go a long, long way to get your business and eliminate the nonsense and bloodletting. With K3, switching TRs is a simple matter.

Automated Reporting – The area where you can get the biggest bang for your Euro. At least 98% of your reporting should be automated and not require human intervention.

Reporting Management and Updates – If anything is certain, regulations will change and reporting specifications will follow. You should be able to manage these changes in your reporting system without the need for digging down into code. As an example, level II updates took half a day with K3.

This is a golden opportunity to take another look at how reporting is done with an off the shelf platform like K3.

#RegReporting2.0

The take away is this:  if your company took up a “Just Report It” approach to reporting and cobbled together a solution and TR that is giving you heartburn, now’s a good time to look at replacement.  We think you will be pleasantly surprised at how cost effective deploying an automated/off the shelf solution like K3 can be.  We have a crazy long list of referenceable clients and would be happy to connect you.

ACER Tidbits
So, we know a lot of people are heads-down on ACER Table 2 reporting. Did you know:

  • Table 2 trades submitted for confirmation on ICE eConfirm can automatically be reported to the ICE RRM? If you can get your counterparty to agree to confirm there, it’s nice because it eliminates a lot of the 2 sided, USI, and other pains of Table 2 reporting. Just match on eConfirm and you are done.
  • K3 automatically converts trades into ACER XML for reporting Table 1 and Table 2 trades.  Once in ACER XML it can be automatically sent to any TR.  If you’d like to see this just send us a note and we will set you up for a demo.  tje@broadpeakpartners.com

2016 | Be Prepared to STOP!

This is the time of year I get prodded to write up a pithy outlook for the year. It seems to be harder this year.  Seemingly, like everyone else I have had my “nose to the grindstone” far too long. That and the fact that my crystal ball looks mostly cloudy with a chance of rain.

So, let’s get the big variable out of the way: We have a weird global economy: fragile, exuberant, volatile and fractious.  Looks great, smells recessionary.  Feels improving, tastes deflationary.  It’s a setup for both some runaway successes and abject failures.

Amidst all this,  what should we expect in 2016?  I had a friend in college who, during a night of antics with her friends, stole a big road sign.  She put it above her bed.  It read: “BE PREPARED TO STOP.”  Got me thinking about 2016.  In this volatile year, whatever you are doing in 2016: BE PREPARED TO STOP.

Here’s what is ripe for the stopping in 2016:

Another Unicorn

A lot of Unicorns will hit the bricks this year.  Don’t get me wrong, there are amazing products out there…Seriously, I can’t live without Uber.  

But what we can live without is the Unicorn capital structure. Here’s why. You know what happens when Unicorn investors are not happy with revenue growth? They stop investing and start pulling the staff change slot machine lever, hoping for the magic combination to come up. When that does not work they fall back on soaking their customers. This starts the revenue downward spin cycle. I’ve seen it again and again, and it always ends in application and company tragedy.

The Ivory Tower of IT Saying “NO”….Again.

The revenue generating parts of the enterprise have always had great ideas on how to do business better, more efficiently, and profitably. And for decades those ideas have been pursued only to hit a ubiquitous IT wall of “NO.” “No” comes in many forms. Usually, something like, “We already have a gazillion dollars invested in (really, really old and custom) technology etc…”

The shift of operations spending IT dollars started years ago. But the business is now pretty much fed up with legacy infrastructure. It’s just too slow…too expensive. Let me be blunt: We know you have a large investment in older technology. We know it requires a lot of control to run. But business requirements and technology are evolving at breakneck speed. In 2016, if IT is not solving business problems and just taking a disposition of saying NO …the business is going to go around you.

Someone Telling Us There Are No Servers Available

In the time it took you to read this far, I have spun up two enormous servers “somewhere else” for the cost of a salad in Midtown Manhattan. What’s more, I have connected them to our network so they are indistinguishable from a server running in our own office. They are, for all practical purposes, absolutely secure. I like to think of it as cloud reversal, because instead of running “stuff on the cloud” I have just pulled the cloud to me.

Today when you say there is no server space…  here’s what happens:

This developer/analyst I know built a really awesome business optimization model. He wants to test in full, but IT says that it will cost $150K in new servers just for a test instance. In other words they said “NO”. So what’s an enterprising analyst to do? He independently set up all the servers he could ever want on Amazon for $200 per month and started testing it there. (Also, charged it to his company credit card.) I’m sorry, but this financial disconnect between what IT says servers cost and what any regular Joe can get is just too wide to survive 2016.

In summary, my read on the market is that there will be plenty of money for projects. Plenty of money for investment. But I think that this is the year companies will really take stock of where they are blowing money on sustaining roadblocks. If you think you are managing something that is super duper expensive and produces terrible results, BE PREPARED TO STOP.

Stay tuned for more 2016 discussions including:
“That Time You Hired Nate Silver and He Quit”
