The Matrix, but with money: the world of high-speed trading


  • The Matrix, but with money: the world of high-speed trading

    A view into the world of high-speed trading. This is the stuff I worked on with Lehman for a couple years, it's kind of scary how easy it is for things to go wrong with it, though.



    The Matrix, but with money: the world of high-speed trading

    Supercomputers pitted against one another in a high-stakes battle of attack and counterattack over a global network where predatory algorithms trawl the information stream, competing every millisecond to gain an informational advantage over rivals. It sounds like Hollywood fiction, but it's just an average trading day on the stock market.



    It sounds like something out of The Matrix: a giant, world-spanning electronic network where high-powered machines, some of them using GPUs to gain a speed advantage, run secret, rapidly-evolving software algorithms that battle it out for profits in a high-stakes game of cat-and-mouse, attack-counterattack, that yields some $21 billion a year for the winners and can spell ruin for the losers. Except that it's not The Matrix—it's the stock and commodities markets, and the fact that these markets mainly consist now of computers trading against one another has been brought closer to the public's attention by last month's alleged theft of Goldman Sachs' proprietary trading code.

    The collection of computer-automated, high-speed trading technologies and techniques that are typically lumped under the heading of "high-frequency trading" (HFT) have been around for a while, but HFT has recently become heavily identified with the banking giant Goldman Sachs, which dominates some aspects of it on the New York Stock Exchange. And as Goldman draws more media and congressional scrutiny, so will HFT. To prepare you for the high-frequency trading media onslaught, we'll take a look at HFT and at a stock market that really isn't what you thought it was.

    If you look under the hood of the markets in 2009, you'll find that the trading floor has been replaced by electronic networks; the frantic, hand-signaling traders have been replaced by computer systems; and all of the moves in the trader's dance, a thousand little tricks and techniques (some legal, some questionable, and some outright illegal) for taking regular advantage of speed, location, and information to generate profits, are executed hundreds of times per second, billions of times per day. And the whole enterprise is mainly powered by the same hardware from Intel, AMD, and NVIDIA that Ars readers use for gaming.

    The world that has been pulled over your eyes to blind you from the truth

    Press reports of trading days that end with big gains or losses are typically accompanied by shots of a trading floor where young traders are either euphorically throwing papers into the air (up days) or staring dejectedly at a stock ticker with hand pressed to forehead, shoulders slumped in defeat (down days). These guys, you think, are "the market," and if you looked up the New York Stock Exchange (NYSE) on Investopedia you'd find nothing in the "Stocks Basics: How Stocks Trade" entry to disabuse you of this widely held notion. "The NYSE is the first type of exchange... where much of the trading is done face-to-face on a trading floor," Investopedia declares, and it goes on to describe a floor-centered, face-to-face NYSE that hasn't matched reality for about five years.

    Only about three percent of the trading volume on the NYSE is actually carried out by means of traditional "open outcry" trading, where flesh-and-blood humans gather to buy and sell securities. The other 97 percent of NYSE trades are executed via electronic communication networks (ECNs), which, over the past ten years, have rapidly replaced trading floors as the main global venue for buying and selling every asset, derivative, and contract. So the ECNs are the markets in 2009, and those pit traders who pose for the cameras are mainly there for the cameras.

    "Why don't you know BATS?" Bernard Donefer, a finance professor and HFT expert at CUNY's Baruch College, asked me rhetorically. "Because there's nothing to look at. It's based in Kansas; the computers are in Jersey City."

    At the time that Donefer and I spoke last week, BATS was the third largest equity market in the world, behind the NYSE and NASDAQ, and it has been all-electronic since it began life in 2005. There has never been a floor that a CNBC camera crew could report from, so it's essentially invisible to the general public. The NYSE and a few other exchanges keep hanging on to their trading floors "mainly for branding purposes," Donefer told me.

    The ECNs offer the advantages of speed, anonymity, error minimization, and audit trails. They've also "ported" many of the problems endemic to electronic networks—security vulnerabilities, the "garbage-in, garbage out" (GIGO) problem, and the problem of technology moving too fast for lawmakers, to name just three—from the Internet to the markets. But the problems with ECNs are a topic for another day. The real issue is that when the average retail investor gets an E*Trade account and tries to play the stock market, she typically has no idea that she's going up against the market equivalent of IBM's chess grandmaster-thumping supercomputer, Deep Blue.

    Not your father's stock-picking club

    Experts guess that between 60 and 75 percent of the NYSE's daily trading volume is just computers trading against one another using a variety of strategies. Recent HFT investigations by Donefer, Themis Trading, and sites like Zero Hedge have brought to light a lively ecosystem of algorithms, or "algos" in the parlance, that use ECNs in different ways to make money.

    What the vast majority of these algos have in common is that they are not long-term, buy-and-hold "investors" in the classic sense. Rather, they focus on executing as many trades per second as possible and on turning a small profit (often pennies or fractions of a penny) on each trade. This combination of high speed, massive volume, and razor-thin per-trade profits adds up over the course of a day, week, or year to some very large numbers.
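
    To make the "adds up" arithmetic concrete, here is a back-of-the-envelope sketch. Every number in it is a made-up assumption for illustration (the per-trade profit, the trade rate, a 6.5-hour session, 252 trading days); none of it comes from a real trading desk.

```python
def yearly_profit(profit_per_trade, trades_per_second,
                  seconds_per_day=6.5 * 3600, trading_days=252):
    """Multiply a tiny per-trade edge out over a trading year.

    All defaults are illustrative assumptions: a 6.5-hour session
    and 252 trading days per year.
    """
    trades_per_year = trades_per_second * seconds_per_day * trading_days
    return profit_per_trade * trades_per_year


# Hypothetical inputs: a tenth of a cent per trade, 100 trades per second.
print(f"${yearly_profit(0.001, 100):,.0f} per year")
```

    With those hypothetical inputs a single strategy earning a tenth of a cent per trade comes out to a bit under $600,000 a year, which is why volume and speed matter more than the size of any one win.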

    At least two different groups, the TABB Group and FIXProtocol, estimate that high-frequency trading generated around $20 billion in profits for the financial sector last year. Goldman Sachs accounts for some 20 percent of global high-frequency trading activity, and the bank recently had a blow-out quarter in which its HFT-heavy trading operation racked up a record number of days where profits topped $100 million.

    Goldman Sachs may be a major player in HFT, but the bank is by no means the only one. There are thousands of firms that use HFT in the market to varying degrees, with some using HFT exclusively to generate profits. There are whole funds, like those operated by Renaissance Technologies, that consist entirely of a large computer system and some PhD programmers.

    These high-frequency trading platforms consist of two main components: the software algorithms, or "algos," and the underlying hardware. Let's take a brief look at each part in turn.

    Algorithms: like BattleBots, but with money

    Different analysts divide up the world of HFT in different ways, and such division and classification efforts are very much a work in progress right now. Indeed, many would take issue with my lumping together all computer-automated trading practices under the label of HFT; most who follow the space have terms that they prefer, and, like the space itself, the language is still in flux. This being the case, I won't attempt anything like a systematic overview of HFT—instead, I'll treat the space a bit like a zoo, and I'll merely point out a few different animals.
    Iceberging and predatory algos

    One of the most important uses for HFT is to get the best price for very large stock orders by breaking them up into small orders of random sizes and hiding the activity from other traders, who, on sensing that a large order is in progress, might take advantage of that knowledge by making moves that would impact the stock price. Institutional traders use this practice, called "iceberging," quite frequently, and most experts agree that it contributes to the orderly functioning of the markets by reducing wide price swings.
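
    The order-splitting idea can be sketched in a few lines. This is a toy illustration only: the function name, the size bounds, and the seed are all hypothetical, and a real execution algo would also randomize timing and venue, which this sketch ignores.

```python
import random


def slice_order(total_shares, min_child, max_child, seed=None):
    """Split a large parent order into random-sized child orders.

    The child sizes always sum to total_shares. The final child may be
    smaller than min_child, because it just takes whatever is left.
    """
    if total_shares <= 0:
        raise ValueError("total_shares must be positive")
    if not 0 < min_child <= max_child:
        raise ValueError("need 0 < min_child <= max_child")

    rng = random.Random(seed)  # seeded only so the example is repeatable
    children = []
    remaining = total_shares
    while remaining > 0:
        size = min(rng.randint(min_child, max_child), remaining)
        children.append(size)
        remaining -= size
    return children


# Hypothetical 100,000-share order sent out in 100- to 900-share pieces.
pieces = slice_order(100_000, 100, 900, seed=1)
print(len(pieces), "child orders, first few:", pieces[:5])
```

    The point of the random sizes is that a steady stream of identical 500-share prints is easy to spot, while irregular pieces look more like ordinary market noise.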

    Some categories of "predatory algos" closely monitor the markets in order to sniff out exactly these types of hidden large orders, so that the algo can trade against them. For instance, if a predatory algo detects that someone is trying to hide a large sell order for INTC by trickling it out into the market in small blocks, it might work to bid down the price of INTC just a bit so that it can pick up those blocks at a discount and then sell them for a profit when the share price floats back up to the market's earlier, non-manipulated valuation.

    Statistical arbitrageurs
    One very popular variety of high-frequency trader is the statistical arbitrageur, or "stat arb." Stat arbs make their money by vacuuming up mountains of historical data and looking for correlations between various datapoints and asset prices. The stat arb's trading platform, which is basically a large computer system manned by programmers and financial engineers, uses those correlations to build predictive models that take in a stream of information inputs like news reports and stock prices (Thomson Reuters sells a service that fires wire reports at very low latency to these systems), and output a rapid-fire stream of "buy" and "sell" orders for different assets.

    For instance, a stat arb HFT platform might identify a direct correlation between positive news about Steve Jobs' health and increases in the price of AAPL; then the microsecond that the platform receives and processes an in-bound Reuters news packet containing a statement about Jobs' cancer-free status, it would immediately spit out a "buy" order for AAPL on the expectation that Apple stock is about to increase in price once this news becomes more widely known.
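
    A stripped-down version of that "spot a correlation, then trade on it" loop might look like the sketch below. It is an assumption-laden toy: the news "scores," the returns, and the 0.5 threshold are invented, and real stat arb models are far more elaborate (and secret). It only shows the shape of the idea, using a plain Pearson correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def signal(past_scores, past_returns, new_score, threshold=0.5):
    """Return "buy", "sell", or "hold" for an incoming news score.

    past_scores / past_returns are hypothetical history: a sentiment
    score for each past headline and the price move that followed it.
    """
    r = pearson(past_scores, past_returns)
    if abs(r) < threshold or new_score == 0:
        return "hold"  # correlation too weak, or the news is neutral
    bullish = (new_score > 0) == (r > 0)
    return "buy" if bullish else "sell"


# Invented history in which good news tended to precede gains.
scores = [1.0, -1.0, 2.0, -2.0, 0.5]
returns = [0.010, -0.012, 0.020, -0.018, 0.004]
print(signal(scores, returns, new_score=1.0))
```

    Note that nothing here says why the correlation holds. If it stops holding, the same code keeps firing orders until someone notices, which is the weakness the article comes back to.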

    The Apple example above is a bit contrived, but it's hard to come up with a real-world example because the models and algorithms used by stat arb shops are closely guarded trade secrets. These platforms trade every asset class available on the electronic markets—from frozen pork bellies to complex structured financial products—and they constantly scour the earth looking for data that will let them spot correlations and build or refine their predictive models.

    Stat arb platforms don't have to be right all of the time—they only have to make money on the majority of the thousands of trades that they execute each day in order to be very profitable. Making money on these trades requires two things: speed and good correlations. First, the stat arb's platform needs to process the information very fast and get its orders to the exchange quickly, preferably ahead of its competitors, which may be using a similar model and may get to a trade just a few microseconds later. Every fraction of a second lost is a penny or two lost to the competitor who got to the trade just before you did.

    The other thing that stat arbs need is for the correlations that they've spotted to actually hold true; or, if those correlations stop working, then they need to be able to find new correlations that do work with a relatively quick turnaround. There are some periods before a major disruption when well-functioning market correlations begin to distort en masse; in such periods, the most recent of which was in August of 2007, the so-called "market neutral" funds that rely on these correlations to turn profits regardless of the market's overall direction can begin to tank dramatically.
    Dark pools

    The final animal in the HFT menagerie that I'll point out on this brief tour is the automated market maker (AMM), which is a subtype of what is often called "dark pools," or "dark liquidity." AMMs like Citadel and GETCO always stand ready to buy and sell large quantities of assets, and they don't publish price quotes to other market participants via exchanges.

    To find out what assets a dark pool will either sell or buy and at what price, you first have to ping it. Once you ping the pool with a request to, say, buy a specific asset, the pool will reply with the price that it's willing to sell you that asset for. You can either accept the price and complete the transaction, or turn it down and ping again later to see if the price has moved in your direction.
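
    The ping, quote, accept-or-walk-away exchange described above can be mocked up as follows. The class and function names are hypothetical and the pricing rule (a fixed markup around a hidden mid price) is an assumption made purely so the example runs; it is not how any real pool sets prices.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    symbol: str
    side: str     # "buy" or "sell", from the requester's point of view
    price: float


class ToyDarkPool:
    """A pool that publishes nothing and only answers direct pings."""

    def __init__(self, mid_prices, half_spread):
        self._mid = dict(mid_prices)      # hidden from the outside world
        self._half_spread = half_spread

    def move_mid(self, symbol, new_mid):
        self._mid[symbol] = new_mid

    def ping(self, symbol, side):
        mid = self._mid[symbol]
        if side == "buy":
            price = mid + self._half_spread   # pool sells a bit above mid
        elif side == "sell":
            price = mid - self._half_spread   # pool buys a bit below mid
        else:
            raise ValueError("side must be 'buy' or 'sell'")
        return Quote(symbol, side, round(price, 2))


def try_to_buy(pool, symbol, limit_price):
    """Ping once; take the quote only if it is at or under our limit."""
    quote = pool.ping(symbol, "buy")
    return quote if quote.price <= limit_price else None


pool = ToyDarkPool({"INTC": 19.00}, half_spread=0.01)
print(try_to_buy(pool, "INTC", limit_price=19.00))  # quote too high, declined
pool.move_mid("INTC", 18.95)                        # price drifts our way
print(try_to_buy(pool, "INTC", limit_price=19.00))  # ping again, accepted
```

    The requester never sees an order book, only the one price that comes back from each ping.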

    Dark pools, then, let traders completely sidestep normal stock and commodities exchanges in order to buy and sell assets without having to broadcast their desired price to the rest of the world. This anonymity has a number of uses, the most important of which is that it makes it easier to implement the "iceberg" strategy described above.

    As with the other creatures in our HFT menagerie, the AMM makes money a little at a time, but in volume. These massive, computer-automated brokers turn a profit by making money on the bid/ask spread and by not being too exposed to price movements in any one asset class.
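
    Here is a minimal sketch of what "making money on the bid/ask spread" means in bookkeeping terms. Prices are in integer cents to avoid rounding noise, and the quotes and order flow are invented for the example.

```python
def run_market_maker(bid_cents, ask_cents, customer_flow):
    """Fill every customer order at our quotes; return (cash, inventory).

    customer_flow is a list of (side, shares) tuples from the customer's
    point of view: a customer "buy" lifts our ask, a "sell" hits our bid.
    """
    cash = 0
    inventory = 0
    for side, shares in customer_flow:
        if side == "buy":
            cash += ask_cents * shares    # we sell at the higher price
            inventory -= shares
        elif side == "sell":
            cash -= bid_cents * shares    # we buy at the lower price
            inventory += shares
        else:
            raise ValueError("side must be 'buy' or 'sell'")
    return cash, inventory


# Hypothetical quotes of $10.00 bid / $10.01 ask and balanced two-way flow.
flow = [("sell", 100), ("buy", 100), ("sell", 200), ("buy", 200)]
cash, inventory = run_market_maker(1000, 1001, flow)
print(f"profit: {cash} cents, leftover inventory: {inventory} shares")
```

    When buys and sells balance out, the maker ends flat and keeps one cent per share traded each way. If the flow is lopsided, the leftover inventory is exactly the price exposure that the article says these firms try to keep small.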

    Dark pools have come under fire precisely because of the secrecy and anonymity that they're designed to enable. In most respects, these pools are the polar opposite of the "open outcry" markets of yesteryear, and that lack of transparency is raising hackles. There are accusations that dark pool owners abuse their privileged role as an intermediary in these opaque markets by somehow manipulating prices and/or increasing the bid/ask spread, but even those who aren't willing to go so far as to level accusations of illegal or unethical behavior see in this secrecy and lack of oversight the potential for massive abuse.

    Hardware: game on

    Because high-frequency trading is, as Richard Bookstaber has recently described it, an "arms race" where relative speed matters much more than absolute speed, this market is one of the few left with a demand for raw performance at any cost. Indeed, my personal introduction to the world of HFT came in bits and pieces over the past few years via parts of briefings from Intel, NVIDIA, AMD, and their would-be competitors, all of whom have been aggressively pursuing this market.

    Anecdotally, NVIDIA has been fairly successful in making inroads into the market with CUDA, despite any concerns about vendor lock-in. The amount of money potentially gained by a significant speed advantage is so high that programmer time is cheap by comparison, and the result is that HFT shops quickly began to experiment with CUDA when NVIDIA released the SDK.

    Intel is likely to be another beneficiary in this space, since its Xeon parts have maintained a consistent top-end performance lead vs. AMD over the past few years. I say "likely to be," because HFT shops are very secretive about every aspect of their platforms. The only information that I have about the popularity of Xeon in HFT comes informally from Intel, so it doesn't count for much. I've heard that NVIDIA and CUDA are popular from multiple sources, though.

    In all, it's ironic that the hardware that HFT platforms are using to battle it out over stocks, bonds, commodities, and other assets is essentially the same as the technology that PC gamers are using to play their own games with much lower stakes.

    The other key ingredient to the success of any HFT platform is low network latency. The platforms are greatly helped in this regard by the fact that many exchanges will let HFT platforms pay to co-locate their servers with those of the exchange itself, so that the HFT platform can get its order in ahead of the competition. Critics contend that such co-location deals provide avenues for potential front-running of orders, in which an HFT platform gets an advance peek at an incoming order that will move a stock's price in a specific direction, and then uses that knowledge to make a quick bet on the impending price move.

    Too much, too fast?
    I mentioned earlier in this article that high-frequency trading was "estimated" to account for between 60 and 75 percent of all available market volume. This number, which you might think would be important to know, is only one of a number of survey-based, ballpark estimates; the real numbers aren't knowable because algo trades aren't marked as such. In other words, we have no way to tell how much of the current stock market activity—both prices and volume—is the result of computers trading against each other in the manner described above.

    It's also not clear whether all of this computerized buying and selling is actually good for the markets and for society as a whole. Couldn't we as a society better spend all of this money, computer power, and PhD brainpower on, say, coming up with a fossil fuel replacement? Supporters of HFT respond that their platforms provide much-needed market liquidity. They argue that, without HFT, there may not be enough buyers or sellers for a particular asset, so the market in that asset just stops functioning smoothly.

    Not everyone is convinced that liquidity is worth the attendant risks of HFT, which are very difficult to quantify when you're looking at HFT's potential impact on the market as a whole.

    Apart from the issues of transparency and oversight raised by the HFT approaches described above, there's also the possibility that HFT, with all of its enormous speed and complete automation, poses a larger systemic risk to our markets.

    At the back of everyone's mind is the 1987 program trading crash, described by Richard Bookstaber in A Demon of our Own Design. In the run-up to October of 1987, all of the major market participants had been using essentially the same computer-automated algorithm to hedge their portfolio risk. On Black Monday (10/19/1987), all of the portfolio insurance programs started dumping assets in lock-step, in response to a particular set of inputs. This synchronized selling begat more synchronized selling, and by the time this giant, market-sized feedback loop was shut down by the closing bell, the Dow had lost almost 23 percent of its value in a single day.
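
    The feedback loop is easier to see in a toy simulation. To be clear, this is not a model of the 1987 crash: the sell rule, the thresholds, the initial shock, and the price impact per seller are all assumptions chosen to show how identical rules can fire in lock-step.

```python
def simulate_cascade(thresholds, shock, impact_per_seller, max_rounds=100):
    """Return the price path after an initial shock.

    Each fund sells once, as soon as the drop from the peak reaches its
    threshold. Every sale pushes the price down by impact_per_seller,
    which can trip the thresholds of funds that had not sold yet.
    """
    peak = 100.0
    price = peak * (1 - shock)
    path = [peak, price]
    sold = [False] * len(thresholds)

    for _ in range(max_rounds):
        drawdown = 1 - price / peak
        triggered = [i for i, t in enumerate(thresholds)
                     if not sold[i] and drawdown >= t]
        if not triggered:
            break  # nobody new wants to sell, so the loop dies out
        for i in triggered:
            sold[i] = True
        price *= (1 - impact_per_seller) ** len(triggered)
        path.append(price)
    return path


# Ten funds with the same hypothetical 2% trigger, hit by a 3% shock.
same_rule = simulate_cascade([0.02] * 10, shock=0.03, impact_per_seller=0.02)
# Ten funds with staggered triggers, hit by the same shock.
mixed = simulate_cascade([0.02 * k for k in range(1, 11)],
                         shock=0.03, impact_per_seller=0.02)
print("identical rules, final price:", round(same_rule[-1], 2))
print("staggered rules, final price:", round(mixed[-1], 2))
```

    In the identical-rules run every fund sells in the same round, so one modest shock becomes a single large drop. With staggered triggers the selling arrives in waves, and whether it stops or keeps feeding on itself depends on how big each sale's price impact is relative to the gaps between triggers.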

    Most of the debate around HFT is between those who think that a similar crash could not only happen again, but could be many times worse because of the aforementioned increases in speed and trading volume, and those who insist that we don't yet know enough to make that call. It could be that this fast-moving system as a whole could quickly and dramatically fail in some unforeseen way, due to a combination of an external shock and unseen internal fragility; or, it could be redundant and robust enough to keep humming along in the face of anything we (or Mother Nature) throw at it.

    Either way, though, HFT's combination of speed, volume, secrecy, and lack of human oversight and intervention worries even those who trust the human players not to use their machines to cheat at the game.
    "The issue is there are still many people out there that use religion as a crutch for bigotry and hate. Like Ben."
    Ben Kenobi: "That means I'm doing something right. "

  • #2
    Article 2:


    Goldman's secret sauce could be loose online; markets beware

    The secret program that controls the computer-automated trading desk at the nation's top investment bank has escaped onto the Internet, thanks to a rogue company programmer. But that's just the start of the story, and it's getting stranger still.



    A Russian programmer named Sergey Aleynikov was picked up this past Friday by the FBI for allegedly stealing and passing along code that, if circulating out in the wild, could expose US markets to manipulation and cost Aleynikov's former employer, Goldman Sachs, millions. Bloomberg quotes assistant US Attorney Facciponti saying that "there is a danger that somebody who knew how to use this program could use it to manipulate markets in unfair ways. The copy in Germany is still out there, and we at this time do not know who else has access to it."

    So how could a 32MB compressed source code archive pose a threat to markets and to America's most powerful investment bank? The story is actually less complex than it may sound.

    Recovering the black box

    In a nutshell, the "black box" trading platforms of Goldman and other banks use a combination of proprietary, secret algorithms and the fastest hardware available to take in a torrent of news and other market data and generate a stream of trades that are timed to the millisecond. So, instead of operating on the old stock market adage, "buy on the rumor, sell on the news," a high-frequency trading platform like Goldman's will buy a few milliseconds after the news hits, then sell moments later at a very small premium to other traders and platforms who didn't get the news in (or their trades out) quite as fast. Do this billions of times a day, and voila, you're printing money.

    Obviously, to make a scheme like this work, you need a few things, one of which is hardware that's at least a few milliseconds faster than everyone else's (see my previous post on the high-frequency trading "arms race"). The second main ingredient is software that, given a set of data inputs, can figure out which trades everyone else is likely to make in response to those same inputs, so that the platform can get there first and be holding those assets when everyone else suddenly decides they want them.

    If you have your hands on the code that runs on Goldman's trading platform—again, one of the largest in the world—then you know with 100 percent accuracy which trades Goldman's computers are going to make in response to a given set of inputs. All you need then is even faster hardware so that you can get to those trades just a few milliseconds before Goldman, and you'll always beat the bank and therefore be able to sell to Goldman at a slight premium. Goldman will therefore make less on every trade, since you'll essentially be usurping their place in the pecking order.

    When US government prosecutors claim that the release of Goldman's secret sauce could potentially expose markets to manipulation, what they're really saying is that some unknown party could use it to out-manipulate Goldman, and possibly even do something more ambitious like frustrate Goldman's platform so that it fails while simultaneously finding some way to short it. Given that Goldman's platform is one of the main providers of liquidity to the market (i.e., it fills a market function by holding assets that everyone will want shortly, and then selling them to all comers), it would ostensibly be a bad thing if it suddenly blew up.
    How Aleynikov did it

    The FBI's complaint (PDF) in the case describes how Aleynikov pulled off his heist of the code. During the first five days in June, the programmer, whose LinkedIn profile describes him as VP of Equity Strategy, ran some scripts via a bash shell that copied and compressed a bunch of source code, then sent it out via HTTPS to a German server—about 32MB total over four separate occasions, which is actually quite a lot of compressed ASCII source code.

    Aleynikov tried to cover his tracks by having the script erase his bash history, but Goldman's machines actually keep a backup of everyone's bash history, which is how they figured out what he had done. The bank was tipped off by the HTTPS transfers, which seem to have set off some sort of alarm that invited further scrutiny.

    The programmer had informed Goldman that he was quitting and going to work for Chicago-based Teza Technologies, LLC, another high-frequency trading shop that has now suspended his employment in the wake of his arrest. He was released on a $750,000 bond today and now awaits trial.

    This story is still developing, and I encourage you to read the second half of Matthew Goldstein's Reuters story, which is where the arrest first came to light, to get a sense of where it's headed. (Zero Hedge is also on top of this). In particular, there are a number of very odd twists here, the latest of which makes the New York Stock Exchange look particularly bad.

    The NYSE puts out a weekly list of the top program traders by volume, and Goldman typically tops this list by a country mile. Then last week's list came out, and Goldman's name was shockingly absent. And today, now that the code theft story is out, the NYSE has put out a statement claiming that Goldman's absence on the list was the result of a "system error;" it has also released a revised list showing Goldman once again dominating program trading activity.

    Needless to say, many econ bloggers are incredulous that the top entrant in the weekly program trading list suddenly went missing last week and nobody at the NYSE caught the error before now, especially given the Aleynikov news and the timing of the "error." Conspiracy theories are legion, and even if none of them are true, it's hard to shake the feeling that this story is about to blow up into a major scandal.
    "The issue is there are still many people out there that use religion as a crutch for bigotry and hate. Like Ben."
    Ben Kenobi: "That means I'm doing something right. "



    • #3
      From the scuttlebutt I've been hearing the loss of the Goldman code probably isn't that big a deal. The actual secret high freq algorithms supposedly change so often that it'll just force GS to halt hf trading for a few weeks while they bring up a sufficiently new version to make it impossible for reverse engineering their trades to bite them in the ass. The only supposedly semi-stable thing is the truly backend infrastructure stuff, which is definitely valuable, but not really that secret.
      12-17-10 Mohamed Bouazizi NEVER FORGET
      Stadtluft Macht Frei
      Killing it is the new killing it
      Ultima Ratio Regum



      • #4
        There's enough movement into and out of GS that it's impossible to believe that they would rely on stable algorithms to conduct hf trades. Even if you can't bring the code with you it wouldn't be that hard to recreate most of its behaviour when you move to another company. Unless Goldman's been able to enforce draconian non-competes (as in much, much longer than the standard 1-3 months or so) it has to alter its code more frequently than this anyway...
        12-17-10 Mohamed Bouazizi NEVER FORGET
        Stadtluft Macht Frei
        Killing it is the new killing it
        Ultima Ratio Regum



        • #5
          By the way, IIRC this guy's an astrophysicist.

          12-17-10 Mohamed Bouazizi NEVER FORGET
          Stadtluft Macht Frei
          Killing it is the new killing it
          Ultima Ratio Regum



          • #6
            Interesting analysis KH. Thanks.



            • #7
              Note: I am not yet a quant. I just play one on Poly.
              12-17-10 Mohamed Bouazizi NEVER FORGET
              Stadtluft Macht Frei
              Killing it is the new killing it
              Ultima Ratio Regum



              • #8
                asher: answer my latest c++ question on my thread in techno area. Kuci will just try to explain to me that I should use C and preprocessor loops. BC will say something useless and trivial. You're my only hope...
                12-17-10 Mohamed Bouazizi NEVER FORGET
                Stadtluft Macht Frei
                Killing it is the new killing it
                Ultima Ratio Regum



                • #9
                  Originally posted by KrazyHorse View Post
                  By the way, IIRC this guy's an astrophysicist.

                  If so, he's got quite an impressive software development background according to his linkedin profile:

                  SYSTEMS: UNIX/Linux, Windows.
                  LANGUAGES & FRAMEWORKS: C/C++, STL, Template Metaprogramming, lock-free algorithms, Erlang/OTP, Kylix/Delphi/Pascal, Mono, C#/.NET, F#, OCaml, SQL, PL/SQL, Perl/shell scripting, JavaScript, HTML, XML.
                  DATABASES: Oracle 7/8i/9i/10g, Sybase, TimesTen, FastDB, GigaBASE, Mnesia.
                  PROTOCOLS & API: TCP/IP, UDP, SCTP, FIX, Nasdaq ITCH/OUCH, NYSE CCG FIX, SS7/ISUP, SIGTRAN/M3UA, SNMP, SIP, SOAP, TibcoRV, MPI, SMTP, JABBER, JSON, AJAX, Win32, pthreads, futexes, async I/O.
                  He also contributed to many open source Erlang projects, which is pretty exceptionally geeky.
                  "The issue is there are still many people out there that use religion as a crutch for bigotry and hate. Like Ben."
                  Ben Kenobi: "That means I'm doing something right. "



                  • #10
                    Originally posted by KrazyHorse View Post
                    asher: answer my latest c++ question on my thread in techno area. Kuci will just try to explain to me that I should use C and preprocessor loops. BC will say something useless and trivial. You're my only hope...
                    In a bit, I just posted these articles I'd read this morning. I wasted my morning at Crappy Tire in the waiting room so I'm catching up at work.
                    "The issue is there are still many people out there that use religion as a crutch for bigotry and hate. Like Ben."
                    Ben Kenobi: "That means I'm doing something right. "



                    • #11
                      Oops. It's the guy who's in charge of the firm he was jumping ship for that's an astrophysicist (princeton)
                      12-17-10 Mohamed Bouazizi NEVER FORGET
                      Stadtluft Macht Frei
                      Killing it is the new killing it
                      Ultima Ratio Regum



                      • #12
                        Originally posted by Asher View Post
                        In a bit, I just posted these articles I'd read this morning. I wasted my morning at Crappy Tire in the waiting room so I'm catching up at work.
                        np. just wanted to point you there. whenever you have the time
                        12-17-10 Mohamed Bouazizi NEVER FORGET
                        Stadtluft Macht Frei
                        Killing it is the new killing it
                        Ultima Ratio Regum



                        • #13
                          I'm more impressed by his $1.2M salary.
                          "The issue is there are still many people out there that use religion as a crutch for bigotry and hate. Like Ben."
                          Ben Kenobi: "That means I'm doing something right. "



                          • #14
                            1.2M is high, but not absurdly high. Certainly well above average range (~400-500k after ~10 years)
                            12-17-10 Mohamed Bouazizi NEVER FORGET
                            Stadtluft Macht Frei
                            Killing it is the new killing it
                            Ultima Ratio Regum



                            • #15
                              I think that I have heard a bit of talk about Renaissance Technologies at the lab, apparently it is one of the firms that employs lots of physicists.

                              JM
                              Jon Miller-
                              I AM.CANADIAN
                              GENERATION 35: The first time you see this, copy it into your sig on any forum and add 1 to the generation. Social experiment.
