Understanding the capabilities and disruptive threats of big data is a must for all sophisticated finance professionals today. This article considers the development and the current state of Big Data Finance, as it specifically relates to investment advice, with an application to ETF investing.

Big Data Finance is both a revolution and an evolution that has come to prominence over several decades. Chartists and other technical analysts have used market data to anticipate price patterns since at least the 1920s. Perhaps the first formal breakthrough in Big Data Finance occurred in the 1980s, when companies like Bloomberg began packaging and delivering market data in large sets to many investment professionals, allowing investment advisors to quote accurate market data to their clients.

Next, the internet enabled investment advisors to receive streaming real-time or near-real-time financial data. The innovation spurred the growth of financial technologies such as electronic trading and exchange-traded funds (ETFs), as well as an explosion of exchanges and dark pools. The internet streamlined financial services’ processes and reduced previously prohibitive transaction costs by a factor of 100. This created an influx of new clients for investment advisors.

Today, Big Data Finance is really about managing the scale of data and extracting the information within very large data sets. Today’s big data is about faster, better analytics that help extract the “needle from the haystack” using the latest data science techniques. Storing, managing and integrating ultra-large sets of streaming and historical data of all kinds is the daily work of data scientists. The scope of data being analyzed now includes market data, social media data, news, regulations, announcements, and so on.

For investment advisors, understanding Big Data Finance today is particularly crucial when dealing with the following issues: ETFs, client risk preferences, news of all sorts, order execution, and sudden market effects, such as flash crashes.

For example, ETFs are a popular and low-cost investment medium, yet they may come with many risks. Issued by an investment manager that collects small fees for fund management, today’s ETFs come in all shapes and sizes and seem to span most portfolios imaginable. While ETFs may seem innocuous, their proliferation facilitates stock return synchronicity—a condition whereby portfolio diversification is tossed out the window as the prices of all stocks move south at the same time.
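One rough way to put a number on return synchronicity is the average pairwise correlation of stock returns: the closer it gets to 1, the less diversification a basket of those stocks provides. The sketch below is only an illustration under that assumption. The function names and the toy return series are hypothetical, and the research literature uses more refined synchronicity measures.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long return series.

    Assumes both series have non-zero variance; a constant series
    would cause a division by zero.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_pairwise_correlation(returns_by_ticker):
    """Mean correlation over every pair of tickers (needs at least two)."""
    tickers = sorted(returns_by_ticker)
    corrs = [
        pearson(returns_by_ticker[a], returns_by_ticker[b])
        for a, b in combinations(tickers, 2)
    ]
    return sum(corrs) / len(corrs)


# Hypothetical daily returns, for illustration only.
toy_returns = {
    "AAA": [0.010, -0.020, 0.030, -0.010],
    "BBB": [0.012, -0.018, 0.025, -0.008],
    "CCC": [-0.005, 0.010, -0.015, 0.004],
}
print(round(average_pairwise_correlation(toy_returns), 3))
```

Tracking this figure over time for the constituents of a portfolio gives an advisor a simple early signal that diversification benefits may be eroding.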

Some researchers conclude that ETFs propagate shocks and cause instabilities in the markets, framing the discussion in terms of the information spillover theory. Others propose alternative theories of ETF behavior, pinning responsibility for the ETF and flash crash effect on the market makers. Still other researchers count the so-called smart beta models and ETFs among the causes of flash crashes.

Besides being aware of the negative impact of ETFs on portfolios and on markets as a whole, investment advisors may choose to proactively study the composition of ETFs and the ETFs’ projected movement and risks. Many ETFs are created using derivatives on underlying assets, such as futures and options, not the instruments that the ETFs are designed to track. Prospectuses for ETFs typically mention the composition of an ETF in general terms, but the details are often murky and kept rather confidential for competitive purposes. Instead, most ETF prospectuses contain the following “buyer beware” passage:

The U.S. Securities and Exchange Commission has not approved or disapproved these securities or passed upon the accuracy or adequacy of this prospectus. Any representation to the contrary is a criminal offense. Securities of the Trust (“Units”) are not guaranteed or insured by the Federal Deposit Insurance Corporation or any other agency of the U.S. Government, nor are such Units deposits or obligations of any bank. Such Units of the Trust involve investment risks, including the loss of principal.

Source: “Principal U.S. Listing Exchange for SPDR® S&P 500®, ETF Trust: NYSE Arca, Inc., under the symbol ‘SPY,’” Prospectus dated January 20, 2016.

While such derivative exposure makes ETF manufacturing cheap and easy, it is not at all straightforward to assess the true risks of such opaque instruments. One 2014 research paper examined nearly 7,000 ETFs and found that only 11% of the ETFs are within 1% of the actual mean return and volatility that they are designed to reproduce! In other words, 89% of the ETFs studied are not doing their job of replicating their target baskets of financial instruments!
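A check of this kind can be sketched in a few lines. The example below assumes that "within 1%" means a relative gap of at most 1% in both the mean return and the volatility (sample standard deviation) of the ETF versus its target; the paper's exact definition may differ. The function names, the tolerance parameter, and the return series are hypothetical.

```python
from statistics import mean, stdev


def relative_gap(value, reference):
    """Absolute difference expressed as a fraction of the reference."""
    if reference == 0:
        # No meaningful relative gap against a zero reference.
        return 0.0 if value == 0 else float("inf")
    return abs(value - reference) / abs(reference)


def tracks_target(etf_returns, target_returns, tolerance=0.01):
    """True if both mean return and volatility are within the tolerance.

    Assumption: the tolerance is applied as a relative gap to each
    statistic separately.
    """
    mean_gap = relative_gap(mean(etf_returns), mean(target_returns))
    vol_gap = relative_gap(stdev(etf_returns), stdev(target_returns))
    return mean_gap <= tolerance and vol_gap <= tolerance


# Hypothetical periodic returns, for illustration only.
target = [0.010, -0.020, 0.030, 0.000]
close_etf = [0.010, -0.020, 0.030, 0.000]
leveraged_etf = [2 * r for r in target]

print(tracks_target(close_etf, target))      # identical series
print(tracks_target(leveraged_etf, target))  # mean and volatility doubled
```

Run across an advisor's ETF universe, a screen like this separates funds that statistically resemble their targets from those that merit a closer look at their holdings.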

Dealing with 7,000+ ETFs and their derivatives is squarely a big data problem. Since the exact risk profile of ETFs is usually not accessible to investors, accurately assessing the risks of ETFs could be a valuable service provided by investment advisors to their clients. Providing clients with a more accurate risk profile could help save thousands or even millions of dollars in their portfolios, and in turn help advisors attract a steady following and increase earnings.