What Financial Quants Tell Us About Our Integrated Analysis Platform and Haskell

February 18, 2014 Aaron Contorer


Many of the most enthusiastic users and prospective users of Haskell are in the field of data analysis. This includes people in finance, science, Internet services, and other industries. We recently announced a special focus on finance.

While people love the Haskell language once they learn it, companies are also asking us to lower the barriers to adoption. We started working on this with our free School of Haskell, and continued it by releasing FP Haskell Center. With our upcoming work, we are adding features that directly reduce both the amount of code people have to write and the amount of Haskell cleverness they need to get started. This is the key to wider Haskell adoption.

In our recent technology preview screencast video, you can see a sample of this: we generate a working, end-to-end data analysis application before the user has to write any code, and then let the user focus on chaining together simple data transformation functions, and of course on modifying them and writing new ones where needed. The idea is simple: not every developer needs to become an adept architect before putting Haskell to work. Just as we already have general Web-app frameworks, we are now starting to provide full data-analysis frameworks. In the video you'll recognize the released version of FP Haskell Center, together with pre-release (preview) versions of some new tools and libraries in our new product, and even some domain-specific language (DSL) work.
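To make the chaining idea concrete, here is a minimal sketch in plain Haskell. It is not the actual generated code or any FP Haskell Center API; the Tick type and the individual transformation steps are hypothetical. The point is that each step is a small, single-purpose function, and the whole analysis is just their composition:

```haskell
import Data.List (sortBy)
import Data.Ord (Down (..), comparing)

-- A hypothetical record for one observation in a price series.
data Tick = Tick { symbol :: String, price :: Double } deriving Show

-- Each transformation step is a small, single-purpose function.
dropBadTicks :: [Tick] -> [Tick]
dropBadTicks = filter (\t -> price t > 0)

topMovers :: Int -> [Tick] -> [Tick]
topMovers n = take n . sortBy (comparing (Down . price))

-- The whole analysis is just the composition of its steps.
analyze :: [Tick] -> [Tick]
analyze = topMovers 3 . dropBadTicks

main :: IO ()
main = mapM_ print (analyze sample)
  where
    sample =
      [ Tick "AAPL" 75.0
      , Tick "MSFT" 37.6
      , Tick "BAD" (-1.0)
      , Tick "GOOG" 601.4
      ]
```

Because each stage is an ordinary function, swapping a step, reordering the pipeline, or adding a new transformation is a one-line change.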

One of the things we've learned from talking with users is that data handling is critically important. It may sound mundane, but in finance it is essential because of the "garbage in, garbage out" problem. While many Haskellers focus on the "glamorous" analysis code itself, putting Haskell to work inevitably involves connecting it to interesting data sources and, most often, to interesting data sinks or outputs as well. In our latest work, we're building tools and libraries to greatly ease these connections. This includes:

  • Tools for automatically turning existing data sources (files, databases, feeds/tickers) into error-resistant, efficient Haskell data structures (see the sketch after this list).
  • Improvements to data connection libraries.
  • Application frameworks for easily scrubbing data, and for easily chaining together data transformation and analysis components.
  • Sample and template code for end-to-end data analysis applications including inputs, human-readable outputs, and machine-readable outputs.
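As a rough illustration of the first bullet above, here is a minimal sketch using the existing cassava CSV library. The Quote row type, the column names, and the quotes.csv file are assumptions for the example, not our actual tooling; the point is that raw input becomes typed Haskell values, and malformed data surfaces as an explicit error rather than silent garbage:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Csv (FromNamedRecord (..), decodeByName, (.:))
import qualified Data.ByteString.Lazy as BL
import qualified Data.Vector as V

-- A hypothetical row type for a quotes file
-- with "symbol" and "price" columns.
data Quote = Quote { qSymbol :: String, qPrice :: Double } deriving Show

instance FromNamedRecord Quote where
  parseNamedRecord r = Quote <$> r .: "symbol" <*> r .: "price"

main :: IO ()
main = do
  csv <- BL.readFile "quotes.csv"  -- hypothetical input file
  case decodeByName csv of
    -- Garbage in, explicit error out: a malformed row fails loudly.
    Left err          -> putStrLn ("bad input: " ++ err)
    Right (_, quotes) -> V.mapM_ print (quotes :: V.Vector Quote)
```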

As we continue to develop more tools for Haskell users, we are focusing on the things data analysts tell us they still need:

  • Further expansion of libraries for mathematical & statistical operations over vectors & matrices
  • Stronger tools for data importing, scrubbing, and normalizing
  • Domain-specific libraries for the industries where Haskell is most used, such as finance
  • Stronger libraries or tools for data visualization and exploration
  • Easier high-level ways to construct complex analyses out of existing components
  • In some cases, avoiding GPL-licensed dependencies, for people at companies that cannot publish their proprietary analysis source code but would still like to distribute compiled applications

One piece of feedback interested me very much, and came as a bit of a surprise (though perhaps it shouldn't have). Today, many professionals develop their analyses in one environment (such as Excel or Matlab) and then, once experimentation is complete, hand them to another programmer to be completely rewritten in Java, Python, C, or C#. This delays innovation and greatly increases cycle times and error rates. Many analysts love the idea (and some have now attained the reality) that with Haskell, one can use the same code from rapid, easy prototyping all the way through to high-performance, big-data production. That's a benefit Haskellers could talk about more.
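To show what that can look like in practice, here is a minimal sketch; the Signal module and movingAverage function are hypothetical examples, not product code. A pure analysis function like this can be explored interactively in GHCi during prototyping and then compiled, unchanged, into a production binary:

```haskell
module Signal (movingAverage) where

-- n-period simple moving average over a price series.
movingAverage :: Int -> [Double] -> [Double]
movingAverage n xs
  | n <= 0        = []
  | length xs < n = []
  | otherwise     = avg (take n xs) : movingAverage n (tail xs)
  where
    avg window = sum window / fromIntegral n
```

In GHCi, movingAverage 3 [1,2,3,4,5] evaluates to [2.0,3.0,4.0]; the compiled production executable imports the very same module, so nothing needs to be rewritten between experimentation and deployment.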

This sort of end-to-end or “straight-through” process has already been adopted by some companies, and more are talking with us about adopting it. Haskell’s unique trio of benefits — easy coding, high quality, and high performance — means that it is well suited to the whole process, from experimentation to detailed prototyping and right through to scaled-up production. Data analysts tell us that by eliminating the need to use different tools for different stages of the work, Haskell brings a new kind of productivity boost.

Our upcoming Integrated Analysis Platform combines the inherent power of the Haskell language and its technical infrastructure (such as GHC) with new features: automatic data bindings, easy assembly of data-transformation pipelines, easy repository management and reuse of data-scrubbing and data-analysis components, easy deployment, rapid turnaround, and ready-to-use visualization capabilities. We believe it is a big step forward. Whether you are hoping to see Haskell used more, or simply hoping to see businesses make better analytic decisions and more efficient use of the world’s resources, that’s a good thing.

As always, we appreciate your feedback and look forward to hearing from you. You can comment on this blog or email us at sales@fpcomplete.com.