GoodData’s Open Analytics Platform

Jaroslav Gergic

Back in 2007, Marc Andreessen, co-founder of Netscape Communications and an investor in GoodData, wrote his legendary blog post entitled “The three kinds of platforms you meet on the Internet”. He defined three levels of a platform as follows:

Level 1 - “Access API”: “platform’s apps run elsewhere, and call into the platform via a web services API to draw on data and services – this is how Flickr does it.”
Level 2 - “Plug-In API”: “platform’s apps run elsewhere, but inject functionality into the platform via a plug-in API – this is how Facebook does it.”
Level 3 - “Runtime Environment”: “platform’s apps run inside the platform itself – the platform provides the “runtime environment” within which the app’s code runs.”

In the same year, GoodData was founded by serial entrepreneur Roman Stanek to take on the calcified world of Business Intelligence (BI). Unlike other BI vendors, we chose a radically different approach. We did not try to build yet another closed BI tool with its inherent strengths and unavoidable weaknesses. We were determined to build an open BI platform. Last week, after years of continuous innovation from over 150 engineers who powered nearly 100 major platform releases, we officially unveiled our next-generation, end-to-end Open Analytics Platform.

It was not always an easy journey. Our #1 rule was that every feature and function of the platform must be exposed via a public REST API, which makes GoodData a Level 1 platform according to Marc’s definition above. This rule added costs and was not always easy to follow. Believe it or not, even our rich HTML5-based UI has been built 100% on the public REST API from day one.


GoodData Interactive Dashboards

We had moments when we seriously considered dropping that rule to speed up the delivery of end-user features. In the end we held to it, and the platform we have today allows our customers and partners to leverage the openness of the GoodData platform to make it an integral part of their own applications and to automate every aspect of their GoodData deployments.

Bringing GoodData Platform to the Next Level

This is not where we are going to stop. Automating our platform via APIs still requires our customers and partners to host portions of their code and infrastructure on their own servers with all the operational overhead. For partners developing custom data-mining techniques on top of our platform, it also means their code typically runs far away from the data. And this is what we are going to change in 2014.

We are attacking this on two fronts. The first is in the Data Discovery and Data Visualization realm. While we already have a rich library of UI elements and data visualization options, it is obvious that we cannot address every imaginable UI solution. Some exploratory techniques are quite niche and serve narrow audiences, while others are just emerging or evolving so fast that it is almost impossible to codify common practices and bake them directly into our platform. The variety of data discovery techniques is so vast that we could easily spend the rest of our lives developing every possible combination. To turn this problem into an opportunity, we are developing a client-side JavaScript SDK that lets third parties develop custom UI widgets and data visualizations and run them on top of the GoodData platform. This will let our partners focus on adding value by developing their core competencies in data discovery and data visualization, while leveraging our rich platform for other functions such as data loading, number crunching, security and collaboration.

Secondly, a similar revolution is taking place behind the scenes in the server-side Data Governance realm. We are turning the GoodData Platform into a full-featured Runtime Environment by allowing third parties to deploy custom Ruby code that we will execute on our platform as requested. This will allow clients and partners to manipulate their data before it is loaded into GoodData data marts, called Projects in GoodData lingo, for interactive data exploration.
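To make the idea of manipulating data before it is loaded more concrete, here is a minimal sketch of what such a pre-load step might look like in plain Ruby. It is an illustration only: the method name `clean_rows`, the field names `customer` and `amount`, and the sample data are all hypothetical, and nothing in it is a GoodData API.

```ruby
# Hypothetical pre-load step: tidy raw records before they are handed
# to a loader. Plain Ruby only; no GoodData API is used or implied.
def clean_rows(rows)
  cleaned = rows.map do |row|
    amount = row["amount"].to_s.strip
    next if amount.empty? # skip records that carry no amount

    {
      "customer" => row["customer"].to_s.strip.downcase, # normalize names
      "amount"   => Float(amount).round(2)               # numeric, 2 decimals
    }
  end
  cleaned.compact # drop the records skipped above
end

# Made-up sample input standing in for data pulled from a source system.
raw = [
  { "customer" => " ACME ",  "amount" => "10.456" },
  { "customer" => "Globex",  "amount" => "" },
  { "customer" => "Initech", "amount" => "7" }
]

cleaned = clean_rows(raw)
cleaned.each { |row| puts "#{row['customer']}: #{row['amount']}" }
```

In a real deployment a step like this would presumably sit between the data source and the load into a Project; how code is packaged and scheduled on the platform is not covered by this sketch.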

We are not going to replicate Heroku or Google AppEngine by providing yet another generic Platform as a Service (PaaS) for general-purpose application development. The GoodData Runtime Environment is all about data. Centered around our high-performance HP Vertica-based Data Storage Service (DSS), the data warehouse in the cloud, it puts all other GoodData Platform services one network hop away from your custom code. We provide a highly secure (all data is encrypted at rest), SOC 2 certified, sandboxed environment that enables our customers and partners to let their imagination run free as they build new solutions on the GoodData backbone.

We imagine that this new runtime environment will allow our customers and partners to implement various kinds of segmentation, data mining, statistical modeling and predictive analytics solutions on top of the GoodData platform, which they will be able to deliver to their customers and users via our existing user interface.

This is meant to complement the existing CloudConnect application, which is tailored towards a more traditional BI developer audience. While the new technology allows for greater flexibility, it does require a developer-type persona rather than a BI consultant. The runtime environment sandbox uses the JVM (and Linux containers for additional security and resource management). We will likely add support for more programming languages in the future that will be able to leverage the ecosystem of existing platform components and libraries.

Turbocharging Your Data Analytics Solution

As we learned ourselves, the BI domain is like a vast and deep ocean: you need to invest a lot to cover all the aspects of BI, from data loading and storage, through analytical engines, all the way up to visualization and collaboration. And you need to invest even more to achieve a scalable, reliable and secure technology. If you want to innovate in the BI space, why try to replicate what GoodData has already done over the past six years? Why not focus on your core competency and let GoodData take care of the rest? Amazon Web Services (AWS) has spurred innovation by lowering the barrier to entry for new software startups, because new companies no longer need to buy hardware and run a datacenter. In a similar fashion, GoodData is changing the domain of BI. If you want to innovate in the Data and Analytics space in 2014, there is no need to start from scratch.

Stay tuned for more exciting news in 2014. The next chapter of the GoodData Open Analytics Platform story has just begun!
