Suppose that you or your customer have a lot of data, either collected in a database somewhere or accumulating from daily activities. Or perhaps a set of IoT sensors constantly streams its measurements. You know about Map/Reduce or other modern ways of processing data. So now you need to design and create an application that uses Big Data technologies to mine the data and get real-time insights into its deeper aspects. DICE will help you get quickly from a concept to fully deployed code in the test bed, both when you start and whenever you change something.
The DICE tools aim to simplify and speed up the development of data-intensive applications. By this we mean that if you want to create an application that deals with Big Data, you should not have to master all the ins and outs of designing, testing and deploying it. One of the bigger hurdles in using Big Data technologies is the process of installing, configuring and starting up the various services before an application can even start processing any data.
One of the great things about living in the modern world is that many tools are already available for applying DevOps, treating infrastructure as code, and packaging complex installation processes into easily usable cookbooks. So in principle it should be easy to, say, quickly whip together a few nodes that run a Zookeeper service, some Storm nodes and a Kafka messaging bus. Yet in our experience, the user always has to understand a lot of seemingly arcane details about each service, experiment with the community-provided cookbooks and tune the configuration before the combination of services starts to work together.
In the DICE deployment tool, we bring together the best from various communities. First, we adopted an emerging industry standard for describing an application as a topology or a blueprint. OASIS TOSCA offers a simple, clear and efficient way of expressing the building blocks of any Cloud application: nodes, services, their relationships and whatever parameters they need.
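To make the idea of a blueprint more concrete, here is a minimal Python sketch of the building blocks named above: nodes, their relationships and their parameters. It is an illustration only, not TOSCA syntax and not part of the DICE tools. The class names, the `example.*` type names, the `connected_to` relationship label and the parameter values are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One building block of the application, e.g. a service."""
    name: str
    node_type: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed link: `source` needs `target` in order to work."""
    source: str
    target: str
    kind: str


@dataclass
class Blueprint:
    """A topology: named nodes plus the relationships between them."""
    nodes: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add_node(self, name, node_type, **properties):
        self.nodes[name] = Node(name, node_type, properties)

    def relate(self, source, target, kind="connected_to"):
        # Refuse relationships that mention nodes missing from the blueprint.
        for end in (source, target):
            if end not in self.nodes:
                raise KeyError(f"unknown node: {end}")
        self.relationships.append(Relationship(source, target, kind))

    def dependencies_of(self, name):
        """Names of the nodes that `name` relies on, in declaration order."""
        return [r.target for r in self.relationships if r.source == name]


# Hypothetical topology: Zookeeper, a Kafka bus and a Storm node.
blueprint = Blueprint()
blueprint.add_node("zookeeper", "example.Zookeeper")
blueprint.add_node("kafka", "example.Kafka", partitions=4)
blueprint.add_node("storm", "example.Storm", workers=2)
blueprint.relate("kafka", "zookeeper")
blueprint.relate("storm", "zookeeper")
blueprint.relate("storm", "kafka")

print(blueprint.dependencies_of("storm"))
```

The point of the sketch is that the description stays declarative: it says what the application consists of and how the parts relate, and leaves the installation steps to whatever consumes it.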
Next, a cloud orchestrator needs to consume this blueprint and execute the necessary steps to bring the blueprint to reality. In DICE, we use Cloudify, an open source cloud orchestration framework provided by GigaSpaces Technologies. Cloudify supports a wide range of public and private cloud platforms, provides ready-made blueprint constructs for expressing the required components of an application, and is reasonably close to the TOSCA standard.
Cloudify serves as a mediator between an abstract description of an application on one side, and the tools that perform the installing and configuring on the other side. At this low level, one can use anything from simple bash and Python scripts to Chef cookbooks, Puppet manifests or any other configuration manager. Orchestration here means that Cloudify brings a notion of a whole application, so each individual node’s configuration comes with the context of the other nodes it depends on.
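The notion of configuring each node in the context of the others can be sketched in a few lines of Python. This is not how Cloudify is implemented and it does not use Cloudify's API. It only illustrates the general technique of ordering nodes by their dependencies and handing each one the details of the nodes it relies on. The node names, the `depends_on` mapping and the placeholder addresses are assumptions made up for the example, which requires Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical topology: each node maps to the nodes it depends on.
depends_on = {
    "zookeeper": [],
    "kafka": ["zookeeper"],
    "storm_nimbus": ["zookeeper"],
    "storm_worker": ["storm_nimbus", "zookeeper"],
}


def deployment_plan(depends_on):
    """Return (node, context) pairs in an order that respects dependencies.

    The context of a node holds the address of every node it depends on,
    which is what that node's configuration step would need to know.
    """
    # static_order() yields every node after all of its dependencies.
    order = list(TopologicalSorter(depends_on).static_order())
    addresses = {}
    plan = []
    for index, node in enumerate(order):
        # Dependencies come earlier in the order, so their addresses exist.
        context = {dep: addresses[dep] for dep in depends_on[node]}
        # Placeholder address; a real orchestrator would get it from the cloud.
        addresses[node] = f"10.0.0.{index + 10}"
        plan.append((node, context))
    return plan


for node, context in deployment_plan(depends_on):
    print(f"configure {node} with context {context}")
```

In this sketch the Storm worker is configured only after Zookeeper and Nimbus have addresses, and it receives both of them. A script that configures one node in isolation lacks exactly this information.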
With the DICE deployment tool, we work to remove the trial and error that users otherwise face when combining services, by providing ready-made library packages that employ Chef cookbooks tested to work together. If you use the DICE deployment tool as a stand-alone component, then defining an application to be deployed is a matter of writing down the services by type in a simple YAML file, and the tool will do the rest.
Matej Artač (XLAB)