Dremio, the Data-as-a-Service Platform company, announced a major release of its open source platform that includes a collaborative data catalog, new controls for multi-tenant deployments, end-to-end data encryption, and a breakthrough in performance and efficiency through the Gandiva Initiative for Apache Arrow. These new features support data initiatives by providing shorter lead times, lower operational costs, greater security and governance, and more self-service to a broader range of roles.
“Everything-as-a-Service has been embraced by IT for the past five years – encompassing infrastructure, platforms and applications,” said Kelly Stirman, vice president of strategy and CMO, Dremio. “Today, we are providing these same benefits to our customers for their data initiatives, by offering tools for data engineers to be more productive, and for data consumers to be more self-sufficient.”
More Value from Data, Faster
Dremio’s open source Data-as-a-Service Platform helps analysts and data scientists work together to discover, curate and collaborate for diverse analytical use cases. It builds on Apache Arrow to accelerate queries on a range of data sources, from S3, ADLS, and HDFS to NoSQL and relational databases.
“At Virgin Orbit our mission is to safely, reliably, and affordably place satellites into orbit, and gaining insight from data is a key part of our strategy,” said Andrzej Goryca, senior enterprise systems manager, Virgin Orbit. “We were thrilled to discover Dremio earlier this year, and it has significantly reduced the time to build our data analytics platform. Today our analysts and data scientists can easily work with data from our relational and NoSQL stores using their favorite tools, with a lot less intervention from our data engineering teams. We get to answers faster with Dremio.”
Dremio 3.0 includes advanced features for demanding enterprise workloads, including extensive security controls, data lineage, advanced data acceleration features, a new elastic deployment model, and connectors for popular data sources. Dremio solves the challenge of making data fast and self-service for data consumers, eliminating the work of creating and managing unnecessary data copies and the governance risk those copies carry.
“At VideoAmp we view ourselves as both a software and data company,” said Eric Lakich, vice president tech ops at VideoAmp. “We are a convergence of Platform-as-a-Service and Data-as-a-Service with data at the core of everything we do. Dremio is a key component in our data exploration workflow, enabling us to offer a portal into our vast backend data sources. With Dremio, more people than ever before have the power to quickly explore and create. This capability brings great value to the VideoAmp products and solutions.”
Advancements in Security, Data Cataloging, Multi-Tenant Deployments
With these new features, Dremio helps companies ensure governed and secure access to data from any source, at the speed of thought, through a self-service experience.
- Built-in data catalog. Dremio’s data catalog provides a powerful and intuitive way for data consumers to discover, organize, describe, and self-serve data from virtually any data source in a governed and secure model. Data stewards can describe and tag datasets. Data consumers can utilize the Google-like search interface to find the data they need and then immediately start curating, blending or analyzing it.
- Advanced security controls. Dremio now integrates natively with Apache Ranger for centralized access control, building on the system’s powerful row- and column-level access controls that work with any data source and across multiple data sources. In addition, Dremio now supports end-to-end TLS encryption. For AWS deployments, Dremio now supports EC2 instance profiles for secure access to S3.
- Multi-tenant workload controls. The new multi-tenant features allow data engineering teams to manage and optimize cluster resources across a variety of workloads and users. Workload management policies can be used to precisely control resource allocation based on user, group membership, time of day, data source, query type, and many other runtime factors. Policies are expressed using standard SQL, allowing teams to leverage complex conditions with the power and familiarity of SQL.
- Elastic deployments using Kubernetes. Dremio now provides an official Docker image and templates for elastic, highly available deployments using the popular Kubernetes orchestration framework. Companies can simplify the management of their deployments, both on-premises and on popular cloud services like Amazon EKS and Azure AKS, using Dremio’s Helm charts to provision and elastically scale clusters of 1,000 or more nodes.
- A breakthrough in performance and efficiency. Also part of this release is the availability of the Gandiva Initiative for Apache Arrow. This new execution kernel provides up to 100x greater efficiency on many types of queries and operations. This improved efficiency translates into lower operational costs, better user experience, and the ability to support more workloads with existing hardware.
- New engine for relational push-downs. A new declarative engine for relational database sources increases the sophistication of push-downs of SQL expressions, resulting in more efficient processing on popular systems such as Postgres, SQL Server, Oracle, and Teradata.
- New data sources. Dremio continues to add support for more popular data sources deployed in customer data centers and cloud services. With 3.0, Dremio now supports Azure Data Lake Store, Elasticsearch 6, AWS S3 GovCloud, and Teradata.
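
To make the multi-tenant workload controls described above more concrete, the following is a minimal sketch of rule-based routing on runtime factors such as group membership, time of day, and query type. It is an illustration only, not Dremio's implementation or syntax: the announcement says Dremio policies are expressed in standard SQL, while this sketch uses Python, and every name in it (`Query`, `Rule`, `route`, the queue names, the group names) is hypothetical.

```python
from dataclasses import dataclass
from datetime import time
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Query:
    """Hypothetical runtime facts about a submitted query."""
    user: str
    groups: FrozenSet[str]
    query_type: str          # e.g. "ui_preview", "odbc" (illustrative values)
    submitted_at: time


@dataclass(frozen=True)
class Rule:
    """A named condition paired with the queue it routes to."""
    name: str
    matches: Callable[[Query], bool]
    queue: str


def route(query: Query, rules: List[Rule], default_queue: str = "default") -> str:
    """Return the queue of the first matching rule, else the default queue."""
    for rule in rules:
        if rule.matches(query):
            return rule.queue
    return default_queue


def is_overnight(t: time) -> bool:
    """True between 22:00 and 06:00, a window that wraps past midnight."""
    return t >= time(22, 0) or t < time(6, 0)


# Rules are evaluated in order, so more specific rules come first.
RULES = [
    Rule("etl-overnight",
         lambda q: "etl" in q.groups and is_overnight(q.submitted_at),
         "high_memory"),
    Rule("interactive-preview",
         lambda q: q.query_type == "ui_preview",
         "low_latency"),
]

if __name__ == "__main__":
    q = Query("ana", frozenset({"analysts"}), "ui_preview", time(10, 30))
    print(route(q, RULES))
```

First-match ordering keeps the outcome easy to reason about when several conditions could apply to the same query; a real system would also need to decide how each queue caps memory and concurrency, which this sketch leaves out.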