Cost Opt & FinOps Archives - Unravel
https://www.unraveldata.com/resources/cost-optimization/

Data Observability: The Missing Link for Data Teams (Thu, 15 Sep 2022)
https://www.unraveldata.com/resources/dataops-observability-the-missing-link-for-data-teams/

As organizations invest ever more heavily in modernizing their data stacks, data teams—the people who actually deliver the value of data to the business—are finding it increasingly difficult to manage the performance, cost, and quality of these complex systems.

Data teams today find themselves in much the same boat as software teams were 10+ years ago. Software teams have dug themselves out of that hole with DevOps best practices and tools—chief among them full-stack observability.

But observability for data applications is a completely different animal from observability for web applications. Effective DataOps demands observability that’s designed specifically for the different challenges facing different data team members throughout the DataOps lifecycle.

That’s data observability for DataOps.

In this white paper, you’ll learn:

  • What is data observability for DataOps?
  • What data teams can learn from DevOps
  • Why observability designed for web apps is insufficient for data apps
  • What data teams need from data observability for DataOps
  • The 5 capabilities of data observability for DataOps

Download the white paper here.

A Better Approach to Controlling Modern Data Cloud Costs (Fri, 14 Jan 2022)
https://www.unraveldata.com/resources/a-better-approach-to-controlling-modern-data-cloud-costs/

As anyone running modern data applications in the cloud knows, costs can mushroom out of control very quickly and easily. Getting these costs under control is really all about not spending more than you have to. Unfortunately, the common approach to managing these expenses—which looks at things only at an aggregated infrastructure level—helps control only about 5% of your cloud spend. You’re blind to the remaining 95% of cost-saving opportunities because there’s a huge gap in your ability to understand exactly what all your applications,  pipelines, and users are doing, how much they’re costing you (and why), and whether those costs can be brought down. 

Controlling cloud costs has become a business imperative for any enterprise running modern data stack applications. Industry analysts estimate that at least 30% of cloud spend is “wasted” each year—some $17.6 billion. For modern data pipelines in the cloud, the percentage of waste is much higher—closer to 50%. 

Because cloud providers have made it so much easier to spin up new instances, it’s also much easier for costs to spiral out of control. The sheer size of modern data workloads amplifies the problem exponentially. We recently saw a case where a single sub-optimized data job that ran over the weekend wound up costing the company an unnecessary $1 million. But the good news is that there are a lot of opportunities for you to save money.

Every data team and IT budget holder recognizes at an abstract, theoretical level what they need to do to control data cloud costs:

  • Shut off unused always-on resources (idle clusters)
  • Leverage spot instance discounts
  • Optimize the configuration of auto-scaling data jobs

Understanding what to do in principle is easy. Knowing where, when, and how to do it in practice is far more complicated. And this is where the common cost-management approach for modern data clouds falls short. 

What’s needed is a new approach to controlling modern data cloud costs—a “workload aware” approach that harnesses precise, granular information at the application level to develop deep intelligence about what workloads you are running, who’s running them, which days and time of day they run, and, most important, what resources are actually required for each particular workload. 

Pros and Cons of Existing Cloud Cost Management

Allocating Costs—Okay, Not Great

The first step to controlling costs is, of course, understanding where the money is going. Most approaches to cloud cost management today focus primarily on an aggregated view of the total infrastructure expense: monthly spend on compute, storage, platform, etc. While this is certainly useful from a high-level, bird’s-eye view for month-over-month budget management, it doesn’t really identify where the cost-saving opportunities lie. Tagging resources by department, function, application, user etc., is a good first step in understanding where the money is going, but tagging alone doesn’t help you understand if the cost is good or bad—and whether it could be brought down (and how). 

[Screenshot: high-level cloud cost summary dashboard]

Here’s a cloud budget management dashboard that does a good job of summarizing at a high level where cloud spend is going. But where could costs be brought down?

Native cloud vendor tools (and even some third-party solutions) can break down individual instance costs—how much you’re spending on each machine—but don’t tell you which jobs or workflows are running there. It’s just another view of the same unanswered question: Are you overspending?

Eliminating Idle Resources—Good

When it comes to shutting down idle clusters, all the cloud platform providers (AWS, Azure, Google Cloud Platform, Databricks), as well as a host of so-called cloud cost management tools from third-party vendors, do a really good job of identifying resources you’re no longer using and terminating them automatically.

Reducing Costs—Poor

Where many enterprises’ approach leaves a lot to be desired is resource and application optimization. This is where you can save the big bucks—specifically, by reducing costs with spot instances and auto-scaling, for example—but it requires a new approach.

A Smarter “Workload-Aware” Approach

An enterprise with 100,000+ modern data jobs has literally a million decisions—at the application, pipeline, and cluster level—to make about where, when, and how to run those jobs. And each individual decision carries a price tag. 

Preventing Overspending

The #1 culprit in data cloud overspending is overprovisioning/underutilizing instances. 

Every enterprise has thousands and thousands of jobs running on more expensive instances than necessary. Either more resources are requested than actually needed, or the instances are larger than needed for the job at hand. Frequently the resources allocated to run jobs in the cloud are dramatically different from what is actually required, and those requirements can vary greatly depending on usage patterns or seasonality. Without visibility into the actual resource requirements of each job over time, choosing which size of machine to allocate in the cloud is just a guessing game.

That’s why a more effective approach to controlling cloud costs must start at the job level. Whichever data team members are requesting instances must be empowered with accurate, easy-to-understand metrics about actual usage requirements—CPU, memory, I/O, duration, containers, etc.—so that they can specify right-sized configurations based on actual utilization requirements rather than having to “guesstimate” based on perceived capacity need.

For example, suppose you’re running a Spark job and estimate that you’ll need six containers with 32 GB of memory. That’s your best guess, and you certainly don’t want to get “caught short” and have the job fail. But if you had application-level information at your fingertips showing that in reality you need only three containers that use only 5.2 GB of memory (max), you could request a right-sized instance configuration and avoid unnecessarily overspending. Instead of paying, say, $1.44 for a 32GB machine (x6), you pay only $0.30 for a 16GB machine (x3). Multiply the savings across several hundred users running hundreds of thousands of jobs every month, and you’re talking big bucks.
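
To make this concrete, here is a minimal sketch of what requesting a right-sized configuration might look like for that Spark job. The executor counts, memory settings, and hourly prices come from the example above or are assumed for illustration; they are not a universal recommendation.

    from pyspark.sql import SparkSession

    # Observed (hypothetical) job-level metrics: peak of 3 containers, ~5.2 GB each.
    # Guesstimated request: 6 executors on 32 GB nodes at ~$1.44/hr each.
    # Right-sized request: 3 executors on 16 GB nodes at ~$0.30/hr each.

    spark = (
        SparkSession.builder.appName("right-sized-etl")
        .config("spark.executor.instances", "3")   # was 6 in the guesstimate
        .config("spark.executor.memory", "6g")     # observed peak ~5.2 GB, plus headroom
        .config("spark.executor.cores", "2")
        .getOrCreate()
    )

    # Back-of-the-envelope cost for a one-hour run (hypothetical prices).
    guesstimated = 6 * 1.44   # $8.64
    right_sized = 3 * 0.30    # $0.90
    print(f"Estimated savings per run: ${guesstimated - right_sized:.2f}")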

[Screenshot: App-level insight into actual usage]

Application-level intelligence about actual utilization allows you to specify with precision the number and size of instances really needed for the job at hand.

Further, a workload-aware approach with application-level intelligence can also tell you how long the job ran.

[Screenshot: Job-level insight into actual duration]

Knowing the duration of each job empowers you to shut down clusters based on execution state rather than waiting for an idle state.

Here, you can see that this particular job took about 6½ minutes to run. With this intelligence, you know exactly how long you need this instance. You can tell when you can utilize the instance for another job or, even better, shut it down altogether.

Leveraging Spot Instances 

This really helps with determining when to use spot instances. Everybody wants to take advantage of spot instance pricing discounts. While spot instances can save you big bucks—up to 90% cheaper than on-demand pricing—they come with the caveat that the cloud provider can “pull the plug” and terminate the instance with as little as a 30-second warning. So, not every job is a good candidate to run on spot. A simple SQL query that takes 10-15 seconds is great. A 3-hour job that’s part of a larger pipeline workflow, not so much. Nobody wants their job to terminate after 2½ hours, unfinished. The two extremes are pretty easy yes/no decisions, but that leaves a lot of middle ground. Actual utilization information at the job level is needed to know what you can run on spot (and, again, how many and what size instances) without compromising reliability. One of the top questions cloud providers hear from their customers is how to know what jobs can be run safely on spot instances. App-level intelligence that’s workload-aware is the only way.
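
The yes/no logic described above can be captured in a simple heuristic. The sketch below is purely illustrative: the thresholds and job attributes are assumptions you would tune with your own job-level metrics, not a vendor rule.

    def is_good_spot_candidate(expected_runtime_min: float,
                               checkpoints_progress: bool,
                               sla_slack_min: float) -> bool:
        """Rough heuristic: can this job tolerate a spot interruption?"""
        # Short jobs (e.g., a 10-15 second SQL query) can simply be rerun.
        if expected_runtime_min <= 15:
            return True
        # Longer jobs are reasonable only if they checkpoint progress and a
        # retry still fits inside the pipeline's SLA slack.
        return checkpoints_progress and (expected_runtime_min <= sla_slack_min)

    print(is_good_spot_candidate(0.25, False, 0))    # short query -> True
    print(is_good_spot_candidate(180, False, 60))    # 3-hour pipeline stage -> False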

Auto-Scaling Effectively

This granular app-level information can also be leveraged at the cluster level to know with precision how, when, and where to take advantage of auto-scaling. One of the main benefits of the cloud is the ability to auto-scale workloads, spinning up (and down) resources as needed. But this scalability isn’t free. You don’t want to spend more than you absolutely have to. A heatmap like the one below shows how many jobs are running at each time of the day on each day of the week (or month or whatever cadence is appropriate), with drill-down capabilities into how much memory and CPU are needed at any given point in time. Instead of guesstimating configurations based on perceived capacity needs, you can precisely specify resources based on actual usage requirements. Once you understand what you really need, you can tell what you don’t need.

[Screenshot: Cluster utilization heatmap]

To take advantage of cost savings with auto-scaling, you have to know exactly when you need max capacity and for how long—at the application level.

Conclusion

Most cloud cost management tools on the market today do a pretty good job of aggregating cloud expenses at a high level, tracking current spend vs. budget, and presenting cloud provider pricing options (i.e., reserved vs. on-demand vs. spot instances). Some can even go a step further and show you instance-level costs, what you’re spending machine by machine. While this approach has some (limited) value for accounting purposes, it doesn’t help you actually control costs.

When talking about “controlling costs,” most budget holders are referring to the bottom-line objective: reducing costs, or at least not spending more than you have to. But for that, you need more than just aggregated information about infrastructure spend; you need application-level intelligence about all the individual jobs that are running, who’s running them (and when), and what resources each job is actually consuming compared to what resources have been configured.

As much as 50% of data cloud instances are overprovisioned. When you have 100,000s of jobs being run by thousands of users—most of whom “guesstimate” auto-scaling configurations based on perceived capacity needs rather than on actual usage requirements—it’s easy to see how budgets blow up. Trying to control costs by looking at the overall infrastructure spend is impossible. Only by knowing with precision at a granular job level what you do need can you understand what you don’t need and trim away the fat. 

To truly control data cloud costs, not just see how much you spent month over month, you need to tackle things with a bottom-up, application-level approach rather than the more common top-down, infrastructure-level method.  

 

See what a “workload-aware” approach to controlling modern data pipelines in the cloud looks like. Create a free account.

Mastering Cost Management: From Reactive Spending to Proactive Optimization (Wed, 05 Feb 2025)
https://www.unraveldata.com/resources/mastering-cost-management/

According to Forrester, accurately forecasting cloud costs remains a significant challenge for 80% of data management professionals. This struggle often stems from a lack of granular visibility, control over usage, and ability to optimize code and infrastructure for cost and performance. Organizations utilizing modern data platforms like Snowflake, BigQuery, and Databricks often face unexpected budget overruns, missed performance SLAs, and inefficient resource allocation.

Transitioning from reactive spending to proactive optimization is crucial for effective cost management in modern data stack environments.

This shift requires a comprehensive approach that encompasses several key strategies:

1. Granular Visibility
Gain comprehensive insights into expenses by unifying fragmented data and breaking down silos, enabling precise financial planning and resource allocation for effective cost control. This unified approach allows teams to identify hidden cost drivers and inefficiencies across the entire data ecosystem.

By consolidating data from various sources, organizations can create a holistic view of their spending patterns, facilitating more accurate budget forecasting and informed decision-making. Additionally, this level of visibility empowers teams to pinpoint opportunities for optimization, such as underutilized resources or redundant processes, leading to significant cost savings over time.

2. ETL Pipeline Optimization
Design cost-effective pipelines from the outset, implementing resource utilization best practices and ongoing performance monitoring to identify and address inefficiencies. This approach involves carefully architecting ETL processes to minimize resource usage while maintaining optimal performance.

By employing advanced performance tuning techniques, such as optimizing query execution plans and leveraging built-in optimizations, organizations can significantly reduce processing time and associated costs. Continuous monitoring of pipeline performance allows for the early detection of bottlenecks or resource-intensive operations, enabling timely adjustments and ensuring sustained efficiency over time.

3. Intelligent Resource Management
Implement intelligent autoscaling to dynamically adjust resources based on workload demands, optimizing costs in real-time while maintaining performance. Efficiently manage data lake and compute resources to minimize unnecessary expenses during scaling. This approach allows organizations to automatically provision and de-provision resources as needed, ensuring optimal utilization and cost-efficiency.

By setting appropriate scaling policies and thresholds, you can avoid over-provisioning during periods of low demand and ensure sufficient capacity during peak usage times. Additionally, separating storage and compute resources enables more granular control over costs, allowing you to scale each component independently based on specific requirements.

4. FinOps Culture
Foster collaboration between data and finance teams, implementing cost allocation strategies like tagging and chargeback mechanisms to attribute expenses to specific projects or teams accurately. This approach creates a shared responsibility for cloud costs and promotes organizational transparency.

By establishing clear communication channels and regular meetings between technical and financial stakeholders, teams can align their efforts to optimize resource utilization and spending. A robust tagging system also allows for detailed cost breakdowns, enabling more informed decision-making and budget allocation based on actual usage patterns.

5. Advanced Forecasting
Develop sophisticated forecasting techniques and flexible budgeting strategies using historical data and AI-driven analytics to accurately predict future costs and create adaptive budgets that accommodate changing business needs. Organizations can identify trends and seasonal variations that impact costs by analyzing past usage patterns and performance metrics.

This data-driven approach enables more precise resource allocation and helps teams anticipate potential cost spikes, allowing for proactive adjustments to prevent budget overruns. Additionally, implementing AI-powered forecasting models can provide real-time insights and recommendations, enabling continuous optimization of environments as workloads and business requirements evolve.
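
As a minimal illustration of the idea, the sketch below fits a simple linear trend to hypothetical historical monthly spend and projects the next quarter with a rough planning band. Real AI-driven forecasting would also model seasonality, workload mix, and planned platform changes; the figures here are made up.

    import numpy as np

    # Hypothetical monthly data platform spend (USD) for the past eight months.
    monthly_spend = np.array([42_000, 44_500, 47_000, 46_200, 49_800, 52_300, 55_100, 57_400])

    # Fit a straight-line trend: spend ~ slope * month_index + intercept.
    months = np.arange(len(monthly_spend))
    slope, intercept = np.polyfit(months, monthly_spend, 1)

    # Project the next three months and attach a simple +/-10% planning band.
    for m in range(len(monthly_spend), len(monthly_spend) + 3):
        forecast = slope * m + intercept
        print(f"Month {m + 1}: ~${forecast:,.0f} "
              f"(planning range ${forecast * 0.9:,.0f} to ${forecast * 1.1:,.0f})")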

Mastering these strategies can help you transform your approach to cost management from reactive to proactive, ensuring you maximize the value of your cloud investments while maintaining financial control.

To learn more about implementing these cost management strategies in your modern data environment, join our upcoming webinar series, “Controlling Cloud Costs.” This ten-part series will explore each aspect of effective cost management, providing actionable insights and best practices to gain control over your data platform costs.

Register for Controlling Databricks Cloud Cost webinars.

Register for Controlling Snowflake Cloud Cost webinars.

Building a FinOps Ethos (Mon, 09 Dec 2024)
https://www.unraveldata.com/resources/building-a-finops-ethos/

3 Key Steps to Build a FinOps Ethos in Your Data Engineering Team

In today’s data-driven enterprises, the intersection of fiscal responsibility and technical innovation has never been more critical. As data processing costs continue to scale with business growth, building a FinOps culture within your Data Engineering team isn’t just about cost control, it’s about creating a mindset that views cost optimization as an integral part of technical excellence.

Unravel’s ‘actionability for everyone’ approach has enabled executive customers to adopt three transformative steps to embed FinOps principles into their Data Engineering team’s DNA, ensuring that cost awareness becomes as natural as code quality or data accuracy. In this article, we walk you through how executives can get the cost rating of their workloads from Uncontrolled to Optimized with clear, guided actionable pathways.

[Screenshot: Unravel Data dashboard]

Step 1. Democratize cost visibility

The foundation of any successful FinOps implementation begins with transparency. However, raw cost data alone isn’t enough; it needs to be contextualized and actionable.

Breaking Down the Cost Silos

  • Unravel provides real-time cost attribution dashboards to map cloud spending to specific business units, teams, projects, and data pipelines.

  • The custom views allow different stakeholders, from engineers to executives, to discover the top actions to control cost spend.
  • The ability to track key metrics like cost/savings per job/query, time-to-value, and wasted costs due to idleness, wait times, etc., transforms cost data from a passive reporting tool to an active management instrument.

Provide tools for cost decision making

Modern data teams need to understand how their architectural and implementation choices affect the bottom line. Unravel provides visualizations and insights to guide these teams to implement:

  • Pre-deployment cost impact assessments for new data pipelines (for example, what is the cost impact of migrating this workload from All-purpose compute to Job compute?).
  • What-if analysis tools for infrastructure changes (for example, will changing from the current instance types to recommended instance types affect performance if I save on cost?).
  • Historical trend analysis to identify cost patterns, budget overrun costs, cost wasted due to optimizations neglected by their teams, etc.

Step 2. Embed cost intelligence into the development and operations lifecycles

The next evolution in FinOps maturity comes from making cost optimization an integral part of the development process, not a post-deployment consideration. Executives should consider leveraging specialized AI agents across their technology stack that help boost productivity and free up their teams’ time to focus on innovation. Unravel provides a suite of AI-agent-driven features that foster a cost ethos in the organization and maximize operational excellence with auto-fix capabilities.

Automated Optimization Lifecycle
Unravel helps you establish a systematic approach to cost optimization with automated AI-agentic workflows to help your teams operationalize recommendations while getting a huge productivity boost. Below are some ways to drive automation with Unravel agents:

  • Implement automated fix suggestions for most code and configuration inefficiencies
  •  Assign auto-fix actions to AI agents for effortless adoption or route them to a human reviewer for approval
  • Configure automated rollback capabilities for changes if unintended performance or cost degradation is detected

Push Down Cost Consciousness To Developers 

  • Automated code reviews that flag potential cost inefficiencies
  • Smart cost savings recommendations based on historical usage patterns
  • Allow developers to see the impact of their change on future runs

Define Measurable Success Metrics
Executives can track the effectiveness of FinOps awareness and culture using Unravel through:

  • Cost efficiency improvements over time (WoW, MoM, YTD)
  • Team engagement and rate of adoption with Accountability dashboards
  • Time-to-resolution for code and configuration changes

Step 3. Create a self-sustaining FinOps culture

The final and most crucial step is transforming FinOps from an initiative into a cultural cornerstone of your data engineering practice.

Operationalize with AI agents

FinOps AI Agent

Implement timely alerting systems that help drive value-oriented decisions for cost optimization and governance. Unravel provides email, Slack, and Teams integrations to ensure all necessary stakeholders get timely notifications and insights into opportunities and risks.

DataOps AI Agent

  • Pipeline optimization suggestions to improve resource utilization and mitigate SLA risks
  • Job signature level cost and savings impact analysis to help with prioritization of recommendations
  • Intelligent workload migration recommendations

Data Engineering AI Agent

  • Storage tier optimization recommendations to avoid wasted costs due to cold tables
  • Partition strategy optimization for cost-effective querying
  • Avoid recurring failures and bottlenecks due to inefficiencies not acted upon for several weeks

Continuous Evolution

Finally, it is extremely important to foster and track the momentum of FinOps growth by:

  • Regularly performing FinOps retrospectives with wider teams
  • Revisiting which business units and cost centers are contributing to wasted costs, neglected costs due to unadopted recommendations, and budget overruns despite timely alerting

The path forward

Building a FinOps ethos in your Data Engineering team is a journey that requires commitment, tools, and cultural change. By following the above three key steps – democratizing cost visibility, embedding cost intelligence, and creating a self-sustaining culture – you can transform how your team thinks about and manages cloud costs.

The most successful implementations don’t treat FinOps as a separate discipline but rather as an integral part of technical excellence. When cost optimization becomes as natural as writing tests or documenting code, you have achieved true FinOps maturity. Unravel provides a comprehensive set of features and tools to aid your teams in accelerating FinOps best practices throughout the organization.

Remember, the goal isn’t just to reduce costs – it is to maximize the value derived from every dollar spent on your infrastructure. This mindset shift, combined with the right tools and processes, will position your data engineering team for sustainable growth and success in an increasingly cost-conscious technology landscape.

To learn more on how Unravel can help, contact us or request a demo.

Data Actionability™ FinOps Webinar (Wed, 20 Nov 2024)
https://www.unraveldata.com/resources/data-actionability-finops-webinar/

Data Actionability™: Cost Governance with Unravel’s New FinOps AI Agent

Data Actionability™ Webinar (Wed, 20 Nov 2024)
https://www.unraveldata.com/resources/data-actionability-webinar/

Data Actionability™: Empower Your Team with Unravel’s New AI Agents

BigQuery Cost Management (Wed, 06 Nov 2024)
https://www.unraveldata.com/resources/bigquery-cost-management/

Mastering BigQuery Cost Management and FinOps: A Comprehensive Checklist

Effective cost management becomes crucial as organizations increasingly rely on Google BigQuery for their data warehousing and analytics needs. This checklist delves into the intricacies of cost management and FinOps for BigQuery, exploring strategies to inform, govern, and optimize usage while taking a holistic approach that considers queries, datasets, infrastructure, and more.

While this checklist is comprehensive and very impactful when implemented fully, it can also be overwhelming to implement with limited staffing and resources. AI-driven insights and automation can solve this problem and are also explored at the bottom of this guide.

Understanding Cost Management for BigQuery

BigQuery’s pricing model is primarily based on data storage and query processing. While this model offers flexibility, it also requires careful management to ensure costs align with business value. Effective cost management for BigQuery is about more than reducing expenses—it’s also about optimizing spend, ensuring efficient resource utilization, and aligning costs with business outcomes. This comprehensive approach falls under the umbrella of FinOps (Financial Operations).

The Holistic Approach: Key Areas to Consider


1. Query Optimization

Are queries optimized? Efficient queries are fundamental to cost-effective BigQuery usage:

  • Query Structure: Write efficient SQL queries that minimize data scanned (see the sketch below).
  • Partitioning and Clustering: Implement appropriate table partitioning and clustering strategies to reduce query costs.
  • Materialized Views: Use materialized views for frequently accessed or complex query results.
  • Query Caching: Leverage BigQuery’s query cache to avoid redundant processing.
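
As a rough sketch of the first two practices, the example below uses the google-cloud-bigquery Python client to select only the needed columns, prune partitions by filtering on the (assumed) partitioning column, and cap the bytes a query may scan. The project, table, and column names are hypothetical.

    from google.cloud import bigquery

    client = bigquery.Client()  # uses your default project and credentials

    # Scan only required columns and prune partitions by filtering on the
    # partitioning column (assumed here to be event_date on a date-partitioned table).
    sql = """
        SELECT user_id, event_name, event_date
        FROM `my_project.analytics.events`
        WHERE event_date BETWEEN '2024-01-01' AND '2024-01-07'
    """

    # Guardrail: fail fast instead of scanning more than ~10 GB.
    job_config = bigquery.QueryJobConfig(maximum_bytes_billed=10 * 1024**3)

    rows = client.query(sql, job_config=job_config).result()
    print(f"Returned {rows.total_rows} rows")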

2. Dataset Management

Are datasets managed correctly? Proper dataset management is crucial for controlling costs:

  • Data Lifecycle Management: Implement policies for data retention and expiration to manage storage costs.
  • Table Expiration: Set up automatic table expiration for temporary or test datasets.
  • Data Compression: Use appropriate compression methods to reduce storage costs.
  • Data Skew: Address data skew issues to prevent performance bottlenecks and unnecessary resource consumption.

3. Infrastructure Optimization

Is infrastructure optimized? While BigQuery is a managed service, there are still infrastructure considerations:

  • Slot Reservations: Evaluate and optimize slot reservations for predictable workloads.
  • Flat-Rate Pricing: Consider flat-rate pricing for high-volume, consistent usage patterns.
  • Multi-Region Setup: Balance data residency requirements with cost implications of multi-region setups.

4. Access and Governance

Are the right policies and governance in place? Proper access controls and governance are essential for cost management:

  • IAM Roles: Implement least privilege access using Google Cloud IAM roles.
  • Resource Hierarchies: Utilize resource hierarchies (organizations, folders, projects) for effective cost allocation.
  • VPC Service Controls: Implement VPC Service Controls to manage data access and potential egress costs.

Implementing FinOps Practices

To master cost management for BigQuery, consider these FinOps practices:


1. Visibility and Reporting

  • Implement comprehensive labeling strategies for resources.
  • Create custom dashboards in Google Cloud Console or Data Studio for cost visualization.
  • Set up budget alerts and export detailed billing data for analysis.

2. Optimization

  • Regularly review and optimize queries based on BigQuery’s query explanation and job statistics.
  • Implement automated processes to identify and optimize high-cost queries.
  • Foster a culture of cost awareness among data analysts and engineers.

3. Governance

  • Establish clear policies for dataset creation, query execution, and resource provisioning.
  • Implement approval workflows for high-cost operations or large-scale data imports.
  • Create and enforce organizational policies to prevent costly misconfigurations.

Setting Up Guardrails

Implementing guardrails is crucial to prevent unexpected costs:

  • Query Limits: Set daily query limit quotas at the project or user level.
  • Cost Controls: Implement custom cost controls using Cloud Functions and the BigQuery API (see the sketch below).
  • Data Access Controls: Use column-level and row-level security to restrict access to sensitive or high-volume data.
  • Budgets and Alerts: Set up project-level budgets and alerts in Google Cloud Console.
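
One lightweight way to implement custom cost controls is a dry-run check that estimates a query’s on-demand cost before it is allowed to execute. The sketch below assumes a list price per TiB and a per-query budget; substitute your own contract pricing, thresholds, and table names.

    from google.cloud import bigquery

    ASSUMED_PRICE_PER_TIB_USD = 6.25  # assumed on-demand list price; verify for your region/contract
    MAX_COST_PER_QUERY_USD = 5.00     # example per-query budget

    def estimated_cost_usd(client: bigquery.Client, sql: str) -> float:
        """Estimate on-demand cost via a dry run (no bytes are actually processed)."""
        cfg = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
        job = client.query(sql, job_config=cfg)
        return job.total_bytes_processed / 1024**4 * ASSUMED_PRICE_PER_TIB_USD

    client = bigquery.Client()
    sql = "SELECT user_id FROM `my_project.analytics.events` WHERE event_date = '2024-01-01'"

    cost = estimated_cost_usd(client, sql)
    if cost > MAX_COST_PER_QUERY_USD:
        raise RuntimeError(f"Query blocked: estimated ${cost:.2f} exceeds ${MAX_COST_PER_QUERY_USD:.2f}")
    print(f"Estimated cost: ${cost:.2f}; proceeding.")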

The Need for Automated Observability and FinOps Solutions

Given the scale and complexity of modern data operations, automated solutions can significantly enhance cost management efforts. Automated observability and FinOps solutions can provide the following:

  • Real-time cost visibility across your entire BigQuery environment.
  • Automated recommendations for query optimization and cost reduction.
  • Anomaly detection to quickly identify unusual spending patterns.
  • Predictive analytics to forecast future costs and resource needs.

These solutions can offer insights that would be difficult or impossible to obtain manually, helping you make data-driven decisions about your BigQuery usage and costs.

BigQuery-Specific Cost Optimization Techniques

  • Avoid SELECT *: Instead, specify only the columns you need to reduce data processed.
  • Use Approximate Aggregation Functions: For large-scale aggregations where precision isn’t critical, use approximate functions like APPROX_COUNT_DISTINCT().
  • Optimize JOIN Operations: Ensure the larger table is on the left side of the JOIN to potentially reduce shuffle and processing time.
  • Leverage BigQuery ML: Use BigQuery ML for in-database machine learning to avoid data movement costs.
  • Use Scripting: Utilize BigQuery scripting to perform complex operations without multiple query executions.

Conclusion

Effective BigQuery cost management and FinOps require a holistic approach that considers all aspects of your data operations. By optimizing queries, managing datasets efficiently, leveraging appropriate pricing models, and implementing robust FinOps practices, you can ensure that your BigQuery investment delivers maximum value to your organization.

Remember, the goal isn’t just to reduce costs, but to optimize spend and align it with business objectives. With the right strategies and tools in place, you can transform cost management from a challenge into a competitive advantage, enabling your organization to make the most of BigQuery’s powerful capabilities while maintaining control over expenses.

To learn more about how Unravel can help with BigQuery cost management, request a health check report, view a self-guided product tour, or request a demo.

Databricks Cost Management (Wed, 06 Nov 2024)
https://www.unraveldata.com/resources/databricks-cost-management/

Mastering Databricks Cost Management and FinOps: A Comprehensive Checklist

In the era of big data and cloud computing, organizations increasingly turn to platforms like Databricks to handle their data processing and analytics needs. However, with great power comes great responsibility – and, in this case, the responsibility of managing costs effectively.

This checklist dives deep into cost management and FinOps for Databricks, exploring how to inform, govern, and optimize your usage while taking a holistic approach that considers code, configurations, datasets, and infrastructure.

While this checklist is comprehensive and very impactful when implemented fully, it can also be overwhelming to implement with limited staffing and resources. AI-driven insights and automation can solve this problem and are also explored at the bottom of this guide.

Understanding Databricks Cost Management

Before we delve into strategies for optimization, it’s crucial to understand that Databricks cost management isn’t just about reducing expenses. It’s about gaining visibility into where your spend is going, ensuring resources are being used efficiently, and aligning costs with business value. This comprehensive approach is often referred to as FinOps (Financial Operations).

The Holistic Approach: Key Areas to Consider

1. Code Optimization

Is code optimized? Efficient code is the foundation of cost-effective Databricks usage. Consider the following:

  • Query Optimization: Ensure your Spark SQL queries are optimized for performance. Use explain plans to understand query execution and identify bottlenecks.
  • Proper Data Partitioning: Implement effective partitioning strategies to minimize data scans and improve query performance.
  • Caching Strategies: Utilize Databricks’ caching mechanisms judiciously to reduce redundant computations (see the sketch below).
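
To make these items concrete, here is a small PySpark sketch that inspects the query plan, prunes partitions with a filter on the (assumed) partition column, projects only the required columns, and caches an intermediate result that is reused. Table and column names are hypothetical.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("cost-aware-job").getOrCreate()

    # Partition pruning + column projection: read only what the job needs.
    orders = (
        spark.table("sales.orders")                       # assumed table partitioned by order_date
        .filter(F.col("order_date") >= "2024-01-01")
        .select("order_id", "customer_id", "order_date", "amount")
    )

    # Inspect the physical plan to confirm pruning/pushdown before running at scale.
    orders.explain(mode="formatted")

    # Cache only because this intermediate result is reused by multiple aggregations.
    orders.cache()
    daily_revenue = orders.groupBy("order_date").agg(F.sum("amount").alias("revenue"))
    top_customers = orders.groupBy("customer_id").agg(F.sum("amount").alias("spend"))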

2. Configuration Management

Are configurations managed appropriately? Proper configuration can significantly impact costs:

  • Cluster Sizing: Right-size your clusters based on workload requirements. Avoid over-provisioning resources.
  • Autoscaling: Implement autoscaling to dynamically adjust cluster size based on demand (see the sketch below).
  • Instance Selection: Choose the appropriate instance types for your workloads, considering both performance and cost.
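
For illustration, below is a sketch of a cluster specification of the kind submitted to the Databricks Clusters REST API, combining right-sizing, autoscaling, automatic termination, and cost-attribution tags. The runtime version, node type, limits, and tag values are assumptions to adapt to your workspace.

    import json

    # Illustrative cluster spec (Databricks Clusters API style); values are examples only.
    cluster_spec = {
        "cluster_name": "nightly-etl",
        "spark_version": "14.3.x-scala2.12",               # pick a runtime available in your workspace
        "node_type_id": "i3.xlarge",                        # size to the workload, not "just in case"
        "autoscale": {"min_workers": 2, "max_workers": 8},  # scale with demand instead of fixed peak sizing
        "autotermination_minutes": 20,                      # shut down idle clusters automatically
        "custom_tags": {"team": "data-eng", "cost_center": "1234"},  # enable cost attribution
    }

    # This payload would be POSTed to /api/2.0/clusters/create (or managed via IaC).
    print(json.dumps(cluster_spec, indent=2))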

3. Dataset Management

Are datasets managed correctly? Efficient data management is crucial for controlling costs:

  • Data Lifecycle Management: Implement policies for data retention and archiving to avoid unnecessary storage costs.
  • Data Format Optimization: Use efficient file formats like Parquet or ORC to reduce storage and improve query performance.
  • Data Skew Handling: Address data skew issues to prevent performance bottlenecks and unnecessary resource consumption.

4. Infrastructure Optimization

Is infrastructure optimized? Optimize your underlying infrastructure for cost-efficiency:

  • Storage Tiering: Utilize appropriate storage tiers (e.g., DBFS, S3, Azure Blob Storage) based on data access patterns.
  • Spot Instances: Leverage spot instances for non-critical workloads to reduce costs.
  • Reserved Instances: Consider purchasing reserved instances for predictable, long-running workloads.

Implementing FinOps Practices

To truly master Databricks cost management, implement these FinOps practices:

1. Visibility and Reporting

  • Implement comprehensive cost allocation and tagging strategies.
  • Create dashboards to visualize spend across different dimensions (teams, projects, environments).
  • Set up alerts for unusual spending patterns or budget overruns.

2. Optimization

  • Regularly review and optimize resource usage based on actual consumption patterns.
  • Implement automated policies for shutting down idle clusters.
  • Encourage a culture of cost awareness among data engineers and analysts.

3. Governance

  • Establish clear policies and guidelines for resource provisioning and usage.
  • Implement role-based access control (RBAC) to ensure appropriate resource access.
  • Create approval workflows for high-cost operations or resource requests.

Setting Up Guardrails

Guardrails are essential for preventing cost overruns and ensuring responsible usage:

  • Budget Thresholds: Set up budget alerts at various thresholds (e.g., 50%, 75%, 90% of budget).
  • Usage Quotas: Implement quotas for compute hours, storage, or other resources at the user or team level.
  • Automated Policies: Use Databricks’ Policy Engine to enforce cost-saving measures automatically (see the sketch below).
  • Cost Centers: Implement chargeback or showback models to make teams accountable for their spend.
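
Cluster policies are one way to encode such guardrails. The sketch below follows the documented policy-JSON style, capping auto-termination and autoscaling limits, restricting node types, and pinning a cost-center tag; the specific limits, node types, and tag value are examples only.

    import json

    # Illustrative Databricks cluster policy definition (policy JSON).
    policy = {
        "autotermination_minutes": {"type": "range", "maxValue": 60, "defaultValue": 20},
        "autoscale.max_workers": {"type": "range", "maxValue": 10},
        "node_type_id": {"type": "allowlist", "values": ["i3.xlarge", "i3.2xlarge"]},
        "custom_tags.cost_center": {"type": "fixed", "value": "1234"},
    }

    print(json.dumps(policy, indent=2))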

The Need for Automated Observability and FinOps Solutions

While manual oversight is important, the scale and complexity of modern data operations often necessitate automated solutions. Tools like Unravel can provide:

  • Real-time cost visibility across your entire Databricks environment.
  • Automated recommendations for cost optimization.
  • Anomaly detection to identify unusual spending patterns quickly.
  • Predictive analytics to forecast future costs and resource needs.

These solutions can significantly enhance your ability to manage costs effectively, providing insights that would be difficult or impossible to obtain manually.

Conclusion

Effective cost management and FinOps for Databricks require a holistic approach considering all aspects of your data operations. By optimizing code, configurations, datasets, and infrastructure, and implementing robust FinOps practices, you can ensure that your Databricks investment delivers maximum value to your organization. Remember, the goal isn’t just to reduce costs, but to optimize spend and align it with business objectives. With the right strategies and tools in place, you can turn cost management from a challenge into a competitive advantage.

To learn more about how Unravel can help with Databricks cost management, request a health check report, view a self-guided product tour, or request a demo.

Snowflake Cost Management (Wed, 06 Nov 2024)
https://www.unraveldata.com/resources/snowflake-cost-management/

Mastering Snowflake Cost Management and FinOps: A Comprehensive Checklist

Effective cost management becomes paramount as organizations leverage Snowflake’s powerful cloud data platform for their analytics and data warehousing needs. This comprehensive checklist explores the intricacies of cost management and FinOps for Snowflake, delving into strategies to inform, govern, and optimize usage while taking a holistic approach that considers queries, storage, compute resources, and more.

While this checklist is comprehensive and very impactful when implemented fully, it can also be overwhelming to implement with limited staffing and resources. AI-driven insights and automation can solve this problem and are also explored at the bottom of this guide.

Understanding Cost Management for Snowflake

Snowflake’s unique architecture separates compute and storage, offering a flexible pay-as-you-go model. While this provides scalability and performance benefits, it also requires careful management to ensure costs align with business value.

Effective Snowflake cost management is about more than reducing expenses—it’s also about optimizing spend, ensuring efficient resource utilization, and aligning costs with business outcomes. This comprehensive approach falls under the umbrella of FinOps (Financial Operations).

The Holistic Approach: Key Areas to Consider

1. Compute Optimization

Are compute resources allocated efficiently?

  • Virtual Warehouse Sizing: Right-size your virtual warehouses based on workload requirements.
  • Auto-suspend and Auto-resume: Leverage Snowflake’s auto-suspend and auto-resume features to minimize idle time (see the sketch below).
  • Query Optimization: Write efficient SQL queries to reduce compute time and costs.
  • Materialized Views: Use materialized views for frequently accessed or complex query results.
  • Result Caching: Utilize Snowflake’s result caching to avoid redundant computations.
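
A minimal sketch of the first two levers, issued through the Snowflake Python connector; the connection parameters, warehouse name, and size are placeholders to replace with your own.

    import snowflake.connector

    # Connection parameters are placeholders.
    conn = snowflake.connector.connect(account="my_account", user="my_user", password="...")
    cur = conn.cursor()

    # Right-size the warehouse and let it suspend itself after 60 idle seconds.
    cur.execute("""
        ALTER WAREHOUSE analytics_wh SET
            WAREHOUSE_SIZE = 'SMALL'
            AUTO_SUSPEND = 60
            AUTO_RESUME = TRUE
    """)

    conn.close()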

2. Resource Monitoring and Governance

Are the right policies and governance in place? Proper monitoring and governance are essential for cost management:

  • Resource Monitors: Set up resource monitors to track and limit credit usage.
  • Account Usage and Information Schema Views: Utilize these views to gain insights into usage patterns and costs.
  • Role-Based Access Control (RBAC): Implement RBAC to ensure appropriate resource access and usage.

3. Storage Management

Is storage managed efficiently? While storage is typically a smaller portion of Snowflake costs, it’s still important to manage efficiently:

  • Data Lifecycle Management: Implement policies for data retention and archiving.
  • Time Travel and Fail-safe: Optimize usage of Time Travel and Fail-safe features based on your data recovery needs.
  • Zero-copy Cloning: Leverage zero-copy cloning for testing and development to avoid duplicating storage costs.
  • Data Compression: Use appropriate compression methods to reduce storage requirements.

4. Data Sharing and Marketplace

Are data sharing and marketplace usage optimized?

  • Secure Data Sharing: Leverage Snowflake’s secure data sharing to reduce data movement and associated costs.
  • Marketplace Considerations: Carefully evaluate the costs and benefits of data sets or applications from Snowflake Marketplace.

Implementing FinOps Practices

To master Snowflake cost management, consider these FinOps practices:

1. Visibility and Reporting

  • Implement comprehensive tagging strategies for resources.
  • Create custom dashboards using Snowsight or third-party BI tools for cost visualization.
  • Set up alerts for unusual spending patterns or budget overruns.

2. Optimization

  • Regularly review and optimize warehouse configurations and query performance.
  • Implement automated processes to identify and optimize high-cost queries or inefficient warehouses.
  • Foster a culture of cost awareness among data analysts, engineers, and scientists.

3. Governance

  • Establish clear policies for warehouse creation, data ingestion, and resource provisioning.
  • Implement approval workflows for high-cost operations or large-scale data imports.
  • Create and enforce organizational policies to prevent costly misconfigurations.

Setting Up Guardrails

Implementing guardrails is crucial to prevent unexpected costs:

  • Resource Monitors: Set up resource monitors with actions (suspend or notify) when thresholds are reached (see the sketch below).
  • Warehouse Size Limits: Establish policies on maximum warehouse sizes for different user groups.
  • Query Timeouts: Configure appropriate query timeouts to prevent runaway queries.
  • Data Retention Policies: Implement automated data retention and archiving policies.
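
The resource-monitor and query-timeout guardrails can be expressed directly in SQL (creating resource monitors typically requires a highly privileged role such as ACCOUNTADMIN); the quota, thresholds, warehouse name, and timeout below are examples only.

    import snowflake.connector

    conn = snowflake.connector.connect(account="my_account", user="my_user", password="...")
    cur = conn.cursor()

    # Monthly credit quota with notify/suspend triggers (example thresholds).
    cur.execute("""
        CREATE OR REPLACE RESOURCE MONITOR analytics_monthly
          WITH CREDIT_QUOTA = 500
               FREQUENCY = MONTHLY
               START_TIMESTAMP = IMMEDIATELY
          TRIGGERS ON 75 PERCENT DO NOTIFY
                   ON 100 PERCENT DO SUSPEND
    """)

    # Attach the monitor to a warehouse and cap runaway queries at one hour.
    cur.execute("ALTER WAREHOUSE analytics_wh SET RESOURCE_MONITOR = analytics_monthly")
    cur.execute("ALTER WAREHOUSE analytics_wh SET STATEMENT_TIMEOUT_IN_SECONDS = 3600")

    conn.close()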

The Need for Automated Observability and FinOps Solutions

Given the complexity of modern data operations, automated solutions can significantly enhance cost management efforts. Automated observability and FinOps solutions can provide the following:

  • Real-time cost visibility across your entire Snowflake environment.
  • Automated recommendations for query optimization and warehouse right-sizing.
  • Anomaly detection to quickly identify unusual spending patterns.
  • Predictive analytics to forecast future costs and resource needs.

These solutions can offer insights that would be difficult or impossible to obtain manually, helping you make data-driven decisions about your Snowflake usage and costs.

Snowflake-Specific Cost Optimization Techniques

  • Cluster Keys: Properly define cluster keys to improve data clustering and query performance.
  • Search Optimization: Use the search optimization service for tables with frequent point lookup queries.
  • Multi-cluster Warehouses: Leverage multi-cluster warehouses for concurrency without over-provisioning.
  • Resource Classes: Utilize resource classes to manage priorities and costs for different workloads.
  • Snowpipe: Consider Snowpipe for continuous, cost-effective data ingestion.

Conclusion

Effective Snowflake cost management and FinOps require a holistic approach considering all aspects of your data operations. By optimizing compute resources, managing storage efficiently, implementing robust governance, and leveraging Snowflake-specific features, you can ensure that your Snowflake investment delivers maximum value to your organization.

Remember, the goal isn’t just to reduce costs, but to optimize spend and align it with business objectives. With the right strategies and tools in place, you can transform cost management from a challenge into a competitive advantage, enabling your organization to make the most of Snowflake’s powerful capabilities while maintaining control over expenses.

By continuously monitoring, optimizing, and governing your Snowflake usage, you can achieve a balance between performance, flexibility, and cost-efficiency, ultimately driving better business outcomes through data-driven decision-making.

To learn more about how Unravel can help optimize your Snowflake cost, request a health check report, view a self-guided product tour, or request a personalized demo.

AI Agents: Empower Data Teams With Actionability™ for Transformative Results (Thu, 15 Aug 2024)
https://www.unraveldata.com/resources/ai-agents-empower-data-teams-with-actionability-for-transformative-results/

AI Agents for Data Teams

Data is the driving force of the world’s modern economies, but data teams are struggling to meet demand to support generative AI (GenAI), including rapid data volume growth and the increasing complexity of data pipelines. More than 88% of software engineers, data scientists, and SQL analysts surveyed say they are turning to AI for more effective bug-fixing and troubleshooting. And 84% of engineers who use AI said it frees up their time to focus on high-value activities.

AI Agents represent the next wave of AI innovation and have arrived just in time to help data teams make more efficient use of their limited bandwidth to build, operate, and optimize data pipelines and GenAI applications on modern data platforms.

Data Teams Grapple with High Demand for GenAI

A surge in adoption of new technologies such as GenAI is putting tremendous pressure on data teams, leading to broken apps and burnout. In order to support new GenAI products, data teams must deliver more production data pipelines and data apps, faster. The result is that data teams have too much on their plates, the pipelines are too complex, there is not enough time, and not everyone has the deep tech skills required. No surprise that 70% of organizations have difficulty integrating data into AI models and only 48% of AI projects get deployed into production.

Understanding AI Agents

Defining AI Agents

AI agents are software-based systems that gather information, provide recommended actions, and initiate and complete tasks in collaboration with or on behalf of humans to achieve a goal. AI agents can act independently, utilizing components like perception and reasoning, provide step-by-step guidance to augment human abilities, or provide supporting information for complex human-led tasks. AI agents play a crucial role in automating tasks, simplifying data-driven decision-making, and achieving greater productivity and efficiency.

How AI Agents Work

AI agents operate by leveraging a wide range of data sources and signals, using algorithms and data processing to identify anomalies and actions, then interact with their environment and users to effectively achieve specific goals. AI agents can achieve >90% accuracy, primarily driven by the reliability, volume, and variety of input data and telemetry to which they have access.

Types of Intelligent Agents

  • Reactive and proactive agents are two primary categories of intelligent agents.
  • Some agents perform work for you, while others help complete tasks with you or provide information to support your work.
  • Each type of intelligent agent has distinct characteristics and applications tailored to specific functions, enhancing productivity and efficiency.

AI for Data Driven Organizations

Enhancing Decision Making

AI agents empower teams by improving data-driven decision-making processes for you, with you, or by you. Examples of how AI agents act on your behalf include reducing toil and handling routine decisions based on AI insights. In various industries, AI agents optimize decision-making and provide recommendations to support your decisions. For complex tasks, AI agents provide the supporting information needed to build data pipelines, write SQL queries, and partition data.

Benefits of broader telemetry sources for AI agents

Integrating telemetry from various platforms and systems enhances AI agents’ ability to provide accurate recommendations. Incorporating AI agents into root cause analysis (RCA) systems offers significant benefits. Meta’s AI-based root cause analysis system shows how AI agents enhance tools and applications.

Overcoming Challenges

Enterprises running modern data stacks face common challenges like high costs, slow performance, and impaired productivity. Leveraging AI agents can automate tasks for you, with you, and by you. Unravel customers such as Equifax, Maersk, and Novartis have successfully overcome these challenges using AI.

The Value of AI Agents for Data Teams

Reducing Costs

When implementing AI agents, businesses benefit from optimized data stacks, reducing operational costs significantly. These agents continuously analyze telemetry data, adapting to new information dynamically. Unravel customers have successfully leveraged AI to achieve operational efficiency and cost savings.

Accelerating Performance

Performance is crucial in data analytics, and AI agents play a vital role in enhancing it. By utilizing these agents, enterprise organizations can make well-informed decisions promptly. Unravel customers have experienced accelerated data analytics performance through the implementation of AI technologies.

Improving Productivity

AI agents are instrumental in streamlining processes within businesses, leading to increased productivity levels. By integrating these agents into workflows, companies witness substantial productivity gains. Automation of repetitive tasks by AI agents simplifies troubleshooting and boosts overall productivity and efficiency.

Future Trends in AI Agents for FinOps, DataOps, and Data Engineering

Faster Innovation with AI Agents

By 2026 conversational AI will reduce agent labor costs by $80 billion. AI agents are advancing, providing accurate recommendations to address more issues automatically. This allows your team to focus on innovation. For example, companies like Meta use AI agents to simplify root cause analysis (RCA) for complex applications.

Accelerated Data Pipelines with AI Agents

Data processing is shifting towards real-time analytics, enabling faster revenue growth. However, this places higher demands on data teams. Equifax leverages AI to serve over 12 million daily requests in near real time.

Improved Data Analytics Efficiency with AI Agents

Data management is the fastest-growing segment of cloud spending. In the cloud, time is money; faster data processing reduces costs. One of the world’s largest logistics companies improved efficiency by up to 70% in just 6 months using Unravel’s AI recommendations.

Empower Your Team with AI Agents

Harnessing the power of AI agents can revolutionize your business operations, enhancing efficiency, decision-making, and customer experiences. Embrace this technology to stay ahead in the competitive landscape and unlock new opportunities for growth and innovation.

Learn more about our FinOps Agent, DataOps Agent, and Data Engineering Agent.

The post AI Agents: Empower Data Teams With Actionability™ for Transformative Results appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/ai-agents-empower-data-teams-with-actionability-for-transformative-results/feed/ 0
Unravel Data Security and Trust https://www.unraveldata.com/resources/unravel-data-security-and-trust/ https://www.unraveldata.com/resources/unravel-data-security-and-trust/#respond Thu, 08 Aug 2024 15:00:57 +0000 https://www.unraveldata.com/?p=16328 Abstract light image

UNRAVEL DATA SECURITY AND TRUST ENABLE DATA ACTIONABILITY™ + FINOPS WITH CONFIDENCE Privacy and security are top priorities for Unravel and our customers. At Unravel, we help organizations better understand and improve the performance, quality, and […]

The post Unravel Data Security and Trust appeared first on Unravel.

]]>
Abstract light image

UNRAVEL DATA SECURITY AND TRUST
ENABLE DATA ACTIONABILITY™ + FINOPS WITH CONFIDENCE

Privacy and security are top priorities for Unravel and our customers. At Unravel, we help organizations better understand and improve the performance, quality, and cost efficiency of their data and AI pipelines. As a data business, we appreciate the scope and implications of privacy and security threats.

This data sheet provides details to help information security (InfoSec) teams make informed decisions. Specifically, it includes:

  • An overview of our approach to security and trust
  • An architectural diagram with connectivity descriptions
  • Details about Unravel compliance and certifications
  • Common questions about Unravel privacy and security

For additional details, please reach out to our security experts.

The post Unravel Data Security and Trust appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/unravel-data-security-and-trust/feed/ 0
Unravel Data was Mentioned in the Gartner® Hype Cycle for Container Technology, 2024 https://www.unraveldata.com/resources/unravel-data-was-mentioned-in-the-gartner-hype-cycle-for-container-technology-2024/ https://www.unraveldata.com/resources/unravel-data-was-mentioned-in-the-gartner-hype-cycle-for-container-technology-2024/#respond Wed, 24 Jul 2024 15:10:54 +0000 https://www.unraveldata.com/?p=16001

Unravel Data, the first AI-enabled data actionability™ and FinOps platform built to address the speed and scale of modern data platforms, today announced it has been included as a Sample Vendor in the Gartner® Hype Cycle™ […]

The post Unravel Data was Mentioned in the Gartner® Hype Cycle for Container Technology, 2024 appeared first on Unravel.

]]>

Unravel Data, the first AI-enabled data actionability™ and FinOps platform built to address the speed and scale of modern data platforms, today announced it has been included as a Sample Vendor in the Gartner® Hype Cycle™ for Container Technology, 2024 in the Augmented FinOps category.

Unravel’s Perspective

How Augmented FinOps Helps

Augmented FinOps empowers organizations by automating and enhancing financial operations through AI and automation. This innovative approach provides real-time insights into cloud spending, identifies cost-saving opportunities, and ensures budget adherence. By leveraging domain-specific knowledge and intelligent automation, Augmented FinOps reduces manual workload, improves financial accuracy, and optimizes resource allocation. Ultimately, it drives efficiency, enabling businesses to focus on strategic growth while ensuring financial health and governance.

Introducing Unravel’s New AI Agents

Unravel recently announced three groundbreaking new AI agents: the Unravel FinOps AI Agent, the Unravel DataOps AI Agent, and the Unravel Data Engineering AI Agent. These AI agents are designed to transform how data teams manage and optimize their operations. The Unravel FinOps AI Agent helps automate financial governance, providing near real-time insights into cloud expenditures with showback and chargeback reports, identifying cost-saving opportunities, and enabling action. The Unravel DataOps AI Agent streamlines data pipeline monitoring, anomaly detection, and troubleshooting, freeing up human experts for more strategic tasks. Meanwhile, the Unravel Data Engineering AI Agent enhances productivity by automating routine tasks, allowing data engineers to focus on high-value problem-solving. Together, these AI agents empower organizations to achieve greater efficiency, accuracy, and innovation in their data operations, driving transformative business outcomes.

Three Keys to Optimizing Containers for Your Modern Data Stack

In today’s fast-paced digital landscape, optimizing containers is crucial for the efficiency and scalability of your modern data stack. Here are three key strategies to ensure your containerized environments are performing at their best:

1. Implement FinOps for Containers:

FinOps principles like monitoring, optimization, automation, and cost allocation can be applied to container workloads to enhance their efficiency, scalability, and cost-effectiveness. Advanced observability tools, integrated with AI-driven insights, can proactively identify potential problems and suggest optimizations, keeping your data stack running efficiently.

2. Right-Size Your Containers:

Properly sizing your containers is essential to prevent resource wastage and ensure optimal performance. Over-provisioning can lead to unnecessary costs, while under-provisioning can cause performance bottlenecks. Utilize AI-powered tools and automation that provide real-time monitoring and analytics to understand your workloads’ demands and adjust resources accordingly. This dynamic approach helps maintain a balance between cost and performance, ensuring your applications run smoothly without incurring excessive expenses.

3. Automate Management and Scaling:

Automation is a game-changer in container management, allowing for seamless scaling and resource allocation based on real-time demands. Employ automation tools that can handle tasks such as load balancing, resource provisioning, and fault tolerance. Kubernetes, for example, offers powerful automation capabilities that can dynamically manage container orchestration, ensuring your data stack can scale efficiently as your workloads grow. By automating these processes, you reduce the risk of human error, increase operational efficiency, and ensure that your infrastructure can adapt to changing demands without manual intervention.

Optimizing containers with a FinOps approach that right-sizes, automates, and scales workloads not only enhances performance and cost-efficiency but also ensures your modern data stack is resilient, scalable, and ready to meet the demands of today’s data-driven world.

Next Steps

Ready to optimize your data operations? Discover the transformative impact of Unravel’s new AI agents. Request a free health check to see how your organization can improve performance, efficiency, and cost management. Start your journey towards smarter, more actionable data insights today.

 

Gartner, Hype Cycle for Container Technology, By Dennis Smith, 20 June 2024
GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved.
Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

The post Unravel Data was Mentioned in the Gartner® Hype Cycle for Container Technology, 2024 appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/unravel-data-was-mentioned-in-the-gartner-hype-cycle-for-container-technology-2024/feed/ 0
Empowering Data Agility: Equifax’s Journey to Operational Excellence https://www.unraveldata.com/resources/empowering-data-agility-equifaxs-journey-to-operational-excellence/ https://www.unraveldata.com/resources/empowering-data-agility-equifaxs-journey-to-operational-excellence/#respond Tue, 25 Jun 2024 16:14:24 +0000 https://www.unraveldata.com/?p=15931 Mesh Sphere Background

In the data-driven world where real-time decision-making and innovation are not just goals but necessities, global data analytics and technology companies like Equifax must navigate a complex environment to achieve success. Equifax sets the standard for […]

The post Empowering Data Agility: Equifax’s Journey to Operational Excellence appeared first on Unravel.

]]>
Mesh Sphere Background

In the data-driven world where real-time decision-making and innovation are not just goals but necessities, global data analytics and technology companies like Equifax must navigate a complex environment to achieve success. Equifax sets the standard for operational excellence by enabling real-time decision-making, accelerating innovation, scaling efficiently, consistently achieving service level agreements (SLAs), and building reliable data pipelines. Let’s delve into the strategies that lead to such a resounding transformation, with a spotlight on the pivotal role of data observability and FinOps.

When Speed Meets Precision: Revolutionizing Credit Scoring

Kumar Menon, the Equifax CTO for Data Mesh and Decision-Making Technologies, faced a formidable challenge: enabling sub-second credit scoring. And he is not alone. “Many organizations find it challenging to turn their data into insights quickly enough to generate real and timely results” (Deloitte). With Unravel’s data observability and FinOps platform, Kumar’s team overcame this hurdle to deliver faster insights and decisions while fortifying Equifax’s competitive edge.

Unifying Data with Vision and Strategy

The necessity to integrate data across six cloud regions cannot be overstated for a company operating at Equifax’s scale. Gartner discovered that many organizations struggle with “high-cost and low-value data integration cycles.” Kumar Menon and his team, leveraging the tools and methodologies of data observability and FinOps, streamlined this intricate process. The result? Faster and more economical products capable of satisfying the needs of a constantly shifting market.

Mastering the Data Deluge

Managing over 10 petabytes of data is no small feat. IDC noted enterprises’ need for “responsiveness, scalability, and resiliency of the digital infrastructure.” Equifax, with Kumar Menon’s foresight, embraced the power of Unravel’s data observability and FinOps frameworks to adapt and grow. By efficiently managing their cloud resource usage, the team was able to scale data processing with the appropriate cloud resources.

Delivering Real-Time Decisions

The Equifax team needed to support 12 million daily inquiries. Operating production services at this scale can be overwhelming for any system not prepared for such a deluge. In fact, Gartner uncovered a significant challenge around the ability to “detect errors and lower the cost to fix and shorten the resulting downtime.” Data observability and FinOps serve as Kumar Menon’s frameworks to not only confront these challenges but to ensure that Equifax can consistently deliver accurate, real-time decisions such as credit scores and employment verifications.

Streamlining Massive Data Ingestion

The Equifax data team’s colossal task of ingesting over 15 billion observations per day could potentially entangle any team in a web of complexity. Again, Gartner articulates the frustration many organizations face with “complex, convoluted data pipelines that require toilsome remediation to manage errors.” Unravel’s platform provided Kumar’s team the means to build and maintain reliable and robust data pipelines, assuring the integrity of Equifax’s reliable and responsive data products.

The Path Forward with Unravel

In a data-centric industry, Equifax exemplifies leadership through precision, agility, and efficiency. The company’s journey demonstrates their capacity to enable real-time decision-making, accelerate innovation, and ensure operational scale and reliability.

At Unravel, we understand and empathize with data teams facing substantial challenges like the ones described above. As your ally in data performance, productivity, and cost management, we’re committed to equipping you with tools that not only remove obstacles but also enhance your operational prowess. Harness the power of data observability and FinOps with Unravel Data through a self-guided tour that shows how you can:

  • Deliver faster insights and decisions with Unravel’s pipeline bottleneck analysis.
  • Bring faster and more efficient products to market by using Unravel’s speed, cost, and reliability optimizer.
  • Scale data processing efficiently with Unravel’s correlated data observability.
  • Achieve and exceed your SLAs using Unravel’s out-of-the-box reporting.
  • Build performant data pipelines with Unravel’s pipeline bottleneck analysis.

Ready to unlock the full potential of your data strategies and operations? See how you can achieve more with data observability and FinOps with Unravel Data in a self-guided tour.

The post Empowering Data Agility: Equifax’s Journey to Operational Excellence appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/empowering-data-agility-equifaxs-journey-to-operational-excellence/feed/ 0
How to Stop Burning Money (or at Least Slow the Burn) https://www.unraveldata.com/resources/how-to-stop-burning-money-or-at-least-slow-the-burn/ https://www.unraveldata.com/resources/how-to-stop-burning-money-or-at-least-slow-the-burn/#respond Tue, 25 Jun 2024 15:32:38 +0000 https://www.unraveldata.com/?p=15929

A Recap of the Unravel Data Customer Session at Gartner Data & Analytics Summit 2024 The Gartner Data & Analytics Summit 2024 (“D&A”) in London is a pivotal event for Chief Data Analytics Officers (CDAOs) and […]

The post How to Stop Burning Money (or at Least Slow the Burn) appeared first on Unravel.

]]>

A Recap of the Unravel Data Customer Session at Gartner Data & Analytics Summit 2024

The Gartner Data & Analytics Summit 2024 (“D&A”) in London is a pivotal event for Chief Data Analytics Officers (CDAOs) and data and analytics leaders, drawing together a global community eager to dive deep into the insights, strategies, and frameworks that are shaping the future of data and analytics. With an unprecedented assembly of over 3,000 attendees, spanning 150+ knowledge-packed sessions and joined by 90+ innovative partners, the D&A Summit was designed to catalyze data transformation within organizations, offering attendees unique insights to think big and drive real change.

The Unravel Data customer session, titled “How to Stop Burning Money (or At Least Slow the Burn)”, emerged as a highlight of the D&A Summit, drawing attention to the pressing need for cost-efficient data processing in today’s rapid digital evolution. The session—presented by one of the largest logistics giants globally, with over 100,000 employees and a fleet of 700+ container vessels operating across 130 countries—captivated an audience of more than 150 CDAOs and data and analytics leaders. Attendees came from 140+ companies across 30+ industries such as banking, retail, and pharma, spanning 110+ cities in 20+ countries. This compelling turnout underscored the universal relevance and urgency of cost-optimization strategies in data engineering. Watch the highlight reel here.

The session was presented by Peter Rees, Director of GenAI, AI, Data and Integration at Maersk, who garnered unprecedented accolades, including a 170% higher-than-average attendance. Peter Rees’ session was the third highest-rated of all 40 partner sessions. These results reflect the session’s relevance and the invaluable insights shared on revolutionizing the efficiency of data processing pipelines, cost allocation, and optimization techniques.

The Gartner Data & Analytics Summit 2024, and particularly the Unravel Data customer session, brought together organizations striving to align their data engineering costs with the value of their data and analytics. Unravel Data’s innovative approach, showcased through the success of a world-leading logistics company, provides a blueprint for organizations across industries looking to dramatically enhance the speed, productivity, and efficiency of their data processing and AI investments.

We invite you to explore how your organization can benefit from Unravel Data’s groundbreaking platform. Take the first step towards transforming your data processing strategy by scheduling an Unravel Data Health Check. Embark on your journey towards optimal efficiency and cost savings today.

The post How to Stop Burning Money (or at Least Slow the Burn) appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/how-to-stop-burning-money-or-at-least-slow-the-burn/feed/ 0
Unravel Data was Mentioned in a Graphic source in the 2024 Gartner® Report https://www.unraveldata.com/resources/unravel-data-was-mentioned-in-a-graphic-source-in-the-2024-gartner-report/ https://www.unraveldata.com/resources/unravel-data-was-mentioned-in-a-graphic-source-in-the-2024-gartner-report/#respond Tue, 21 May 2024 14:27:12 +0000 https://www.unraveldata.com/?p=15532

In a recently published report, “Beyond FinOps: Optimizing Your Public Cloud Costs”, Gartner shares a graphic which is adapted from Unravel. Unravel’s Perspective How FinOps Helps Unravel helps organizations adopt a FinOps approach to improve cloud […]

The post Unravel Data was Mentioned in a Graphic source in the 2024 Gartner® Report appeared first on Unravel.

]]>

In a recently published report, “Beyond FinOps: Optimizing Your Public Cloud Costs”, Gartner shares a graphic which is adapted from Unravel.

Sources of Inefficiency Cloud D&A Platforms chart

Unravel’s Perspective

How FinOps Helps

Unravel helps organizations adopt a FinOps approach to improve cloud data spending efficiency. FinOps helps organizations address overspend, including under- and over-provisioned cloud services, suboptimal architectures, and inefficient pricing strategies. Infrastructure and operations (I&O) leaders and practitioners can use FinOps principles to optimize cloud service design, configuration, and spending commitments to reduce costs. Data and analytics (D&A) leaders and teams are using FinOps to achieve predictable spend for their cloud data platforms.

Introducing Cloud Financial Management and Governance

Cloud Financial Management (CFM) and Governance empowers organizations to quickly adapt in today’s dynamic landscape, ensuring agility and competitiveness. CFM principles help organizations take advantage of the cloud’s variable consumption model through purpose-built tools tailored for your modern cloud finance needs. A well-defined cloud cost governance model lets cloud users monitor and manage their cloud costs and balance cloud spending vs. performance and end-user experience.

Three Keys to Optimizing Your Modern Data Stack

  1. Cloud data platform usage analysis plays a crucial role in cloud financial management by providing insights into usage patterns, cost allocation, and optimization opportunities. By automatically gathering, analyzing, and correlating data from various sources, such as traces, logs, metrics, source code, and configuration files, organizations can identify areas for cost savings and efficiency improvements. Unravel’s purpose-built AI reduces the toil required to manually examine metrics, such as resource utilization, spending trends, and performance benchmarks for modern data stacks such as Databricks, Snowflake, and BigQuery.
  2. Cost allocation, showback, and chargeback are essential for effective cloud cost optimization. Organizations need to accurately assign costs to different departments or projects based on their actual resource consumption. This ensures accountability and helps in identifying areas of overspending or underutilization. Automated tools and platforms like Unravel can streamline the cost allocation process for cloud data platforms such as Databricks, Snowflake, and BigQuery, making it easier to track expenses and optimize resource usage.
  3. Budget forecasting and management is another critical aspect of cloud financial management. By analyzing historical data and usage trends, organizations can predict future expenses more accurately. This enables them to plan budgets effectively, avoid unexpected costs, and allocate resources efficiently. Implementing robust budget forecasting processes can help organizations achieve greater financial control and optimize their cloud spending. A minimal forecasting sketch follows this list.
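
As an illustration of the forecasting idea above, here is a minimal Python sketch that projects next month's spend from a short history of monthly totals using average month-over-month growth. The numbers and the method are assumptions for illustration only; real forecasting would also account for seasonality, planned migrations, and commitment discounts.

```python
# Hypothetical monthly cloud data platform spend, oldest to newest (illustrative only).
monthly_spend = [42_000, 45_500, 47_200, 51_000]

def forecast_next_month(spend):
    """Project next month's spend by applying the average month-over-month
    growth observed in the historical series."""
    growth_rates = [later / earlier for earlier, later in zip(spend, spend[1:])]
    avg_growth = sum(growth_rates) / len(growth_rates)
    return spend[-1] * avg_growth

print(round(forecast_next_month(monthly_spend)))  # projected spend for the coming month
```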

Next Steps

You now grasp the essence of cloud financial management and governance to optimize cloud spending and your cloud data platform. Start your journey and embrace these concepts to enhance your cloud strategy and drive success. Take charge of your Databricks, Snowflake, and BigQuery optimization today with a free health check.

Gartner, Beyond FinOps: Optimizing Your Public Cloud Costs, By Marco Meinardi, 21 March 2024

GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved.
Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

The post Unravel Data was Mentioned in a Graphic source in the 2024 Gartner® Report appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/unravel-data-was-mentioned-in-a-graphic-source-in-the-2024-gartner-report/feed/ 0
Winning the AI Innovation Race https://www.unraveldata.com/resources/winning-the-ai-innovation-race/ https://www.unraveldata.com/resources/winning-the-ai-innovation-race/#respond Mon, 11 Mar 2024 10:24:06 +0000 https://www.unraveldata.com/?p=15007 Applying AI to Automate Application Performance Tuning

Business leaders from every industry now find themselves under the gun to somehow, someway leverage AI into an actual product that companies (and individuals) can use. Yet, an estimated 70%-85% of artificial intelligence (AI) and machine […]

The post Winning the AI Innovation Race appeared first on Unravel.

]]>
Applying AI to Automate Application Performance Tuning

Business leaders from every industry now find themselves under the gun to somehow, someway leverage AI into an actual product that companies (and individuals) can use. Yet, an estimated 70%-85% of artificial intelligence (AI) and machine learning (ML) projects fail.

In our thought-leadership white paper Winning the AI Innovation Race: How AI Helps Optimize Speed to Market and Cost Inefficiencies of AI Innovation, you will learn:

• Top pitfalls that impede speed and ROI for running AI and data pipelines in the cloud

• How the answers to addressing these impediments can be found at the code level

• How AI is paramount for optimization of cloud data workloads

• How Unravel helps

The post Winning the AI Innovation Race appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/winning-the-ai-innovation-race/feed/ 0
Understanding BigQuery Cost https://www.unraveldata.com/resources/understanding-bigquery-cost/ https://www.unraveldata.com/resources/understanding-bigquery-cost/#respond Thu, 22 Feb 2024 17:49:34 +0000 https://www.unraveldata.com/?p=14899

This article explains the two pricing models—on-demand and capacity—for the compute component of the BigQuery pricing, the challenges of calculating chargeback on compute, and what Unravel can do. On-demand compute pricing: You are charged based on […]

The post Understanding BigQuery Cost appeared first on Unravel.

]]>

This article explains the two pricing models, on-demand and capacity, for the compute component of the BigQuery pricing, the challenges of calculating chargeback on compute, and what Unravel can do.

  1. On-demand compute pricing: You are charged based on the billed bytes used by your queries. So if the price is $X/TiB and a query uses Y-TiB of billed bytes, you will be billed $(X*Y) for that query.
  2. Capacity compute pricing: You buy slots and you are charged based on the number of slots and the time for which slots were made available. 

The following section describes compute pricing in more detail.

Capacity Pricing

To use capacity pricing, you start by creating reservations. You can create one or more reservations inside an admin project. All the costs related to the slots for the reservations will be attributed to the admin project in the bill.

Reservations

While creating a reservation, you need to specify “baseline slots” and “max slots,” in multiples of 100. You will be charged for the baseline slots at the minimum for the duration of the reservation. When all the baseline slots have been utilized by queries running in that reservation, more slots can be made available via autoscaling. Autoscaling happens in multiples of 100 slots. The number of slots available for autoscaling is (max slots – baseline slots).

You can modify the baseline and max slots after you create the reservation. You can increase the values at any point in time, but you cannot decrease the values within 1 hour of creating a reservation or updating the reservation.

Assignments

After you have created a reservation, to enable the queries in projects to use slots from that reservation, you have to create assignments. You can assign projects, folders, or organizations to a reservation. Whenever a query is started, it will first try to use the slots from this reservation. You can create or delete assignments at any point in time.

Pricing

BigQuery provides three editions: Standard, Enterprise, and Enterprise Plus. These editions have different capabilities and different pricing rates. The rates are defined in terms of slot-hours. For example, the rate is $0.04 per slot-hour for the Standard edition in the US and $0.06 per slot-hour for the Enterprise edition in the same region.

In capacity pricing, you are charged for the number of slots made available and the time for which slots are made available. Suppose you have a reservation with 100 baseline slots and 500 max slots in the Standard edition. Consider the following usage:

  • In the first hour, no queries are running, so the slot requirement is 0.
  • In the second hour, there are queries running, but the slot requirement is less than 100.
  • In the third hour, more queries are running and the slot requirement is 150.

In the first 2 hours, even though the slot requirement is less than 100, you will still be charged for 100 slots (the baseline slots) for each of the first 2 hours.

In the third hour, we need 50 more slots than the baseline, so autoscaling kicks in to provide more slots. Since autoscaling only scales up or down in multiples of 100, 100 more slots are added. Hence, a total of 200 slots (100 baseline + 100 from autoscaling) are made available in this hour. 

The number of slot-hours from this 3-hour period is 100 + 100 + 200 = 400. With a rate of $0.04 per slot-hour for Standard edition, you will be charged 0.04*400 = $16 for this usage.
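
To make the arithmetic above concrete, here is a minimal Python sketch of how billed slot-hours accumulate under capacity pricing with autoscaling. The function names and the hour-level granularity are illustrative assumptions; actual billing is per second with a one-minute minimum, as noted later in this article.

```python
import math

def billed_slots_for_hour(required_slots, baseline, max_slots):
    """Slots billed for one hour: at least the baseline; demand above the
    baseline is served by autoscaling in increments of 100 slots, capped at max_slots."""
    if required_slots <= baseline:
        return baseline
    extra_needed = required_slots - baseline
    autoscaled = math.ceil(extra_needed / 100) * 100
    return min(baseline + autoscaled, max_slots)

def capacity_cost(hourly_requirements, baseline, max_slots, rate_per_slot_hour):
    slot_hours = sum(billed_slots_for_hour(r, baseline, max_slots)
                     for r in hourly_requirements)
    return slot_hours, slot_hours * rate_per_slot_hour

# The 3-hour Standard edition example above: baseline 100, max 500, $0.04/slot-hour.
slot_hours, cost = capacity_cost([0, 80, 150], baseline=100, max_slots=500,
                                 rate_per_slot_hour=0.04)
print(slot_hours, cost)  # 400 slot-hours, $16.00
```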

Pay-as-you-go

Recall that you can create/delete/update reservations and baseline/max slots whenever you want. Also, you will be charged for just the number of slots made available to you and for the time the slots are made available. This model is called pay-as-you-go, as you pay only for what you use.

Capacity Commitment

If you expect to use a certain number of slots over a long period of time, you can make a commitment of X slots over a 1-year or 3-year period, for a better rate. You will be charged for those slots for the entire period regardless of whether you use them or not. This model is called capacity commitment.

Consider the following example of capacity commitment. Let’s say you have:

  • 1-year commitment of 1600 slots in the Enterprise edition. 
  • Created 1 reservation with max size of 1500 slots and baseline of 1000 slots. 
  • Hence your autoscaling slots are 1500-1000 = 500.
  • Pay-as-you-go price for enterprise edition is $0.06 per slot-hour.
  • 1-year commitment price for the enterprise edition is $0.048 per slot hour.

Consider this scenario:

  • In the first hour, the requirement is less than 1000 slots.
  • In the second hour, the requirement is 1200 slots.
  • In the third hour, the requirement is 1800 slots.

In the first hour, the baseline slots of 1000 are made available for the reservation; these slots are available from the commitment slots. Since we have a commitment of 1600 slots, all the 1600 slots are actually available. The 1000 slots are available for the reservation as baseline. The remaining 600 are called idle slots and are also charged. So for the first hour, we are charged for 1600 slots as per commitment price, with a cost of $(1600 * 0.048).

In the second hour, since the requirement is 1200 slots, there is an additional requirement of 200 slots beyond the baseline of 1000 slots. Since 600 idle slots are available from the committed capacity, the additional requirement of 200 slots will come from these idle slots, while the remaining 400 slots will remain idle. Notice that autoscaling was not needed in this case. Before going for autoscaling, BigQuery will try to use idle slots (unless ignore_idle_slots config is set to True for that reservation). So how much are we charged for the second hour? The answer is 1600 slots, since that is what is committed. These 1600 slots are charged as per the commitment price, so the cost for the second hour is $(1600 * 0.048).

In the third hour, the requirement is 1800 slots: the first 1600 slots will come from commitment slots, and the other 200 will now come from autoscaling slots. The 1600 slots will be charged as per 1-year commit pricing, and the 200 slots coming from autoscale slots will be charged as per pay-as-you-go pricing at $0.06/slot-hour in this case. Therefore, the cost for the third hour is $((1600 * 0.048) + (200 * 0.06)).
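
A minimal Python sketch of the hourly cost split described in this example, assuming a 1,600-slot commitment, the Enterprise edition rates quoted above, and demand that is constant within each hour (the reservation's baseline/max settings are ignored for brevity):

```python
import math

COMMIT_RATE = 0.048  # 1-year commitment, Enterprise edition, per slot-hour
PAYGO_RATE = 0.06    # pay-as-you-go (autoscaling), Enterprise edition, per slot-hour

def hourly_cost(required_slots, committed_slots):
    """All committed slots are billed at the commitment rate, used or idle;
    demand beyond the commitment is served by autoscaling in increments of
    100 slots and billed at the pay-as-you-go rate."""
    overflow = max(0, required_slots - committed_slots)
    autoscaled = math.ceil(overflow / 100) * 100
    return committed_slots * COMMIT_RATE + autoscaled * PAYGO_RATE

for required in (900, 1200, 1800):  # the three hours described above
    print(required, round(hourly_cost(required, committed_slots=1600), 2))
# 900  -> 76.8  (all 1,600 committed slots billed; 600 of them idle)
# 1200 -> 76.8  (idle committed slots absorb the demand above the baseline)
# 1800 -> 88.8  (1,600 committed + 200 autoscaled at pay-as-you-go)
```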

Notes

Some points to note regarding capacity pricing:

  1. Slots are billed at per-second granularity, with a minimum charge of 1 minute.
  2. Autoscaling always happens in increments/decrements of 100 slots.
  3. Queries running in a reservation automatically use idle slots from other reservations within the same admin project, unless ignore_idle_slots is set to True for the reservation.
  4. The capacity commitment is specific to a region, organization, and edition. The idle slots can’t be shared across regions or editions.
  5. Idle slot capacity is not shared between reservations in different admin projects.

The Cost Chargeback Problem

In an organization, typically there are one or more capacity commitments, one or more admin projects, and multiple reservations in these admin projects. GCP provides billing data that gives you the hourly reservation costs for a given admin project, edition, and location combination. However, there are multiple possible ways to map a team to a reservation: a team can be assigned to one reservation, multiple teams can be assigned to the same reservation, or multiple teams can be assigned to multiple reservations.

In any case, it is a tricky task to find out which team or user is contributing how much to your cost. How do you map the different teams and users to the different projects, editions, and locations? And how do you then track the cost incurred by these teams and users? Common chargeback approaches, such as chargeback by accounts, projects, and reservations, simply cannot provide clarity at the user or team level. There is also no direct data source that gives this information from GCP.

Unravel provides cost chargeback at the user and team levels by combining data from different sources such as billing, apps, and tags. The crux of our proprietary approach is providing an accurate cost estimate at the query level. We then associate the query costs with users and teams (via tags) to derive the user-level or team-level chargeback. 

Computing query-level cost estimates involves addressing a number of challenges. Some of the challenges include:

  • A query may have different stages where chargeback could be different.
  • A query may use slots with commitment pricing or pay-as-you-go pricing.
  • Capacity is billed at one minute minimum.
  • Autoscaling increments by multiples of 100 slots.
  • Chargeback policy for idle resource.

Let’s understand these challenges further with a few scenarios. In all the examples below, we assume that we use the Enterprise edition in the US region, with rates of $0.06/slot-hour for pay-as-you-go, and $0.048/slot-hour for a 1-year commitment.

Scenario 1: Slot-hours billed differently at different stages of a query 

Looking at the total slot-hours for chargeback could be misleading because the slot-hours at different stages of a query may be billed differently.

Consider the following scenario:

  • A reservation with a baseline of 0 slot and max 200 slots.
  • No capacity commitment.
  • Query Q1 runs from 5am to 6am with 200 slots for the whole hour.
  • Query Q2 runs from 6am to 8am in 2 stages:
    • In the first stage, from 6am to 7am, it uses 150 slots for the whole hour.
    • In the second stage, from 7am to 8am, it uses 50 slots for the whole hour.

Both queries use the exact same slot-hours total, i.e., 200 slot-hours, and use the same reservation and edition. Hence we may think the chargeback to both queries should be the same.

But if you look closely, the two queries do not incur the same amount of cost.

Q1 uses 200 slots for 1 hour. Given the reservation with a baseline of 0 slot and a max of 200 slots, 200 slots are made available in this hour, and the cost of the query is $(200*0.06) = $12.

In contrast, Q2’s usage is split into 150 slots for the first hour and 50 slots for the second hour. Since slots are autoscaled in increments of 100, to run Q2, 200 slots are made available in the first hour and 100 slots are made available in the second hour. The total slot-hours for Q2 is therefore 300, and the cost is $(300*0.06) = $18.

Summary: The cost chargeback to a query needs to account for how many slots are used in different stages of the query and not just the total slot-hours (or slot-ms) used.
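
The difference between Q1 and Q2 can be reproduced with a short Python sketch of per-stage billing (a simplification that assumes a 0-slot baseline and hour-long stages, as in the scenario above):

```python
import math

RATE = 0.06  # pay-as-you-go, Enterprise edition, per slot-hour

def billed_slot_hours(stages, baseline=0):
    """stages: list of (slots_needed, hours). With a 0-slot baseline, every
    stage is served by autoscaling, which rounds slots up to the next 100."""
    total = 0
    for slots_needed, hours in stages:
        billed = max(baseline, math.ceil(slots_needed / 100) * 100)
        total += billed * hours
    return total

q1 = billed_slot_hours([(200, 1)])           # 200 slot-hours
q2 = billed_slot_hours([(150, 1), (50, 1)])  # 200 + 100 = 300 slot-hours
print(q1 * RATE, q2 * RATE)                  # $12.00 vs. $18.00
```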

Scenario 2: Capacity commitment pricing vs. pay-as-you-go pricing from autoscaling

At different stages, a query may use slots from capacity commitment or from autoscaling that are charged at the pay-as-you-go price. 

Consider the following scenario:

  • 1 reservation with a baseline of 100 slots and max slots as 300 slots.
  • 1-year capacity commitment of 100 slots.
  • Query Q1 runs from 5am to 6am and uses 300 slots for the whole 1 hour.
  • Query Q2 runs from 6am to 8am in 2 stages. It uses 100 slots from 6am to 7am, and uses 200 slots from 7am to 8am.

Once again, the total slot-hours for both queries are the same, i.e., 300 slot hours, and we might chargeback the same cost to both queries.

But if you look closely, the queries do not incur the same amount of cost.

For Q1, 100 slots come from committed capacity and are charged at the 1-year commit price ($0.048/slot-hour), whereas 200 are autoscale slots that are charged at the pay-as-you-go price ($0.06/slot-hour). So the cost of Q1 is  $((100*0.048) + (200*0.06)) = $16.80.

For Q2, from 6-7am, 100 slots come from committed capacity and are charged at the 1-year commit price ($0.048/slot-hour), so the cost for 6-7am is  = $(100*0.048) = $4.80.

From 7-8am, 100 slots from committed capacity are charged at 1-year commit price ($0.048/slot-hour), and the other 100 are autoscale slots charged at pay-as-you-go price ($0.06/slot-hour). So the cost from 7-8am is  = $(100*0.048) + (100*0.06) = $10.80.

Hence the cost between 6-8am (duration when Query-2 is running) is = $4.80 + $10.80 = $15.60.

Summary: The cost chargeback to a query needs to account for whether the slots come from committed capacity or from autoscaling charged at pay-as-you-go price. A query may use both at different stages.

Scenario 3: Minimum slot capacity and increment in autoscaling

A query may be billed for more than the resource it actually needs because of the minimum slot capacity and the minimum increment in autoscaling.

  • 1 reservation with a baseline of 0 slots and max slots as 300 slots.
  • No capacity commitment.
  • Query Q1 uses 50 slots for 10 seconds between 05:00:00 to 05:00:10.
  • There is no other query running between 04:59:00 to 05:02:00.

If you were to chargeback by slot-ms, you would say that the query uses 50 slots for 10 seconds, or  500,000 slot-ms.

However, this assumption is flawed because of these two conditions:

  1. Slot capacity is billed for a minimum of 1 minute before being billed per second.
  2. Autoscaling happens in increments of 100 slots.

For Q1, 100 slots (not 50) are actually made available, for 1 minute (60,000 ms) and hence you will be charged for 6,000,000 slot-ms in your bill. 

Summary: The cost chargeback needs to account for minimum slot capacity and autoscaling increments. 
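
The gap between slot-ms used and slot-ms billed in this scenario comes from two rounding rules, sketched below in Python (a simplification that assumes the reservation is otherwise idle, as in the example):

```python
import math

def billed_slot_ms(slots_used, duration_ms):
    """Slot-milliseconds billed for a burst of work on an otherwise idle
    reservation: autoscaling provisions slots in multiples of 100, and the
    provisioned capacity is billed for at least one minute."""
    provisioned = math.ceil(slots_used / 100) * 100  # 50 slots used -> 100 provisioned
    billed_duration_ms = max(duration_ms, 60_000)    # 10 s of work -> 60 s billed
    return provisioned * billed_duration_ms

print(billed_slot_ms(50, 10_000))  # 6,000,000 slot-ms, vs. 500,000 actually used
```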

Scenario 4: Chargeback policy for idle resource

In the previous scenario, we see that a query that actually uses 500,000 slot-ms is billed for 6,000,000 slot-ms. Here we make the assumption that whatever resource is made available but not used is also included in the chargeback of the queries running at the same time. What happens if there are multiple queries running concurrently, with unused resources? Continuing with the example in Scenario 3, if there is another query, Q2, that uses 50 slots for 30s, from 05:00:10 to 05:00:40, then: 

  • Q1 still uses 500,000 slot-ms like before.
  • Q2 uses 1,500,000 slot-ms.
  • The total bill remains 6,000,000 slot-ms as before, because slot capacity is billed for a minimum of 1 min and autoscaling increments by 100 slots.

There are several ways to consider the chargeback to Q1 and Q2:

  1. Charge each query by its actual slot-ms, and have a separate “idle” category. In this case, Q1 is billed for 500,000 slot-ms, Q2 is billed for 1,500,000 slot-ms, and the remaining 4,000,000 slot-ms is attributed to the “idle” category.
  2. Divide idle resources equally among the queries. In this case, Q1 is billed 2,500,000 slot-ms, and Q2 is billed 3,500,000 slot-ms.
  3. Divide idle resources proportionally among the queries based on the queries’ usage. In this case, Q1 is billed 1,500,000 slot-ms, and Q2 is billed 4,500,000 slot-ms.

Summary: Chargeback policy needs to consider how to handle idle resources. Without a clear policy, there could be mismatches between users’ assumptions and the implementation, even leading to inconsistencies, such as the sum of the query costs deviating from the bill. Moreover, different organizations may prefer different chargeback policies, and there’s no one-size-fits-all approach.
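
The three policies above are easy to state precisely in code. Here is a minimal Python sketch (the policy names are made up for illustration; the numbers reproduce the Q1/Q2 example in this scenario):

```python
def allocate_idle(usage_by_query, total_billed_slot_ms, policy):
    """Split billed slot-ms among queries: keep idle in a separate bucket,
    divide it equally, or divide it in proportion to each query's usage."""
    used = sum(usage_by_query.values())
    idle = total_billed_slot_ms - used
    if policy == "separate":
        return {**usage_by_query, "idle": idle}
    if policy == "equal":
        share = idle / len(usage_by_query)
        return {q: u + share for q, u in usage_by_query.items()}
    if policy == "proportional":
        return {q: u + idle * u / used for q, u in usage_by_query.items()}
    raise ValueError(f"unknown policy: {policy}")

usage = {"Q1": 500_000, "Q2": 1_500_000}
print(allocate_idle(usage, 6_000_000, "separate"))     # idle tracked explicitly
print(allocate_idle(usage, 6_000_000, "equal"))        # Q1: 2,500,000  Q2: 3,500,000
print(allocate_idle(usage, 6_000_000, "proportional")) # Q1: 1,500,000  Q2: 4,500,000
```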

Conclusion

To conclude, providing accurate and useful chargeback for an organization’s usage of BigQuery presents a number of challenges. The common approaches of chargeback by accounts, reservations, and projects are often insufficient for most organizations, as they need user-level and team-level chargeback. However, chargeback by users and teams requires us to be able to provide query-level cost estimates, and then aggregate by users and teams (via tags). Computing the query-level cost estimate is another tricky puzzle where simply considering the total slot usage of a query will not work. Instead, we need to consider various factors such as different billing for different stages of the same query, commitment pricing vs. pay-as-you-go pricing from autoscaling, minimum slot capacity and minimum autoscaling increments, and the idle policy.

Fortunately, Unravel has information for all the pieces of the puzzle. Its proprietary algorithm intelligently combines these pieces of information and considers the scenarios discussed. Unravel recognizes that chargeback often doesn’t have a one-size-fits-all approach, and can work with customers to adapt its algorithm to specific requirements and use cases.

The post Understanding BigQuery Cost appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/understanding-bigquery-cost/feed/ 0
Data Observability + FinOps for Snowflake Engineers https://www.unraveldata.com/resources/data-observability-finops-for-snowflake-engineers/ https://www.unraveldata.com/resources/data-observability-finops-for-snowflake-engineers/#respond Fri, 26 Jan 2024 20:22:57 +0000 https://www.unraveldata.com/?p=14632 Abstract light image

AI-DRIVEN DATA OBSERVABILITY + FINOPS FOR SNOWFLAKE DATA ENGINEERS Snowflake data engineers are under enormous pressure to deliver results. This data sheet provides more context about the challenges data engineers face and how Unravel helps them […]

The post Data Observability + FinOps for Snowflake Engineers appeared first on Unravel.

]]>
Abstract light image

AI-DRIVEN DATA OBSERVABILITY + FINOPS FOR SNOWFLAKE DATA ENGINEERS

Snowflake data engineers are under enormous pressure to deliver results. This data sheet provides more context about the challenges data engineers face and how Unravel helps them address these challenges.

Specifically, it discusses:

  • Key Snowflake data engineering roadblocks
  • Unravel’s purpose-built AI for Snowflake
  • Data engineering benefits

With Unravel, Snowflake data engineers can speed data pipeline development and analytics initiatives with granular and real-time cost visibility, predictive spend forecasting, and performance insights for their data cloud.

To see Unravel Data for Snowflake in action, contact: Data experts

The post Data Observability + FinOps for Snowflake Engineers appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/data-observability-finops-for-snowflake-engineers/feed/ 0
How Shopify Fixed a $1 Million Single Query https://www.unraveldata.com/resources/how-shopify-fixed-a-1-million-single-query/ https://www.unraveldata.com/resources/how-shopify-fixed-a-1-million-single-query/#respond Fri, 26 Jan 2024 20:04:34 +0000 https://www.unraveldata.com/?p=14636

A while back, Shopify posted a story about how they avoided a $1 million query in BigQuery. They detail the data engineering that reduced the cost of the query to $1,300 and share tips for lowering […]

The post How Shopify Fixed a $1 Million Single Query appeared first on Unravel.

]]>

A while back, Shopify posted a story about how they avoided a $1 million query in BigQuery. They detail the data engineering that reduced the cost of the query to $1,300 and share tips for lowering costs in BigQuery.

Kunal Agarwal, CEO and Co-founder of Unravel Data, walks through Shopify’s approach to clustering tables as a way to bring the price of a highly critical query down to a more reasonable monthly cost. But clustering is not the only data engineering technique available to run far more efficiently—for cost, performance, reliability—and Kunal brings up 6-7 others.

Few organizations have the data engineering muscle of Shopify. All the techniques to keep costs low and performance high can entail a painstaking, toilsome slog through thousands of telemetry data points, logs, error messages, etc.

Unravel’s customers understand that they cannot possibly have people go through the hundreds of thousands or more lines of code to find the areas to optimize, but rather that this is all done better and faster with automation and AI. Whether your data runs on BigQuery, Databricks, or Snowflake.

Kunal shows what that looks like in a 1-minute drive-by.

Get Unravel Standard data observability + FinOps for free. Forever.
Get started here

The post How Shopify Fixed a $1 Million Single Query appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/how-shopify-fixed-a-1-million-single-query/feed/ 0
Unravel Data Partners with Databricks for Lakehouse Observability and FinOps  https://www.unraveldata.com/resources/unravel-data-partners-with-databricks-for-lakehouse-observability-and-finops/ https://www.unraveldata.com/resources/unravel-data-partners-with-databricks-for-lakehouse-observability-and-finops/#respond Tue, 05 Dec 2023 14:07:24 +0000 https://www.unraveldata.com/?p=14467

Purpose-built AI provides real-time cost and performance insights and efficiency recommendations for Databricks users Palo Alto, CA — December 5, 2023 — Unravel Data, the first AI-enabled data observability and FinOps platform built to address the […]

The post Unravel Data Partners with Databricks for Lakehouse Observability and FinOps  appeared first on Unravel.

]]>

Purpose-built AI provides real-time cost and performance insights and efficiency recommendations for Databricks users

Palo Alto, CA, December 5, 2023: Unravel Data, the first AI-enabled data observability and FinOps platform built to address the speed and scale of modern data platforms, today announced that it has joined the Databricks Partner Program to deliver AI-powered data observability into Databricks for granular visibility, performance optimizations, and cost governance of data pipelines and applications. With this new partnership, Unravel and Databricks will collaborate on Go-To-Market (GTM) efforts to enable Databricks customers to leverage Unravel’s purpose-built AI for the Lakehouse for real-time, continuous insights and recommendations to speed time to value of data and AI products and ensure optimal ROI.

With organizations increasingly under pressure to deliver data and AI innovation at lightning speed, data teams are on the front line of delivering production-ready data pipelines at an exponential rate while optimizing performance and efficiency to deliver faster time to value. Unravel’s purpose-built AI for Databricks integrates with Lakehouse Monitoring and Lakehouse Observability to deliver performance and efficiency needed to achieve speed and scale for data analytics and AI products. Unravel’s integration with Unity Catalog enables Databricks users to speed up lakehouse transformation by providing real-time, AI-powered cost insights, code-level optimizations, accurate spending predictions, and performance recommendations to accelerate data pipelines and applications for greater returns on cloud data platform investments. AutoActions and alerts help automate governance with proactive guardrails.

“Most organizations today are receiving unprecedented amounts of data from a staggering number of sources, and they’re struggling to manage it all, which can quickly lead to unpredictable cloud data spend. This combination of rapid lakehouse adoption and the hyperfocus companies have on leveraging AI/ML models for additional revenue and competitive advantage, brings the importance of data observability to the forefront,” said Kunal Agarwal, CEO and co-founder, Unravel Data. “Lakehouse customers who use Unravel can now achieve the agility required for AI/ML innovation while having the predictability and cost governance guardrails needed to ensure a strong ROI.”

Unravel’s purpose-built AI for Databricks delivers insights based on Unravel’s deep observability at the job, user, and code level to supply AI-driven cost efficiency recommendations, including compute provisioning, query performance, autoscaling efficiencies, and more. 

Unravel for Databricks enables organizations to:

  • Speed cloud transformation initiatives by having real-time cost visibility, predictive spend forecasting, and performance insights for their workloads 
  • Enhance time to market of new AI initiatives by mitigating potential pipeline bottlenecks and associated costs before they occur
  • Better manage and optimize the ROI of data projects with customized dashboards and alerts that offer insights on spend, performance, and unit economics

Unravel’s integration with popular DevOps tools like GitHub and Azure DevOps provides actionability in CI/CD workflows by enabling early issue detection during the code-merge phase and providing developers real-time insights into potential financial impacts of their code changes. This results in fewer production issues and improved cost efficiency.

Learn how Unravel and Databricks can help enterprises optimize their cloud data spend and increase ROI here.   

About Unravel Data

Unravel Data radically transforms the way businesses understand and optimize the performance and cost of their modern data applications – and the complex data pipelines that power those applications. Unravel’s market-leading data observability and FinOps platform with purpose-built AI for each data platform, provides actionable recommendations needed for cost and performance data and AI pipeline efficiencies. A recent winner of the Best Data Tool & Platform of 2023 as part of the annual SIIA CODiE Awards, some of the world’s most recognized brands like Adobe, Maersk, Mastercard, Equifax, and Deutsche Bank rely on Unravel Data to unlock data-driven insights and deliver new innovations to market. To learn more, visit https://www.unraveldata.com.

The post Unravel Data Partners with Databricks for Lakehouse Observability and FinOps  appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/unravel-data-partners-with-databricks-for-lakehouse-observability-and-finops/feed/ 0
Announcing Unravel for Snowflake: Faster Time to Business Value in the Data Cloud https://www.unraveldata.com/resources/announcing-unravel-for-snowflake-faster-time-to-business-value-in-the-data-cloud/ https://www.unraveldata.com/resources/announcing-unravel-for-snowflake-faster-time-to-business-value-in-the-data-cloud/#respond Tue, 14 Nov 2023 12:00:51 +0000 https://www.unraveldata.com/?p=14285

Snowflake’s data cloud has expanded to become a top choice among organizations looking to leverage data and AI—including large language models (LLMs) and other types of generative AI—to deliver innovative new products to end users and […]

The post Announcing Unravel for Snowflake: Faster Time to Business Value in the Data Cloud appeared first on Unravel.

]]>

Snowflake’s data cloud has expanded to become a top choice among organizations looking to leverage data and AI, including large language models (LLMs) and other types of generative AI, to deliver innovative new products to end users and customers. However, the democratization of AI often leads to inefficient usage that results in a cost explosion and decreases the business value of Snowflake. The inefficient usage of Snowflake can occur at various levels. Below are just some examples.

  • Warehouses: A warehouse that is too large or has too many clusters for a given workload will be underutilized and incur a higher cost than necessary, whereas the opposite (the warehouse being too small or having too few clusters) will not do the work fast enough. 
  • Workload: The democratization of AI results in a rapidly increasing number of SQL users, many of whom focus on getting value out of the data and do not think about the performance and the cost aspects of running SQL on Snowflake. This often leads to costly practices such as:
    • SELECT * just to check out the schema
    • Running a long, timed-out query repeatedly without checking why it timed out
    • Expensive joins such as cartesian products
  • Data: No or poorly chosen cluster or partition keys lead to many table scans. Unused tables accumulate over time and by the time users notice the data explosion, they have a hard time knowing which tables may be deleted safely.

Snowflake, like other leading cloud data platforms, provides various compute options and automated features to help with the control of resource usage and spending. However, you need to understand the characteristics of your workload and other KPIs, and have domain expertise, to pick the right options and settings for these features, not to mention there’s no native support to identify bad practices and anti-patterns in SQL. Lastly, optimization is not a one-time exercise. As business needs evolve, so do workloads and data; optimizing for cost and performance becomes part of a continued governance of data and AI operations.
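
As a purely illustrative example of what spotting such anti-patterns can look like, the Python sketch below scans exported query-history records for two of the practices listed above: SELECT * queries and repeatedly re-run timed-out queries. The record layout and field names are assumptions, not a real Snowflake schema, and the heuristics are deliberately simplistic.

```python
# Hypothetical query-history records, however you export them from your warehouse.
history = [
    {"id": "q1", "text": "SELECT * FROM orders", "status": "SUCCESS", "elapsed_s": 4},
    {"id": "q2", "text": "SELECT o.id FROM orders o, items i", "status": "CANCELLED", "elapsed_s": 3600},
    {"id": "q3", "text": "SELECT o.id FROM orders o, items i", "status": "CANCELLED", "elapsed_s": 3600},
]

def flag_anti_patterns(records, timeout_s=3600):
    """Return (query_id, issue) pairs for two simple anti-patterns."""
    findings = []
    timed_out_texts = {}
    for r in records:
        text = r["text"].upper()
        if "SELECT *" in text:
            findings.append((r["id"], "SELECT *: read only the columns you need"))
        if r["status"] != "SUCCESS" and r["elapsed_s"] >= timeout_s:
            timed_out_texts[text] = timed_out_texts.get(text, 0) + 1
            if timed_out_texts[text] > 1:
                findings.append((r["id"], "re-running a timed-out query without checking why it timed out"))
    return findings

for query_id, issue in flag_anti_patterns(history):
    print(query_id, "->", issue)
```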

Introducing Unravel for Snowflake

Unravel for Snowflake is a purpose-built AI-driven data observability and FinOps solution that enables organizations to get maximum business value from Snowflake by achieving high cost efficiency and performance. It does so by providing deep, correlated visibility into cost, workload, and performance, with AI-powered insights and automated guardrails for continued optimization and governance. Expanding the existing portfolio of purpose-built AI solutions for Databricks, BigQuery, Amazon EMR, and Cloudera, Unravel for Snowflake is the latest data observability and FinOps product from Unravel Data.

The new Unravel for Snowflake features align with the FinOps phases of inform, optimize, operate:

INFORM

  • A “Health Check” that provides a top-down summary view of cost, usage, insights to improve inefficiencies, and projected annualized savings from applying these insights
  • A Cost 360 view that captures complete cost across compute, storage and data transfer, and shows chargeback and trends of cost and usage across warehouses, users, tags, and queries
  • Top K most expensive warehouses, users, queries
  • Detailed warehouse and query views with extensive KPIs
  • Side-by-side comparison of queries

OPTIMIZE

  • Warehouse insights 
  • SQL insights
  • Data and storage insights

OPERATE

  • OpenSearch-based alerts on query duration and credits
  • Alert customization: ability to create custom alerts

Let us first take a look at the Health Check feature that kick-starts the journey of cost and performance optimization.

Health Check for Cost and Performance Inefficiencies

Unravel for Snowflake Health Check

Dashboard-level summary of usage & costs, AI insights into inefficiencies, and projected savings.

The Health Check feature automatically analyzes the workload and cost over the past 15 days. It generates a top-down summary of the cost and usage during this period and, more important, shows insights to improve the inefficiencies for cost and performance, along with the projected annualized savings from applying these insights.

Unravel for Snowflake Most Expensive Queries report

See at a glance your most expensive “query signatures,” with AI-driven insights on reducing costs.

Users can easily spot the most impactful insights at the warehouse and query levels, and drill down to find out the details. They can also view the top 10 most expensive “query signatures,” or groups of similar queries. Lastly, it recommends alerting policies specific to your workload. 

Users can use the Health Check feature regularly to find new insights and their impact in savings. As the workloads evolve with new business use cases, new inefficiencies may arise and require continued monitoring and governance.

Uncover your Snowflake savings with a free Health Check report
Request report here

Digging Deep into Understanding Spending

Unravel for Snowflake Cost 360 for users & queries

Cost chargeback breakdown and trends by users & queries

Unravel also enables you to easily visualize and understand where money is spent and whether there are anomalies that you should investigate. The Cost 360 view provides cost breakdown and trends across warehouses, users, queries, and tags. It also shows top offenders by listing the most expensive warehouses, users, and query signatures, so that users can address them first.

Unravel for Snowflake Cost 360 for warehouses

Cost chargeback breakdown and trends by warehouses & tags.

Debugging and Optimizing Failed, Costly, and Slow Queries

Unravel for Snowflake insights into query cost & performance

Drill-down AI-driven insights and recommendations into query cost & performance.

Unravel captures extensive data and metadata about cost, workload, and data, and automatically applies AI to generate insights and recommendations for each query and warehouse. Users can filter queries based on status, insights, duration, etc., to find interesting queries, and drill down to look at query details, including the insights for cost and performance optimization. They can also see similar queries to a given query and do side-by-side comparison to spot the difference between two runs.
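Conceptually, a side-by-side comparison boils down to diffing the KPIs of two runs. The toy sketch below illustrates that idea with made-up metric names and values; it is not Unravel's data model.

```python
def compare_runs(run_a: dict, run_b: dict) -> None:
    """Print every metric that differs between two query runs."""
    for metric in sorted(set(run_a) | set(run_b)):
        a, b = run_a.get(metric), run_b.get(metric)
        if a != b:
            print(f"{metric:<20} {a!r:>14} -> {b!r:>14}")

# Illustrative KPIs for two runs of the "same" query
compare_runs(
    {"elapsed_s": 412, "bytes_scanned": 9.1e9, "partitions": 4200, "warehouse": "XL"},
    {"elapsed_s": 95, "bytes_scanned": 1.2e9, "partitions": 310, "warehouse": "L"},
)
```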

Get Started with Unravel for Snowflake

To conclude, Unravel supports a variety of FinOps use cases, from understanding cost and usage to optimizing inefficiencies and providing alerts for governance. Learn more by reading the Unravel for Snowflake docs and requesting a personalized Snowflake Health Check report.

 

Role: FinOps Practitioner
Scenario: Understand what we pay for Snowflake down to the user/app level in real time, and accurately forecast future spend with confidence.
Unravel benefits: Granular visibility at the warehouse, query, and user level enables FinOps practitioners to perform cost allocation, estimate annual data cloud application costs, and analyze cost drivers, break-even, and ROI.

Role: FinOps Practitioner / Engineering / Operations
Scenario: Identify the most impactful recommendations to optimize overall cost and performance.
Unravel benefits: AI-powered performance and cost optimization recommendations enable FinOps and data teams to rapidly upskill team members, implement cost efficiency SLAs, and optimize Snowflake pricing tier usage to maximize the company’s cloud data ROI.

Role: Engineering Lead / Product Owner
Scenario: Identify the most impactful recommendations to optimize the cost and performance of a warehouse.
Unravel benefits: AI-driven insights and recommendations enable product and data teams to improve warehouse utilization, boost SQL query performance, and leverage data clustering and partitioning to achieve cost efficiency SLAs and run more queries within the same warehouse budget.

Role: Engineering / Operations
Scenario: Live monitoring with alerts.
Unravel benefits: Live monitoring with alerts speeds mean time to repair (MTTR) and helps prevent outages before they happen.

Role: Data Engineer
Scenario: Debug a query and compare queries.
Unravel benefits: Automatic troubleshooting guides data teams directly to the source of a query failure, down to the line of code or SQL query, along with AI recommendations to fix the issue and prevent it from recurring.

Role: Data Engineer
Scenario: Identify expensive, inefficient, or failed queries.
Unravel benefits: Proactively improve cost efficiency, performance, and reliability before deploying queries into production. Compare two queries side by side to find any metrics that differ between the two runs, even if the queries are different.

The post Announcing Unravel for Snowflake: Faster Time to Business Value in the Data Cloud appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/announcing-unravel-for-snowflake-faster-time-to-business-value-in-the-data-cloud/feed/ 0
Unravel Data Launches Cloud Data Cost Optimization for Snowflake https://www.unraveldata.com/resources/unravel-data-launches-cloud-data-cost-optimization-for-snowflake/ https://www.unraveldata.com/resources/unravel-data-launches-cloud-data-cost-optimization-for-snowflake/#respond Tue, 14 Nov 2023 12:00:38 +0000 https://www.unraveldata.com/?p=14340

Efficiency Recommendations for Infrastructure, Configuration, and Code PALO ALTO, CA — November 14, 2023 – Unravel Data, the first AI-enabled data observability and FinOps platform built to address the speed and scale of modern data platforms, […]

The post Unravel Data Launches Cloud Data Cost Optimization for Snowflake appeared first on Unravel.

]]>

Efficiency Recommendations for Infrastructure, Configuration, and Code

PALO ALTO, CA — November 14, 2023 – Unravel Data, the first AI-enabled data observability and FinOps platform built to address the speed and scale of modern data platforms, today announced the release of Unravel for Snowflake. By employing AI that is purpose-built for the Snowflake technology stack, Unravel puts cloud data cost management into the hands of Snowflake customers, providing them with granular insights into specific cost drivers as well as AI-driven cost and performance recommendations for optimizing SQL queries and data applications. Unravel for Snowflake is the latest data observability and FinOps product from Unravel Data, adding to the portfolio of purpose-built AI solutions for Databricks, EMR, Cloudera, and BigQuery.

Today, companies are looking to AI to provide them with a competitive advantage, which is driving an exponential increase in data usage and workloads, use cases, pipelines, and generative AI/LLM models. In turn, companies are facing even greater problems with broken pipelines and inefficient data processing, slowing time-to-business value and adding to exploding cloud data bills. Unfortunately, most companies lack visibility into their data cloud spend or ways to optimize data pipelines/workloads to lower spend, speed innovation, and mitigate problems. 

Unravel’s purpose-built AI for Snowflake draws on deep, granular observability to deliver cost optimization recommendations for warehouses and SQL, covering warehouse provisioning, runtime, auto-scaling efficiencies, and more. With Unravel, Snowflake users can see real-time cost usage by query, user, department, and warehouse, and set up customized dashboards, alerts, and guardrails to enable accurate, granular cost allocation, trend visualization, and forecasting.

“As companies double down on AI efforts, we can expect to see more wasted data cloud spend. Costs are incurred not only with infrastructure but with consumption, as most AI pipelines are created in ways that drive up unnecessary cloud data costs,” said Kunal Agarwal, CEO and co-founder, Unravel Data. “Data engineering and architecture teams need an early warning system to alert them to out-of-control spending, an automated way to pinpoint the source of performance issues and cost overruns, and AI-driven recommendations to optimize code in ways that mitigate unnecessary costs, speed new development, and eliminate data pipeline problems.”

At the core of Unravel Data’s platform is its AI-powered Insights Engine, which has been trained to understand all the intricacies and complexities of modern data platforms and the supporting infrastructure. The Insights Engine continuously ingests and interprets millions of ongoing data streams to provide real-time insights into application and system performance, along with recommendations to optimize costs, including right-sizing instances and code-level changes for performance and financial efficiency. When combined with Unravel’s automated guardrails and alerts, the Insights Engine enables organizations to achieve data cloud efficiency at scale.

“Our latest research shows that the adopters of cloud data warehouses struggle with data pipeline complexity, lack of staff/expertise, and an inability to predict workloads,” says Kevin Petrie, VP of Research at The Eckerson Group. “FinOps platforms for cloud data analytics, such as Unravel, provide the granular visibility that stakeholders need to predict and monitor spending. This makes it easier for companies to optimize workloads, change user behavior, and get a handle on governing cloud costs.”

Unravel for Snowflake includes additional features such as:

  • Visibility for cost allocation with chargeback/showback reports 
  • Warehouse-level insights and recommendations relating to warehouse consolidation and underutilization efficiencies
  • Compute + storage unit cost reporting with average cost per project, query, and user over time
  • SQL-related insights and recommendations for optimizing queries by filters, joins, projection inefficiencies, anti-patterns, and more to improve query efficiency and increase capacity so that more users and requests can be served at the same spend
  • Dashboard customization for at-a-glance summaries and drill-down insights for spend, performance, and unit costs
  • Alert customization using OpenSearch-based alerts beyond Snowflake’s out-of-the-box alerts to enable early warnings of resource usage spikes before they hit the cloud bill

To learn more about how we are helping Snowflake customers optimize their data cloud costs, and to request a complimentary "Health Check" (projected annual cost savings for your Snowflake warehouses using Unravel’s optimization insights, plus recommended actions to start saving), visit Unravel for Snowflake.

About Unravel Data

Unravel Data radically transforms the way businesses understand and optimize the performance and cost of their modern data applications – and the complex data pipelines that power those applications. Unravel’s market-leading data observability and FinOps platform, with purpose-built AI for each data platform, provides the actionable recommendations needed to make data and AI pipelines cost- and performance-efficient. A recent winner of Best Data Tool & Platform of 2023 in the annual SIIA CODiE Awards, Unravel Data is relied on by some of the world’s most recognized brands, such as Maersk, Mastercard, Equifax, and Deutsche Bank, to unlock data-driven insights and deliver new innovations to market. To learn more, visit https://www.unraveldata.com.

Media Contact
Blair Moreland
ZAG Communications for Unravel Data
unraveldata@zagcommunications.com 

The post Unravel Data Launches Cloud Data Cost Optimization for Snowflake appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/unravel-data-launches-cloud-data-cost-optimization-for-snowflake/feed/ 0
Why Optimizing Cost Is Crucial to AI/ML Success https://www.unraveldata.com/resources/why-optimizing-cost-is-crucial-to-aiml-success/ https://www.unraveldata.com/resources/why-optimizing-cost-is-crucial-to-aiml-success/#respond Fri, 03 Nov 2023 13:59:09 +0000 https://www.unraveldata.com/?p=14121 Transitioning big data workloads to the cloud

This article was originally published by the Forbes Technology Council, September 13, 2023. When gold was discovered in California in 1848, it triggered one of the largest migrations in U.S. history, accelerated a transportation revolution and helped […]

The post Why Optimizing Cost Is Crucial to AI/ML Success appeared first on Unravel.

]]>
Transitioning big data workloads to the cloud

This article was originally published by the Forbes Technology Council, September 13, 2023.

When gold was discovered in California in 1848, it triggered one of the largest migrations in U.S. history, accelerated a transportation revolution and helped revitalize the U.S. economy. There’s another kind of Gold Rush happening today: a mad dash to invest in artificial intelligence (AI) and machine learning (ML).

The speed at which AI-related technologies have been embraced by businesses means that companies can’t afford to sit on the sidelines. Companies also can’t afford to invest in models that fail to live up to their promises.

But AI comes with a cost. McKinsey estimates that developing a single generative AI model costs up to $200 million, customizing an existing model with internal data costs up to $10 million, and deploying an off-the-shelf solution costs up to $2 million.

The volume of generative AI/ML workloads—and data pipelines that power them—has also skyrocketed at an exponential rate as various departments run differing use cases with this transformational technology. Bloomberg Intelligence reports that the generative AI market is poised to explode, growing to $1.3 trillion over the next 10 years from a market size of just $40 billion in 2022. And every job, every workload, and every data pipeline costs money.

Because of the cost factor, winning the AI race isn’t just about getting there first; it’s about making the best use of resources to achieve maximum business goals.

The Snowball Effect

There was a time when IT teams were the only ones utilizing AI/ML models. Now, teams across the enterprise—from marketing to risk to finance to product and supply chain—are all utilizing AI in some capacity, many of whom lack the training and expertise to run these models efficiently.

AI/ML models process exponentially more data, requiring massive amounts of cloud compute and storage resources. That makes them expensive: A single training run for GPT-3 costs $12 million.

Enterprises today may have upwards of tens—even hundreds—of thousands of pipelines running at any given time. Running sub-optimized pipelines in the cloud often causes costs to quickly spin out of control.

The most obvious culprit is oversized infrastructure, where users are simply guessing how much compute resources they need rather than basing it on actual usage requirements. Same thing with storage costs, where teams may be using more expensive options than necessary for huge amounts of data that they rarely use.

But data quality and inefficient code often cause costs to soar even higher: data schema, data skew and load imbalances, idle time, and a rash of other code-related issues that make data pipelines take longer to run than necessary—or even fail outright.

Like a snowball gathering size as it rolls down a mountain, the more data pipelines you have running, the more problems, headaches, and, ultimately, costs you’re likely to have.

And it’s not just cloud costs that need to be considered. Modern data pipelines and AI workloads are complex. It takes a tremendous amount of troubleshooting expertise just to keep models working and meeting business SLAs—and that doesn’t factor in the costs of downtime or brand damage. For example, if a bank’s fraud detection app goes down for even a few minutes, how much would that cost the company?

Optimized Data Workloads on the Cloud

Optimizing cloud data costs is a business strategy that ensures a company’s resources are being allocated appropriately and in the most cost-efficient manner. It’s fundamental to the success of an AI-driven company as it ensures that cloud data budgets are being used effectively and providing maximum ROI.

But business and IT leaders need to first understand exactly where resources are being used efficiently and where waste is occurring. To do so, keep in mind the following when developing a cloud data cost optimization strategy.

• Reuse building blocks. Everything costs money on the cloud. Every file you store, every record you access, every piece of code you run incurs a cost. Data processing can usually be broken down into a series of steps, and a smart data team should be able to reuse those steps for other processing. For example, code written to move data about a company’s sales records could be reused by the pricing and product teams rather than both teams building their own code separately and incurring twice the cost.

• Truly leverage cloud capabilities. The cloud allows you to quickly adjust the resources needed to process data workloads. Unfortunately, too many companies operate under "just in case" scenarios that lead to allocating more resources than actually needed. By understanding usage patterns and leveraging the cloud’s auto-scaling capabilities, companies can dynamically control how they scale up and, more importantly, create guardrails to cap the maximum.

• Analyze compute and storage spend by job and by user. The ability to really dig down to the granular details of who is spending what on which project will likely yield a few surprises. You might find that the most expensive jobs are not the ones that are making your company millions. You may find that you’re paying way more for exploration than for data models that will be put to good use. Or, you may find that the same group of users are responsible for the jobs with the biggest spend and the lowest ROI (in which case, it might be time to tighten up on some processes).
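A minimal sketch of that kind of analysis is shown below, assuming you have exported per-job usage records into a table with hypothetical user, job, and cost columns; real billing exports will have different schemas, but the aggregation is the same.

```python
import pandas as pd

# Hypothetical per-job usage export; real column names depend on your
# cloud provider's billing/usage data.
usage = pd.DataFrame([
    {"user": "ana", "job": "fraud_model_train", "cost": 1250.0},
    {"user": "ana", "job": "adhoc_exploration", "cost": 3400.0},
    {"user": "raj", "job": "sales_etl",         "cost": 610.0},
    {"user": "raj", "job": "adhoc_exploration", "cost": 2875.0},
])

by_job_and_user = (
    usage.groupby(["job", "user"], as_index=False)["cost"]
    .sum()
    .sort_values("cost", ascending=False)
)
print(by_job_and_user)  # most expensive (job, user) pairs first
```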

Given the data demands that generative AI models and use cases place on a company, business and IT leaders need to have a deep understanding of what’s going on under the proverbial hood. As generative AI evolves, business leaders will need to address new challenges. Keeping cloud costs under control shouldn’t be one of them.

The post Why Optimizing Cost Is Crucial to AI/ML Success appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/why-optimizing-cost-is-crucial-to-aiml-success/feed/ 0
Overcoming Friction & Harnessing the Power of Unravel: Try It for Free https://www.unraveldata.com/resources/overcoming-friction-harnessing-the-power-of-unravel-try-it-for-free/ https://www.unraveldata.com/resources/overcoming-friction-harnessing-the-power-of-unravel-try-it-for-free/#respond Wed, 11 Oct 2023 13:52:00 +0000 https://www.unraveldata.com/?p=13911

Overview In today’s digital landscape, data-driven decisions form the crux of successful business strategies. However, the path to harnessing data’s full potential is strewn with challenges. Let’s delve into the hurdles organizations face and how Unravel […]

The post Overcoming Friction & Harnessing the Power of Unravel: Try It for Free appeared first on Unravel.

]]>

Overview

In today’s digital landscape, data-driven decisions form the crux of successful business strategies. However, the path to harnessing data’s full potential is strewn with challenges. Let’s delve into the hurdles organizations face and how Unravel is the key to unlocking seamless data operations.

The Roadblocks in the Fast Lane of Data Operations

In today’s data-driven landscape, organizations grapple with erratic spending, cloud constraints, AI complexities, and prolonged MTTR, urgently seeking solutions to navigate these challenges efficiently. The four most common roadblocks are:

  • Data Spend Forecasting: Imagine a roller coaster with unpredictable highs and lows. That’s how most organizations view their data spend forecasting. Such unpredictability wreaks havoc on financial planning, making operational consistency a challenge.
  • Constraints in Adding Data Workloads: Imagine tying an anchor to a speedboat. That’s how the constraints feel when trying to adopt cloud data solutions, holding back progress and limiting agility.
  • Surge in AI Model Complexity: AI’s evolutionary pace is exponential. As it grows, so do the intricacies surrounding data volume and pipelines, which strain budgetary limitations.
  • The MTTR Bottleneck: The multifaceted nature of modern tech stacks means longer Mean Time to Repair (MTTR). This slows down processes, consumes valuable resources, and stalls innovation.

By acting as a comprehensive data observability and FinOps solution, Unravel Data empowers businesses to move past the challenges and frictions that typically hinder data operations, ensuring smoother, more efficient data-driven processes. Here’s how Unravel Data aids in navigating the roadblocks in the high-speed lane of data operations:

  • Predictive Data Spend Forecasting: With its advanced analytics, Unravel Data can provide insights into data consumption patterns, helping businesses forecast their data spending more accurately. This eliminates the roller coaster of unpredictable costs.
  • Simplifying Data Workloads: Unravel Data optimizes and automates workload management. Instead of being anchored down by the weight of complex data tasks, businesses can efficiently run and scale their data processes in the cloud.
  • Managing AI Model Complexity: Unravel offers visibility and insights into AI data pipelines. Analyzing and optimizing these pipelines ensure that growing intricacies do not overwhelm resources or budgets.
  • Reducing MTTR: By providing a clear view of the entire tech stack and pinpointing bottlenecks or issues, Unravel Data significantly reduces Mean Time to Repair. With its actionable insights, teams can address problems faster, reducing downtime and ensuring smooth data operations.
  • Streamlining Data Pipelines: Unravel Data offers tools to diagnose and improve data pipeline performance. This ensures that even as data grows in volume and complexity, pipelines remain efficient and agile.
  • Efficiency and ROI: With its clear insights into resource consumption and requirements, Unravel Data helps businesses run 50% more workloads in their existing Databricks environments, ensuring they only pay for what they need, reducing wastage and excess expenditure.

The Skyrocketing Growth of Cloud Data Management

As the digital realm expands, cloud data management usage is soaring, with data services accounting for a significant chunk. According to IDC, the public cloud IaaS and PaaS market is projected to reach $400 billion by 2025, growing at a 28.8% CAGR from 2021 to 2025. Some highlights are:

  • Data management and application development account for 39% and 20% of the market, respectively, and are the main workloads backed by PaaS solutions, capturing a major share of its revenue.
  • In IaaS revenue, IT infrastructure leads with 25%, trailed by business applications (21%) and data management (20%).
  • Unstructured data analytics and media streaming are predicted to be the top-growing segments with CAGRs of 41.9% and 41.2%, respectively.

Unravel provides a comprehensive solution to address the growth associated with cloud data management. Here’s how:

  • Visibility and Transparency: Unravel offers in-depth insights into your cloud operations, allowing you to understand where and how costs are accruing, ensuring no hidden fees or unnoticed inefficiencies.
  • Optimization Tools: Through its suite of analytics and AI-driven tools, Unravel pinpoints inefficiencies, recommends optimizations, and automates the scaling of resources to ensure you’re only using (and paying for) what you need.
  • Forecasting: With predictive analytics, Unravel provides forecasts of data usage and associated costs, enabling proactive budgeting and financial planning.
  • Workload Management: Unravel ensures that data workloads run efficiently and without wastage, reducing both computational costs and storage overhead.
  • Performance Tuning: By optimizing query performance and data storage strategies, Unravel ensures faster results using fewer resources, translating to 50% more workloads.
  • Monitoring and Alerts: Real-time monitoring paired with intelligent alerts ensures that any resource-intensive operations or anomalies are flagged immediately, allowing for quick intervention and rectification.

By employing these strategies and tools, Unravel acts as a financial safeguard for businesses, ensuring that the ever-growing cloud data bill remains predictable, manageable, and optimized for efficiency.

The Tightrope Walk of Efficiency Tuning and Talent

Modern enterprises hinge on data and AI, but shrinking budgets and talent gaps threaten them. Gartner pinpoints overprovisioning and skills shortages as major roadblocks, while Google and IDC underscore the high demand for data analytics skills and the untapped potential of unstructured data. Here are some of the problems modern organizations face:

  • Production environments are statically overprovisioned and therefore underutilized. On-premises, 30% utilization is common, but it’s all capital expenditures (capex), and as long as it’s within budget, no one has traditionally cared about the waste. However, in the cloud, you pay for that excess resource monthly, forcing you to confront the ongoing cost of the waste. – Gartner
  • The cloud skills gap has reached a crisis level in many organizations – Gartner
  • Revenue creation through digital transformation requires talent engagement that is currently scarce and difficult to acquire and maintain. – Gartner
  • Lack of skills remains the biggest barrier to infrastructure modernization initiatives, with many organizations finding they cannot hire outside talent to fill these skills gaps. IT organizations will not succeed unless they prioritize organic skills growth. – Gartner
  • Data analytics skills are in demand across industries as businesses of all types around the world recognize that strong analytics improve business performance. – Google via Coursera

Unravel Data addresses the delicate balancing act of budget and talent in several strategic ways:

  • Operational Efficiency: Purpose-built AI provides actionable insights into data operations across Databricks, Spark, EMR, BigQuery, Snowflake, etc. Unravel Data reduces the need for trial-and-error and time-consuming manual interventions. At the core of Unravel’s data observability platform is our AI-powered Insights Engine. This sophisticated Artificial Intelligence engine incorporates AI techniques, algorithms, and tools to process and analyze vast amounts of data, learn from patterns, and make predictions or decisions based on that learning. This not only improves operational efficiency but also ensures that talented personnel spend their time innovating rather than on routine tasks.
  • Skills Gap Bridging: The platform’s intuitive interface and AI-driven insights mean that even those without deep expertise in specific data technologies can navigate, understand, and optimize complex data ecosystems. This eases the pressure to hire or train ultra-specialized talent.
  • Predictive Analysis: With Unravel’s ability to predict potential bottlenecks or inefficiencies, teams can proactively address issues, leading to more efficient budget allocation and resource utilization.
  • Cost Insights: Unravel provides detailed insights into the efficiency of various data operations, allowing organizations to make informed decisions on where to invest and where to cut back.
  • Automated Optimization: By automating many of the tasks traditionally performed by data engineers, like performance tuning or troubleshooting, Unravel ensures teams can do more with less, optimizing both budget and talent.
  • Talent Focus Shift: With mundane tasks automated and insights available at a glance, skilled personnel can focus on higher-value activities, like data innovation, analytics, and strategic projects.

By enhancing efficiency, providing clarity, and streamlining operations, Unravel Data ensures that organizations can get more from their existing budgets while maximizing the potential of their talent, turning the tightrope walk into a more stable journey.

The Intricacies of Data-Centric Organizations

Data-centric organizations grapple with the complexities of managing vast and fast-moving data in the digital age. Ensuring data accuracy, security, and compliance, while integrating varied sources, is challenging. They must balance data accessibility with protecting sensitive information, all while adapting to evolving technologies, addressing talent gaps, and extracting actionable insights from their data reservoirs. Here is some relevant research on the topic:

  • “Data is foundational to AI” yet “unstructured data remains largely untapped.” – IDC
  • Even as organizations rush to adopt data-centric operations, challenges persist. For instance, manufacturing data projects often hit roadblocks due to outdated legacy technology, as observed by the World Economic Forum.
  • Generative AI is supported by large language models (LLMs), which require powerful and highly scalable computing capabilities to process data in real-time. – Gartner

Unravel Data provides a beacon for data-centric organizations amid complex challenges. Offering a holistic view of data operations, it simplifies management using AI-driven tools. It ensures data security, accessibility, and optimized performance. With its intuitive interface, Unravel bridges talent gaps and navigates the data maze, turning complexities into actionable insights.

Embarking on the Unravel Journey: Your Step-By-Step Guide

  • Beginning your data journey with Unravel is as easy as 1-2-3. We guide you through the sign-up process, ensuring a smooth and hassle-free setup.
  • Unravel for Databricks page

Level Up with Unravel Premium

Ready for an enhanced data experience? Unravel’s premium account offers a plethora of advanced features that the free version can’t match. Investing in this upgrade isn’t just about more tools; it’s about supercharging your data operations and ROI.

Wrap-Up

Although rising demands on the modern data landscape are challenging, they are not insurmountable. With tools like Unravel, organizations can navigate these complexities, ensuring that data remains a catalyst for growth, not a hurdle. Dive into the Unravel experience and redefine your data journey today.

Unravel is a business’s performance sentinel in the cloud realm, proactively ensuring that burgeoning cloud data expenses are not only predictable and manageable but also primed for significant cost savings. Unravel Data transforms the precarious balance of budget and talent into a streamlined, efficient journey for organizations. Unravel Data illuminates the path for data-centric organizations, streamlining operations with AI tools, ensuring data security, and optimizing performance. Its intuitive interface simplifies complex data landscapes, bridging talent gaps and converting challenges into actionable insights.

The post Overcoming Friction & Harnessing the Power of Unravel: Try It for Free appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/overcoming-friction-harnessing-the-power-of-unravel-try-it-for-free/feed/ 0
Databricks Cost Efficiency: 5 Reasons Why Unravel Free Observability vs. Databricks Free Observability https://www.unraveldata.com/resources/databricks-cost-efficiency-5-reasons-why-unravel-free-observability-vs-databricks-free-observability/ https://www.unraveldata.com/resources/databricks-cost-efficiency-5-reasons-why-unravel-free-observability-vs-databricks-free-observability/#respond Tue, 26 Sep 2023 13:33:35 +0000 https://www.unraveldata.com/?p=13920 Transitioning big data workloads to the cloud

“Data is the new oil, and analytics is the combustion engine.” – Peter Sondergaard Cloud data analytics is the key to maximizing value from your data. The lakehouse has emerged as a flexible and efficient architecture, […]

The post Databricks Cost Efficiency: 5 Reasons Why Unravel Free Observability vs. Databricks Free Observability appeared first on Unravel.

]]>
Transitioning big data workloads to the cloud

“Data is the new oil, and analytics is the combustion engine.” – Peter Sondergaard

Cloud data analytics is the key to maximizing value from your data. The lakehouse has emerged as a flexible and efficient architecture, and Databricks has emerged as a popular choice. However, data lakehouse processing volumes can fluctuate, leading to unpredictable surges in cloud data spending that impact budgeting and profitability. Executives want to make sure they are getting the most from their lakehouse investments and not overspending.

Implementing a proactive data observability and FinOps approach early in your lakehouse journey helps ensure you achieve your business objectives and bring predictability to your financial planning. Choosing the right lakehouse observability and FinOps tool sets your team up for success. Since the goal is efficiency, starting with free tools makes sense. Two free options stand out:

  • Overwatch – the open source Databricks observability tool
  • Unravel Standard – the free version of Unravel’s data observability and FinOps platform

Below are 5 reasons to choose Unravel free observability vs. Databricks free observability:

Reason #1: Complete observability

Many organizations take a do-it-yourself approach, building piecemeal observability solutions in-house by cobbling together a variety of data sources using open source tools. The problem is that it takes months or even years to get something usable up and running. Unravel’s data observability and FinOps platform helps you get results fast.

Unravel provides a 360° out-of-the-box solution

Unravel provides a holistic view of your Databricks estate, reducing time to value. Gain deep insights into cluster performance, job execution, resource utilization, and cost drivers through comprehensive lakehouse observability, with detailed visibility into the performance of every Databricks cluster. 

Unravel for Databricks insights overview dashboard

Insights overview dashboard in Unravel Standard

Ensure no blind spots in your analysis by leveraging Unravel’s end-to-end visibility across all aspects of your Databricks environment. View your lakehouse landscape at a glance with the Insights Overview dashboard. You can see the overall health of your Databricks estate, including the number of clusters that are over- or under-provisioned, the total number of inefficient and failed apps, and other summary statistics to guide your efforts to optimize your lakehouse towards better performance and cost efficiency.

Purpose-built correlation

Unravel’s purpose-built correlation models help you identify inefficient jobs at code, data layout/partitioning, and infrastructure levels. Databricks logs, metrics, events, traces, and source code are automatically evaluated to simplify root cause analysis and issue resolution. You can dive deep into the execution details of your Databricks jobs, track the progress of each job, and see resource usage details. This helps you identify long-running and resource-intensive jobs that might be impacting the overall performance and efficiency of your lakehouse estate.

End-to-end visibility

Visual summaries provide a way to look across all the jobs and clusters in your Databricks workspace. No need to click around your Databricks workspace looking for issues, run queries, or pull details into a spreadsheet to summarize results. Unravel helps you easily see all the details in one place. 

Reason #2: Real-time visibility

A single errant job or underutilized cluster can derail your efficiency goals and delay critical data pipelines. The ability to see job and cluster performance and efficiency in real time provides an early warning system.

Live updates for running jobs and clusters

React promptly to any anomalies or bottlenecks in your clusters and jobs to ensure efficiency. Unravel’s real-time insights allow you to catch long-running jobs before they impact pipeline performance or consume unnecessary resources.

Unravel for Databricks workspace trends dashboard

Workspace Trends dashboard in Unravel Standard

See DBU usage and cluster session trends

By understanding the real-time performance of your Databricks workloads, you can identify areas where improvements can be made to improve efficiency without sacrificing performance. Leverage Unravel’s real-time insights to make data-driven decisions for better resource allocation and workload management. 

Drill down to see DBU usage and tasks for a specific day

Quickly find resource consumption outliers by day to understand how usage patterns are driving costs. Unravel helps you identify opportunities to reduce waste and increase cluster utilization. By having visibility into the real-time cost implications of your jobs and clusters, you can make faster decisions to boost performance and improve business results.

User-level reporting for showback/chargeback

Granular reporting to the user and job level helps you produce accurate and timely showback and chargeback reports. With Unravel’s real-time visibility into your Databricks workloads, you have the power to see which teams are consuming the most resources and proactively manage costs to achieve efficient operations. Reacting quickly to anomalies and leveraging real-time, user-level insights enables better decision-making for resource allocation and utilization. Unravel enables central data platform and operations teams to provide a reliable, single source of truth for showback and chargeback reporting.
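If you wanted to approximate a user-level showback report by hand inside a Databricks notebook, the system billing tables are one possible starting point. Treat the sketch below strictly as an assumption-laden illustration: the table and column names (system.billing.usage, identity_metadata.run_as, usage_quantity) vary by Databricks release and may not match your workspace.

```python
# Run inside a Databricks notebook, where `spark` and `display` are predefined.
# NOTE: system.billing.usage and its columns are assumptions in this sketch;
# check the system table schema in your own workspace before relying on it.
showback = spark.sql("""
    SELECT usage_date,
           identity_metadata.run_as AS run_as_user,
           sku_name,
           SUM(usage_quantity)      AS dbus
    FROM system.billing.usage
    WHERE usage_date >= date_sub(current_date(), 30)
    GROUP BY usage_date, identity_metadata.run_as, sku_name
""")
display(showback.orderBy("usage_date"))
```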

Try Unravel for free
Sign up for free account

Reason #3: Automated Cluster Discovery

You can’t fix problems you can’t see. It all begins with getting visibility across all your workspace clusters and jobs. Unravel automates this process to save you time and ensure you don’t miss anything.

Easily connect to all of your clusters in the workspace

Simplify the process of connecting to your Databricks clusters with Unravel’s automated cluster discovery. This streamlines the observability and management of your compute clusters, so you can focus on resource optimization to boost productivity. Unravel lets you easily see all of your clusters without adding dependencies.
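For comparison, a do-it-yourself starting point, and not what Unravel does under the hood, would be to poll the Databricks Clusters API and flag clusters that look risky, such as running clusters with auto-termination disabled. The host URL and token below are placeholders.

```python
import requests

DATABRICKS_HOST = "https://<your-workspace-url>"  # placeholder
TOKEN = "<your-personal-access-token>"            # placeholder

resp = requests.get(
    f"{DATABRICKS_HOST}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
resp.raise_for_status()

for cluster in resp.json().get("clusters", []):
    running = cluster.get("state") == "RUNNING"
    never_terminates = cluster.get("autotermination_minutes", 0) == 0
    if running and never_terminates:
        # A running cluster with auto-termination disabled is a common source of waste
        print(f"Review: {cluster['cluster_name']} ({cluster['cluster_id']})")
```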

Unravel for Databricks cluster dashboard

Compute dashboard shows clusters in Unravel Standard

Quickly discover clusters with errors, delays, and failures

Unravel lets you see clusters grouped by event type (e.g., Contended Driver, High I/O, Data Skew, Node Downsizing). This helps you quickly identify patterns in compute clusters that are not being fully utilized. This eliminates the need for manual monitoring and analysis, saving you time and effort.

View cluster resource trends

Unravel’s intelligent automation continuously monitors cluster activity and resource utilization over time. This helps you spot changing workload requirements and helps ensure optimal performance while keeping costs in check by avoiding overprovisioning or underutilization to make the most of your cloud infrastructure investments.

Reason #4: Ease of Entry

Open source and DIY solutions typically have a steep learning curve to ensure everything is correctly configured and connected. Frequent changes and updates add to your team’s already full load. Unravel offers a simpler approach.

Unravel is quick to set up and get started with minimal learning curve

Integrating Unravel into your existing Databricks environment is a breeze. No complex setup or configuration required. With Unravel, you can seamlessly bring data observability and FinOps capabilities to your data lakehouse estate without breaking a sweat.

Unravel SaaS makes setup and configuration a breeze

But what exactly does this mean for you? It means that you can focus on what matters most—getting the most out of your Databricks platform while keeping costs in check. Unravel’s data observability and FinOps capabilities are provided as a fully managed service, giving you the power to optimize performance and resources, spot efficiency opportunities, and ensure smooth operation of your data pipelines and data applications.

No DIY coding or development required

Unravel is trusted by large enterprise customers across many industries for its ease of integration into their Databricks environments. Whether you’re a small team or an enterprise organization, Unravel’s data observability and FinOps platform is designed to meet your specific needs and use cases without the need to build anything from scratch.

Try Unravel for free
Sign up for free account

Reason #5: Avoid lock-in

A lakehouse architecture gives you flexibility. As your data analytics and data processing needs grow and evolve, you may choose additional analytics tools to complement your cloud data estate. Your data observability and FinOps tool should support those tools as well.

Unravel is purpose-built for Databricks, Snowflake, BigQuery, and other modern data stacks

Each cloud data platform is different and requires a deep understanding of its inner workings in order to provide the visibility you need to run efficiently. Unravel is designed from the ground up to help you get the most out of each modern data platform, leveraging the most relevant and valuable metadata sources and correlating them all into a unified view of your data estate.

No need to deploy a separate tool as your observability needs grow

Unravel provides a consistent approach to data observability and FinOps to minimize time spent deploying and learning new tools. Data teams spend less time upskilling and more time getting valuable insights.

Independent reporting for FinOps

Data analytics is the fastest growing segment of cloud computing as organizations invest in new use cases such as business intelligence (BI), AI and machine learning. Organizations are adopting FinOps practices to ensure transparency in resource allocation, usage, and reporting. Unravel provides an independent perspective of lakehouse utilization and efficiency to ensure objective, data-driven decisions.

Compare Unravel and Databricks free observability

Databricks vs. Unravel free observability

Get started today

Achieve predictable spend and gain valuable insights into your Databricks usage. Get started today with Unravel’s complete data observability and FinOps platform for Databricks that provides real-time visibility, automated cluster discovery, ease of entry, and independent analysis to help you take control of your costs while maximizing the value of your Databricks investments. Create your free Unravel account today.

Try Unravel for free
Sign up for free account

Unravel for Databricks FAQ 

Can I use Unravel’s data observability platform with other cloud providers?

Yes. Unravel’s data observability platform is designed to work seamlessly across multiple cloud providers including AWS, Azure, and Google Cloud. So regardless of which cloud provider you choose for your data processing needs, Unravel can help you optimize costs and gain valuable insights.

How does automated cluster discovery help in managing Databricks costs?

Automated cluster discovery provided by solutions like Unravel enables you to easily identify underutilized or idle clusters within your Databricks environment. By identifying these clusters, you can make informed decisions about resource allocation and ensure that you are only paying for what you actually need.

Does Unravel offer real-time visibility into my Databricks usage?

Yes. With Unravel’s real-time visibility feature, you can monitor your Databricks usage in real time. This allows you to quickly identify any anomalies or issues that may impact cost efficiency and take proactive measures to address them.

Can Unravel help me optimize my Databricks costs for different workloads?

Yes. Unravel’s data observability platform provides comprehensive insights into the performance and cost of various Databricks workloads. By analyzing this data, you can identify areas for optimization and make informed decisions to ensure cost efficiency across different workloads.

How easy is it to get started with Unravel’s data observability platform?

Getting started with Unravel is quick and easy. Simply sign up for a free account on our website, connect your Databricks environment, and start gaining valuable insights into your usage and costs. Our intuitive interface and user-friendly features make it simple for anyone to get started without any hassle.

The post Databricks Cost Efficiency: 5 Reasons Why Unravel Free Observability vs. Databricks Free Observability appeared first on Unravel.

]]>
https://www.unraveldata.com/resources/databricks-cost-efficiency-5-reasons-why-unravel-free-observability-vs-databricks-free-observability/feed/ 0
5 Key Ingredients to Accurate Cloud Data Budget Forecasting https://www.unraveldata.com/resources/5-key-ingredients-to-accurate-cloud-data-budget-forecasting/ https://www.unraveldata.com/resources/5-key-ingredients-to-accurate-cloud-data-budget-forecasting/#respond Mon, 28 Aug 2023 20:14:45 +0000 https://www.unraveldata.com/?p=13685 Data Pipeline Abstract

Hey there! Have you ever found yourself scratching your head over unpredictable cloud data costs? It’s no secret that accurately forecasting cloud data spend can be a real headache. Fluctuating costs make it challenging to plan […]

The post 5 Key Ingredients to Accurate Cloud Data Budget Forecasting appeared first on Unravel.

]]>
Data Pipeline Abstract

Hey there! Have you ever found yourself scratching your head over unpredictable cloud data costs? It’s no secret that accurately forecasting cloud data spend can be a real headache. Fluctuating costs make it challenging to plan and allocate resources effectively, leaving businesses vulnerable to budget overruns and financial challenges. But don’t worry, we’ve got you covered!

Uncontrolled fluctuations in cloud data spend can hinder growth and profitability, disrupting the smooth sailing of your business operations. That’s why it’s crucial to gain control over your cloud data workload forecasts. By understanding the changes in your cloud data spend and having a clear picture of your billing data, you can make informed decisions that align with your company’s goals.

We’ll discuss practical strategies to improve forecast accuracy, identify data pipelines and analytics workloads that are above or below budget, and enhance accountability across different business units.

So let’s dive right in and discover how you can steer your business towards cost-effective cloud management!

Unanticipated cloud data spend

Last year, over $16 billion was wasted in cloud spend. Data management is the largest and fastest-growing category of cloud spending, representing 39% of the typical cloud bill. Gartner noted that in 2022, 98% of the overall database management system (DBMS) market growth came from cloud-based database platforms. Cloud data costs are often the most difficult to predict due to fluctuating workloads. 82% of 157 data management professionals surveyed by Forrester cited difficulty predicting data-related cloud costs. On top of the fluctuations that are inherent with data workloads, a lack of visibility into cloud data spend makes it challenging to manage budgets effectively.

  • Fluctuating workloads: Data processing and storage costs are driven by the amount of data stored and analyzed. With varying workloads, it becomes challenging to accurately estimate the required data processing and storage costs. This unpredictability can result in budget overruns that affect 60% of infrastructure and operations (I&O) leaders.
  • Unexpected expenses: Streaming data, large amounts of unstructured and semi-structured data, and shared slot pool consumption can quickly drive up cloud data costs. These factors contribute to unforeseen spikes in usage that may catch organizations off guard, leading to unexpected expenses on their cloud bills.
  • Lack of visibility: Without granular visibility into cloud data analytics billing information, businesses have no way to accurately allocate costs down to the job or user level. This makes it difficult for them to track usage patterns and identify areas where budgets will be over- or under-spent, or where performance and cost optimization are needed.

By implementing a FinOps approach, businesses can gain better control over their cloud data spend, optimize their budgets effectively, and avoid unpleasant surprises when it comes time to pay the bill.

Why cloud data costs are so unpredictable

Cloud data costs can be a source of frustration for many businesses due to their unpredictability. Cloud and data platform providers often have complex pricing models that make it challenging to accurately forecast expenses. Here are some key reasons why cloud data analytics costs can be so difficult to predict:

  • Variety of factors affect analytics costs: Cloud and data platform providers offer a range of services and pricing options, making it hard to determine the exact cost of using specific resources. Factors such as compute instance and cluster sizes, regional pricing, and additional features all contribute to the final cloud and data platform bill.
  • Usage patterns impact cost: Fluctuations in usage patterns can significantly affect cloud data costs. Peak demand periods or sudden increases in data volume can result in unexpected expenses. Without proper planning, businesses may find themselves facing higher bills during these periods.
  • Lack of visibility into resource utilization: Inefficient workload management and a lack of visibility into resource utilization can lead to higher expenses. Without the ability to monitor and optimize resource allocation, businesses may end up paying for unused or underutilized resources.
  • Inability to allocate historical spend: A lack of granular visibility into data costs at the job, project, and user level makes it nearly impossible to accurately allocate historical spend or forecast future investments. This makes budgeting and financial planning challenging for businesses that rely on cloud data platforms.
  • Changes in technology or service offerings: Cloud and data platform providers frequently introduce new technologies or adjust their service offerings, which can impact cost structures. Businesses must stay updated on these changes as they may influence their overall cloud expenditure.

Navigating the complexities of cloud data forecasting requires careful analysis and proactive management of resource consumption fluctuations and cost unpredictability. By understanding usage patterns, optimizing capacity utilization, and staying informed about changes from cloud and data platform providers, businesses can gain better control over their cloud data costs.

5 key ingredients of an accurate cloud data cost forecast

To ensure an accurate cloud data cost forecast, several key ingredients must be considered. These include:

  1. Comprehensive understanding of historical usage patterns and trends: Analyzing past usage data provides valuable insights into resource consumption and enables more accurate predictions of future spending.
  2. Granular visibility into data resource usage: It is essential to have detailed visibility into the utilization of resources down to the job and user level. This level of granularity enables a more precise estimation of costs associated with specific tasks or individuals.
  3. Analysis of current platform configurations and workload requirements: Evaluating the existing data platform settings, data access patterns, and workload demands help predict growth rates and changes in cloud data spend.
  4. Consideration of external factors: External factors such as market conditions or regulatory changes can significantly impact cloud data processing costs. Incorporating these variables into the forecasting model ensures a more realistic projection.
  5. Utilization of advanced forecasting techniques and algorithms: Leveraging advanced techniques and algorithms enhances the accuracy of predictions by accounting for various factors simultaneously, resulting in more reliable forecasts.

By incorporating these key ingredients into your cloud data forecasting strategy, you can gain better control over your forecast and achieve higher accuracy in predicting future expenses. With a comprehensive understanding of historical patterns, granular visibility into resource usage, analysis of configurations and workload requirements, consideration of external influences, and advanced forecasting techniques, you can make informed decisions to increase the accuracy of your cloud data spend forecasts.

Remember that accurate cloud data cost forecasting is crucial for effective financial planning within your organization’s cloud environment.

Explore different methods and tools for accurate cloud data forecasting

Statistical modeling techniques like time series analysis can be used to predict future trends based on patterns in historical data. These predictive models help improve forecast accuracy by identifying recurring patterns and extrapolating them into the future.

Machine learning algorithms offer another powerful tool for cloud data forecasting. By analyzing vast amounts of information, these algorithms can generate accurate forecasts by identifying complex relationships and trends within the data. This enables organizations to make informed estimates of their cloud data processing patterns to help anticipate cyclical usage and growth.

Cloud management platforms provide built-in forecasting capabilities. However, it is important to note that forecasts are only as good as the underlying data. Without granular visibility into the jobs, projects, and users utilizing the cloud resources, forecast models cannot take key drivers into account. To ensure accurate predictions, it is crucial to distinguish one-time or end-of-period data processing from ongoing processing related to growing customer and end-user activity.
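As a minimal illustration of the time-series approach mentioned above, and only that, a daily cost history can be fed to a simple Holt-Winters model. The spend figures below are made up, and a real forecast would also account for the workload drivers discussed in this post.

```python
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Hypothetical daily cloud data spend (8 weeks of made-up values)
history = pd.Series(
    [820, 790, 615, 400, 905, 930, 880] * 8,
    index=pd.date_range("2023-06-01", periods=56, freq="D"),
)

# Additive trend + weekly seasonality, then project the next 30 days
model = ExponentialSmoothing(history, trend="add", seasonal="add", seasonal_periods=7)
print(model.fit().forecast(30).round(0))
```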

How purpose-built AI improves cloud data cost forecasting

Purpose-built AI is a game-changer. By leveraging advanced algorithms and machine learning capabilities, businesses can unlock the full potential of their cloud data cost management. 

Here’s how purpose-built AI improves cloud data cost forecasting:

  • Identifying hidden cost drivers: Purpose-built AI has the ability to analyze vast amounts of data and identify subtle factors that contribute to increased costs. It goes beyond surface-level analysis and uncovers underlying patterns, enabling businesses to accurately anticipate cloud analytics resource needs.
  • Continuous learning for improved accuracy: Machine learning models are continuously trained on past performance, enabling them to learn from historical data and improve accuracy over time. This means that as more data becomes available, the forecasts become increasingly reliable.
  • Proactive decision-making with predictive analytics: Predictive analytics powered by purpose-built AI enable businesses to make proactive decisions regarding their cloud expenditure. By analyzing trends and patterns, organizations can anticipate future costs and take necessary steps to avoid unnecessary expenses or mitigate risks associated with fluctuating costs.
  • AI-driven recommendations for enhanced cost efficiency: Purpose-built AI provides valuable recommendations based on its analysis of cloud data cost patterns. These recommendations help businesses improve their overall cost efficiency, ensuring that resources are allocated optimally to accelerate cloud data platform usage.

Using Unravel to Improve Cloud Data Cost Forecast Accuracy to within ±10%

These different approaches help improve the accuracy of your cloud data cost forecasts. But how can you take it a step further? That’s where Unravel comes in.

Unravel is a purpose-built AI platform that can revolutionize the accuracy of your cloud data cost forecasts. Unravel provides real-time insights into your data usage patterns, identifies budget trends, and predicts future costs with remarkable accuracy. With its intuitive interface and easy-to-use features, Unravel empowers you to make informed decisions about resource allocation, budget planning, and overall cost management in the cloud.

Ready to take control of your cloud data costs? Start using Unravel today and unlock the full potential of accurate cloud data cost forecasting.

        FAQs

        Q: How does Unravel improve cloud data cost forecasting?

        A: Unravel leverages advanced machine learning algorithms to analyze historical usage patterns and identify trends in your cloud data costs. By understanding these patterns, it can accurately predict future costs and provide actionable insights for optimizing resource allocation.

        Q: Can I integrate Unravel with my existing cloud data platform?

        A: Yes! Unravel seamlessly integrates with popular cloud platforms such as AWS, Azure, and Google Cloud Platform. It supports both on-premises and hybrid environments as well.

        Q: Is Unravel suitable for businesses of all sizes?

        A: Absolutely! Whether you’re a small startup or a large enterprise, Unravel caters to businesses of all sizes with large data estates looking to leverage data to maximize business value. Unravel’s scalable architecture ensures that it can handle the needs of organizations with 10,000s of data platform jobs, such as DBS Bank.

        Q: How long does it take to see results with Unravel?

        A: You’ll start seeing immediate benefits once you integrate Unravel into your infrastructure. Its real-time insights and actionable recommendations allow you to make informed decisions right from the get-go.

        Q: Can Unravel help with other aspects of cloud data management?

        A: Yes, Unravel offers a comprehensive suite of features for end-to-end cloud data performance management, FinOps for your data cloud, intelligent data quality, and forecast accuracy. From performance optimization to cost governance, Unravel provides a holistic solution for your cloud analytics needs.

        Unlocking Success with FinOps: Top Insights from Expert Virtual Event https://www.unraveldata.com/resources/unlocking-success-with-finops-top-insights-from-expert-virtual-event/ https://www.unraveldata.com/resources/unlocking-success-with-finops-top-insights-from-expert-virtual-event/#respond Fri, 11 Aug 2023 16:26:15 +0000 https://www.unraveldata.com/?p=13538


        The data landscape is constantly evolving, and with it come new challenges and opportunities for data teams. While generative AI and large language models (LLMs) seem to be all everyone is talking about, they are just the latest manifestation of a trend that has been evolving over the past several years: organizations tapping into petabyte-scale data volumes and running increasingly massive data pipelines to deliver ever more data analytics projects and AI/ML models. 

The scale and pace of data consumption are rising exponentially.

Data is now core to every business. Unlocking its potential to uncover business insights, drive operational efficiencies, and develop new data products is a key competitive differentiator that separates winners from losers. But running the enormous data workloads that fuel such innovation is expensive. Businesses are already struggling to keep their cloud data costs within budget. So, how can companies increase business-critical data output without sending cloud data costs through the roof?

        Effective cost management becomes paramount. That’s where FinOps principles come into play, helping organizations to optimize their cloud resources and align them with business goals. 

In a recent virtual fireside chat, Sanjeev Mohan, former Gartner Research VP for Big Data & Advanced Analytics and founder/principal of SanjMo, Unravel VP of Solutions Engineering Chris Santiago, and DataOps Champion and Certified FinOps Practitioner Clinton Ford discussed five tips for getting started with FinOps for data workloads.

        Why FinOps for Data Makes Sense

The virtual event kicked off by discussing Gartner analyst Lydia Leong's argument that building a dedicated FinOps team is a waste of time. Our panelists broke down why FinOps for data teams is actually crucial for companies running big data workloads in the cloud. Then they talked about how Spotify fixed a $1 million query, using that as an example of FinOps principles in practice.

        The experts emphasized that having a clear strategy and plan in place is critical for ensuring that resources are allocated effectively and in line with business objectives.

        Five Best Practices for DataFinOps

        To set data teams on the path to FinOps success, Sanjeev Mohan and Chris Santiago shared five practical tips during the presentation:

        1. Discover: Begin by tracking and visualizing your costs at three different layers: the query level, user level, and workload level. You must first know where the money is going, to ensure that resources are properly aligned.
        2. Understand Cost Drivers: Dig into the top cost drivers—what’s actually incurring the most cost—at a granular level. Not only is this vital for accurately forecasting budgets, but you can prioritize workloads based on their strategic value, ensuring that you’re focusing on tasks that contribute meaningfully to your bottom line.
        3. Collaborate: FinOps is a team sport among Finance, Engineering, and the business. For many organizations, this is a cultural shift as much as anything. A good start is to leverage chargeback/showback models to hold teams accountable and encourage better management of resources.
4. Build Early Warning Systems: Implement guardrails to nip cost overruns in the bud. It's best to catch performance and cost problems early in development, whether as part of your CI/CD process or simply by triggering an alert when a cost or performance threshold is violated (see the sketch after this list).
        5. Automate and Optimize: Continuously monitor, automate, and optimize key processes to minimize waste, save time, and achieve better results.
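As a rough illustration of the early-warning guardrail described in tip 4, the sketch below projects month-end spend from month-to-date spend and raises an alert when the projection crosses a threshold. The budget, threshold, and spend figures are hypothetical, and the print statement stands in for whatever alerting or CI/CD hook you would actually use.

```python
# Minimal sketch of an early-warning budget guardrail (hypothetical values).
from datetime import date
import calendar

MONTHLY_BUDGET = 50_000          # USD, assumed budget for the data platform
ALERT_THRESHOLD = 0.8            # warn when projected spend exceeds 80% of budget

def projected_month_end_spend(spend_to_date: float, today: date) -> float:
    """Naive linear projection of month-end spend from month-to-date spend."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return spend_to_date / today.day * days_in_month

def check_budget(spend_to_date: float, today: date) -> None:
    projected = projected_month_end_spend(spend_to_date, today)
    if projected > MONTHLY_BUDGET * ALERT_THRESHOLD:
        # In a real system this would page the team, open a ticket, or pause a job.
        print(f"ALERT: projected month-end spend ${projected:,.0f} "
              f"exceeds {ALERT_THRESHOLD:.0%} of the ${MONTHLY_BUDGET:,} budget")
    else:
        print(f"OK: projected month-end spend ${projected:,.0f}")

check_budget(spend_to_date=31_500, today=date(2023, 8, 18))
```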

        Audience Questions

        In addition to discussing FinOps best practices, the panelists fielded several questions from the audience. They addressed topics such as calculating unit costs, selecting impactful visualization tools, and employing cost reduction strategies tailored for their organizations. Throughout the session, the experts emphasized collaboration and partnership, showcasing Unravel’s commitment to empowering data teams to reach their full potential.

        The Unravel Advantage

With its AI-powered Insights Engine built specifically for modern data platforms like Databricks, Snowflake, Google Cloud BigQuery, and Amazon EMR, Unravel provides data teams with the performance/cost optimization insights and automation they need to thrive in the competitive data landscape. As the recent case study of a leading health insurance provider demonstrated, Unravel's capabilities are instrumental in helping organizations optimize code and infrastructure so they can run more data workloads without increasing their budget. The five FinOps best practices shared by Sanjeev and Chris offer actionable insights for data teams looking to optimize costs, drive efficiency, and achieve their goals in an ever-changing data landscape.

        With Unravel as your trusted partner, you can approach FinOps with confidence, knowing that you have access to the expertise, tools, and support required to succeed.

        Next Steps

         

        Harnessing Google Cloud BigQuery for Speed and Scale: Data Observability, FinOps, and Beyond https://www.unraveldata.com/resources/harnessing-google-cloud-bigquery-for-speed-and-scale-data-observability-finops-and-beyond/ https://www.unraveldata.com/resources/harnessing-google-cloud-bigquery-for-speed-and-scale-data-observability-finops-and-beyond/#respond Thu, 10 Aug 2023 12:05:21 +0000 https://www.unraveldata.com/?p=13395


Data is a powerful force with immense potential to generate business value for organizations across industries. Leveraging data and analytics has become a critical factor for successful digital transformation that can accelerate revenue growth and AI innovation. Data and AI leaders enable business insights, product and service innovation, and game-changing technology that help them outperform their peers in operational efficiency, revenue, and customer retention, among other key business metrics. Organizations that fail to harness the power of data risk falling behind their competitors.

        Despite all the benefits of data and AI, businesses face common challenges.

        Unanticipated cloud data spend

        Last year, over $16 billion was wasted in cloud spend. Data management is the largest and fastest-growing category of cloud spending, representing 39% of the typical cloud bill. Gartner noted that in 2022, 98% of the overall database management system (DBMS) market growth came from cloud-based database platforms. Cloud data costs are often the most difficult to predict due to fluctuating workloads. 82% of 157 data management professionals surveyed by Forrester cited difficulty predicting data-related cloud costs. On top of the fluctuations that are inherent with data workloads, a lack of visibility into cloud data spend makes it challenging to manage budgets effectively.

        • Fluctuating workloads: Google Cloud BigQuery data processing and storage costs are driven by the amount of data stored and analyzed. With varying workloads, it becomes challenging to accurately estimate the required data processing and storage costs. This unpredictability can result in budget overruns that affect 60% of infrastructure and operations (I&O) leaders.
        • Unexpected expenses: Streaming data, large amounts of unstructured and semi-structured data, and shared slot pool consumption can quickly drive up cloud data costs. These factors contribute to unforeseen spikes in usage that may catch organizations off guard, leading to unexpected expenses on their cloud bills.
        • Lack of visibility: Without granular visibility into cloud data analytics billing information, businesses have no way to accurately allocate costs down to the job or user level. This makes it difficult for them to track usage patterns and identify areas where budgets will be over- or under-spent, or where performance and cost optimization are needed.

        By implementing a FinOps approach, businesses can gain better control over their cloud data spend, optimize their budgets effectively, and avoid unpleasant surprises when it comes time to pay the bill.

        Budget and staff constraints limit new data workloads

        In 2023, CIOs are expecting an average increase of only 5.1% in their IT budgets, which is lower than the projected global inflation rate of 6.5%. Economic pressures, scarcity and high cost of talent, and ongoing supply challenges are creating urgency to achieve more value in less time.

Limited budget and staffing resources can hinder the implementation of new data workloads. For example, "lack of resources/knowledge to scale" is the leading reason preventing IoT data deployments. Budget and staffing constraints pose real risks to launching profitable data and AI projects.

        Exponential data volume growth for AI

The rapid growth of disruptive technologies such as generative AI has led to an exponential increase in cloud computing data volumes. However, managing and analyzing massive amounts of data poses significant challenges for organizations.

        Data is foundational for AI and much of it is unstructured, yet IDC found most unstructured data is not leveraged by organizations. A lack of production-ready data pipelines for diverse data sources was the second most cited reason (31%) for AI project failure.

        Data pipeline failures slow innovation

Data pipelines are becoming increasingly complex, which increases the Mean Time To Repair (MTTR) when they break or slow down. Time is a critical factor that pulls skilled and valuable talent into unproductive firefighting. The more time they spend dealing with pipeline issues or failures, the greater the impact on productivity and new innovation.

        Manually testing and running release process checklists are heavy burdens for new and growing data engineering teams. With all of the manual toil, it is no surprise that over 70% of data projects in manufacturing stall at Proof of Concept (PoC) stage and do not see sustainable value realization.

Downtime resulting from pipeline disruptions can have a significant negative impact on service level agreements (SLAs). It not only affects the efficiency of data processing, but also impacts downstream tasks like analysis and reporting. These slowdowns directly affect the ability of team members and business leaders to make timely decisions based on data insights.

        Conclusion

        Unravel 4.8.1 for BigQuery provides improved visibility to accelerate performance, boost query efficiency, allocate costs, and accurately predict spend. This launch aligns with the recent BigQuery pricing model change. With Unravel for BigQuery, customers can easily choose the best pricing plan to match their usage. Unravel helps you optimize your workloads and get more value from your cloud data investments.

        Unravel Data Launches Cloud Data Cost Observability and Optimization for Google Cloud BigQuery https://www.unraveldata.com/resources/unravel-data-launches-cloud-data-cost-observability-and-optimization-for-google-cloud-bigquery/ https://www.unraveldata.com/resources/unravel-data-launches-cloud-data-cost-observability-and-optimization-for-google-cloud-bigquery/#respond Thu, 10 Aug 2023 12:04:34 +0000 https://www.unraveldata.com/?p=13386


        New Functionality Delivers FinOps, AI-driven Cloud Cost Management and Performance Optimization for BigQuery Users

PALO ALTO, CA — August 10, 2023 – Unravel Data, the first AI-enabled data observability and FinOps platform built to address the speed and scale of modern data platforms, today announced the release of Unravel 4.8.1, enabling Google Cloud BigQuery customers to see and better manage their cloud data costs by understanding specific cost drivers, allocation insights, and performance and cost optimization of SQL queries. This launch comes on the heels of the recent BigQuery pricing model change that replaced flat-rate and flex slot pricing with three new pricing tiers, and will help BigQuery customers to implement FinOps in real time to select the right new pricing plan based on their usage, and maximize workloads for greater return on cloud data investments.

        As today’s enterprises implement artificial intelligence (AI) and machine learning (ML) models to continually garner more business value from their data, they are experiencing exploding cloud data costs, with a lack of visibility into cost drivers and a lack of control for managing and optimizing their spend. As cloud costs continue to climb, managing cloud spend remains a top challenge for global business leaders. Data management services are the fastest-growing category of cloud service spending, representing 39% of the total cloud bill. Unravel 4.8.1 enables visibility into BigQuery compute and storage spend and provides cost optimization intelligence using its built-in AI to improve workload cost efficiency. 

        Unravel’s AI-driven cloud cost optimization for BigQuery delivers insights based on Unravel’s deep observability of the job, user, and code level to supply AI-driven cost optimization recommendations for slots and SQL queries, including slot provisioning, query duration, autoscaling efficiencies, and more. With Unravel, BigQuery users can speed cloud transformation initiatives by having real-time cost visibility, predictive spend forecasting, and performance insights for their workloads. BigQuery customers can also use Unravel to customize dashboards and alerts with easy-to-use widgets that offer insights on spend, performance, and unit economics.

        “As AI continues to drive exponential data usage, companies are facing more problems with broken pipelines and inefficient data processing which slows time to business value and adds to the exploding cloud data bills. Today, most organizations do not have the visibility into cloud data spend or ways to optimize data pipelines and workloads to lower spend and mitigate problems,” said Kunal Agarwal, CEO and co-founder, Unravel Data. “With Unravel’s built-in AI, BigQuery users have data observability and FinOps in one solution to increase data pipeline reliability and cost efficiency so that businesses can bring even more workloads to the cloud for the same spend.” 

        “Enterprises are increasingly concerned about lack of visibility into and control of their cloud-related costs, especially for cloud-based analytics projects,” says Kevin Petrie, VP of Research at The Eckerson Group. “By implementing FinOps programs, they can predict, measure, monitor, optimize and account for cloud-related costs related to data and analytics projects.”

        At the core of Unravel Data’s platform is its AI-powered Insights Engine, purpose-built for data platforms, which understands all the intricacies and complexities of each modern data platform and the supporting infrastructure to optimize efficiency and performance. The Insights Engine ingests and interprets the continuous millions of ongoing metadata streams to provide real-time insights into application and system performance, and recommendations to optimize costs and performance for operational and financial efficiencies. 

        Unravel 4.8.1 includes additional features, such as:

        • Recommendations for baseline and max setting for reservations
        • Scheduling insights for recurring jobs
        • SQL insights and anti-patterns
        • Recommendations for custom quotas for projects and users
        • Top-K projects, users, and jobs
        • Showback by compute and storage types, services, pricing plans, etc.
        • Chargeback by projects and users
        • Out-of-the-box and custom alerts and dashboards
        • Project/Job views of insights and details
        • Side-by-side job comparisons
        • Data KPIs, metrics, and insights such as size and number of tables and partitions, access by jobs, hot/warm/cold tables

To learn more about how we are helping BigQuery customers optimize their data cost and management, or to partner with Unravel Data, please visit https://www.unraveldata.com/google-cloud-bigquery/.

          About Unravel Data

          Unravel Data radically transforms the way businesses understand and optimize the performance and cost of their modern data applications – and the complex data pipelines that power those applications. Providing a unified view across the entire data stack, Unravel’s market-leading data observability platform leverages AI, machine learning, and advanced analytics to provide modern data teams with the actionable recommendations they need to turn data into insights. A recent winner of the Best Data Tool & Platform of 2023 as part of the annual SIIA CODiE Awards, some of the world’s most recognized brands like Adobe, Maersk, Mastercard, Equifax, and Deutsche Bank rely on Unravel Data to unlock data-driven insights and deliver new innovations to market. To learn more, visit https://www.unraveldata.com.

          Media Contact

          Blair Moreland

          ZAG Communications for Unravel Data

          unraveldata@zagcommunications.com 

          Announcing Unravel 4.8.1: Maximize business value with Google Cloud BigQuery Editions pricing https://www.unraveldata.com/resources/announcing-unravel-481-maximize-business-value-with-google-cloud-bigquery-editions-pricing/ https://www.unraveldata.com/resources/announcing-unravel-481-maximize-business-value-with-google-cloud-bigquery-editions-pricing/#respond Thu, 10 Aug 2023 12:03:50 +0000 https://www.unraveldata.com/?p=13397


Google recently introduced significant changes to its BigQuery pricing models, affecting both compute and storage. It announced the end of sale of flat-rate and flex slots for all BigQuery customers not currently in a contract, and raised the price of on-demand analysis by 25% across all regions starting on July 5, 2023.

          Main Components of BigQuery Pricing

          Understanding the pricing structure of BigQuery is crucial to effectively manage expenses. There are two big components to BigQuery pricing:

          • Compute (analysis) pricing is the cost to process queries, including SQL queries, user-defined functions, scripts, and certain data manipulation language (DML) and data definition language (DDL) statements
          • Storage pricing is the cost to store data that you load into BigQuery. Storage options are logical (the default) or physical. If data storage is converted from logical to physical, customers cannot go back to logical storage.

          Selecting the appropriate edition and accurately forecasting data processing needs is essential to cloud data budget planning and maximizing the value derived from Google Cloud BigQuery.
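As a back-of-the-envelope illustration of how the two components combine, the sketch below estimates a monthly bill from the volume of data scanned by on-demand queries and the volume of active logical storage. The rates are assumptions for illustration only (roughly in line with published US list prices after the July 2023 change); always confirm current pricing for your region and edition on the Google Cloud pricing page.

```python
# Rough, illustrative estimate of a monthly BigQuery bill (assumed list prices).
ON_DEMAND_PER_TIB = 6.25        # USD per TiB scanned (assumed US on-demand rate)
ACTIVE_LOGICAL_PER_GIB = 0.02   # USD per GiB-month of active logical storage (assumed)

def estimate_monthly_cost(tib_scanned: float, active_storage_gib: float) -> float:
    compute = tib_scanned * ON_DEMAND_PER_TIB
    storage = active_storage_gib * ACTIVE_LOGICAL_PER_GIB
    return compute + storage

# Example: 40 TiB scanned by queries and 5,000 GiB of active storage in a month.
print(f"Estimated monthly cost: ${estimate_monthly_cost(40, 5_000):,.2f}")
```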

          Introducing Unravel 4.8.1 for BigQuery

Unravel 4.8.1 for BigQuery includes AI-driven FinOps and performance optimization features and enhancements, empowering Google Cloud BigQuery customers to see and better manage their cloud data costs. Unravel helps users understand specific cost drivers, allocation insights, and performance and cost optimization of SQL queries. The new Unravel features align with the FinOps phases:

          Inform

          • Compute and storage costs
          • Unit costs and trends for projects, users, and jobs

          Optimize

          • Reservation insights
          • SQL insights
          • Data and storage insights
          • Scheduling insights

          Operate

          • OpenSearch-based alerts on job duration and slot-ms
          • Alert customization: ability to create custom alerts

          Improving visibility, optimizing data performance, and automating spending guardrails can help organizations overcome resource limitations to get more out of their existing data environments.

          Visibility into BigQuery compute and storage spend

          Getting insights into your cloud data spending starts with understanding your cloud bill. With Unravel, BigQuery users can see their overall spend as well as spending trends for their selected time window, such as the past 30 days.

          Unravel for BigQuery cost dashboard

          The cost dashboard shows details and trends, including compute, storage, and services by pricing tier, project, job, and user.

          Unravel provides cost analysis, including the average cost of both compute and storage per project, job, and user over time. Compute spending can be further split between on-demand and reserved capacity pricing.

          Armed with this detail, BigQuery customers can better understand both infrastructure and pricing tier usage as well as efficiencies by query, user, department, and project. This granular visibility enables accurate, precise cost allocation, trend visualization, and forecasting.

          Unravel for BigQuery project costs dashboard

          This dashboard provides BigQuery project chargeback details and trends, including a breakdown by compute and storage tier.

          Unravel’s AI-driven cloud cost optimization for BigQuery delivers insights based on Unravel’s deep observability of the job, user, and code level to supply AI-driven cost optimization recommendations for slots and SQL queries, including slot provisioning, query duration, autoscaling efficiencies, and more.

          With Unravel, BigQuery users can speed cloud transformation initiatives by having real-time cost visibility, predictive spend forecasting, and performance insights for their workloads.

          AI-driven cloud cost optimization for BigQuery

          At the core of Unravel Data’s data observability and FinOps platform is the AI-powered Insights Engine. It is purpose-built for data platforms—including BigQuery—to understand all the unique aspects and capabilities of each modern data stack and the underlying infrastructure to optimize efficiency and performance.

          Unravel’s AI-powered Insights Engine continuously ingests and interprets millions of metadata inputs to provide real-time insights into application and system performance, along with recommendations to improve performance and efficiency for faster results and greater positive business impact for your existing cloud data spend.

          Unravel provides insights and recommendations to optimize BigQuery reservations.

          Using Unravel’s cost and performance optimization intelligence based on its deep observability at the job, user, and code level, users get recommendations such as:

          • Reservation sizing that achieves optimal cost efficiency and performance
          • SQL insights and anti-patterns to avoid
          • Scheduling insights for recurring jobs
          • Quota insights with respect to workload patterns
          • and more

          With Unravel, BigQuery customers can speed cloud transformation initiatives by having predictive cost and performance insights of existing workloads prior to moving them to the cloud.

          Visualization dashboards and unit costs

          Visualizing unit costs not only simplifies cost management but also enhances decision-making processes within your organization. With clear insights into spending patterns and resource utilization, you can make informed choices regarding optimization strategies or budget allocation.

With Unravel, BigQuery customers can customize dashboards and alerts with easy-to-use widgets that enable at-a-glance and drill-down dashboards on:

          • Spend
          • Performance
          • Unit economics
          Unravel insights into BigQuery user costs

          Unravel displays user count and cost trends by compute pricing tier.

          From a unit economics perspective, BigQuery customers can build dashboards to show unit costs in terms of average cost per user, per project, and per job.
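The arithmetic behind such unit-cost metrics is simple to sketch. The example below rolls hypothetical per-job cost records up into average cost per job by project and by user; the record structure and field names are placeholders for whatever cost-allocation data your platform exposes.

```python
# Minimal sketch: roll per-job costs up into unit costs (hypothetical records).
from collections import defaultdict

job_costs = [  # one record per job run: project, user, cost in USD
    {"project": "marketing", "user": "ana", "cost": 12.40},
    {"project": "marketing", "user": "ana", "cost": 9.10},
    {"project": "risk",      "user": "raj", "cost": 31.75},
    {"project": "risk",      "user": "mei", "cost": 18.20},
]

def average_cost_by(records, key):
    totals, counts = defaultdict(float), defaultdict(int)
    for rec in records:
        totals[rec[key]] += rec["cost"]
        counts[rec[key]] += 1
    return {k: totals[k] / counts[k] for k in totals}

print("Average cost per job, by project:", average_cost_by(job_costs, "project"))
print("Average cost per job, by user:", average_cost_by(job_costs, "user"))
```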

          Take advantage of visualization dashboards in Unravel for BigQuery to effortlessly gain valuable insights into unit costs.

          Additional features included in this release

          Unravel 4.8.1 includes additional features, such as showback/chargeback reports, SQL insights and anti-patterns. You can compare two jobs side-by-side, enabling you to point out any metrics that are different between two runs, even if the queries are different.

          With this release, you also get:

          • Top-K projects, users, and jobs
          • Showback by compute and storage types, services, pricing plans, etc.
          • Chargeback by projects and users
          • Out-of-the-box and custom alerts and dashboards
          • Project/Job views of insights and details
          • Side-by-side job comparisons
          • Data KPIs, metrics, and insights such as size and number of tables and partitions, access by jobs, hot/warm/cold tables

          Use case scenarios

          Unravel for BigQuery provides a single source of truth to improve collaboration across functional teams and accelerates workflows for common use cases. Below are just a few examples of how Unravel helps BigQuery users for specific situations:

Role: FinOps Practitioner
Scenario: Understand what we pay for BigQuery down to the user/app level in real time, accurately forecast future spend with confidence
Unravel benefits: Granular visibility at the project, job, and user level enables FinOps practitioners to perform cost allocation, estimate annual cloud data application costs, and analyze cost drivers, break-even, and ROI

Role: FinOps Practitioner / Engineering / Operations
Scenario: Identify the most impactful recommendations to optimize overall cost and performance
Unravel benefits: AI-powered performance and cost optimization recommendations enable FinOps and data teams to rapidly upskill team members, implement cost efficiency SLAs, and optimize BigQuery pricing tier usage to maximize the company's cloud data ROI

Role: Engineering Lead / Product Owner
Scenario: Identify the most impactful recommendations to optimize the cost and performance of a project
Unravel benefits: AI-driven insights and recommendations enable product and data teams to improve slot utilization, boost SQL query performance, and leverage table partitioning and column clustering to achieve cost efficiency SLAs and launch more data jobs within the same project budget

Role: Engineering / Operations
Scenario: Live monitoring with alerts
Unravel benefits: Live monitoring with alerts speeds MTTR and helps prevent outages before they happen

Role: Data Engineer
Scenario: Debugging a job and comparing jobs
Unravel benefits: Automatic troubleshooting guides data teams directly to the source of job failures, pinpointed down to the line of code or SQL query, along with AI recommendations to fix it and prevent future issues

Role: Data Engineer
Scenario: Identify expensive, inefficient, or failed jobs
Unravel benefits: Proactively improve cost efficiency, performance, and reliability before deploying jobs into production. Compare two jobs side-by-side to find any metrics that differ between the two runs, even if the queries are different.

          Get Started with Unravel for BigQuery

Learn more about Unravel for BigQuery by reviewing the docs and creating your own free account.

          Unlocking Cost Optimization: Insights from FinOps Camp Episode #1 https://www.unraveldata.com/resources/unlocking-cost-optimization-insights-from-finops-camp-episode-1/ https://www.unraveldata.com/resources/unlocking-cost-optimization-insights-from-finops-camp-episode-1/#respond Fri, 04 Aug 2023 17:30:57 +0000 https://www.unraveldata.com/?p=13391


          With the dramatic increase in the volume, velocity, and variety of data analytics projects, understanding costs and optimizing expenditure is crucial for success. Data teams often face challenges in effectively managing costs, accurately attributing them, and finding ways to enhance cost efficiency. Fortunately, Unravel Data, with its comprehensive platform, addresses these pain points and empowers data teams to unlock their full cost optimization potential—enabling them to run more cloud data workloads without increasing their budget. In this blog, we will delve into the key takeaways from the recent FinOps Camp Episode #1, focusing on the importance of understanding costs, attributing costs, and achieving cost efficiency with Unravel.

          Understanding Costs: The Foundation of Cost Optimization

          One of the main challenges faced by data teams is gaining deep insights into cost drivers. Traditional observability platforms like Cloudability, or vendor-specific tools like AWS Cost Explorer and Azure Cost Dashboard, often provide limited visibility, focusing solely on infrastructure costs. This lack of granular insights hinders data teams from making informed decisions about resource allocation and identifying areas for cost optimization. Unravel, however, offers a comprehensive dashboard that provides a holistic view of cost breakdowns. This enables data teams to understand exactly where costs are being incurred, facilitating smarter decision-making around resource allocation and optimization efforts.

          Accurate Cost Attribution: The Key to Fairness and Transparency

          Accurately attributing costs is another hurdle for data teams. Shared services, such as MySQL or Kafka, require meticulous cost allocation to ensure fairness and transparency within an organization. Unravel understands this challenge and provides a solution that simplifies cost allocation. By seamlessly attributing costs to individual users, jobs, teams/departments/lines of business, projects, applications and pipelines, clusters, etc., Unravel enables data teams to break down the proportional costs of shared services. This not only enhances cost tracking but also promotes accountability and fairness within the organization.
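To illustrate what breaking down the proportional cost of a shared service looks like in practice, here is a minimal sketch that splits one shared service's monthly bill across teams in proportion to a usage metric. The service, teams, metric, and figures are all hypothetical; in a real deployment the usage data would come from your observability tooling.

```python
# Minimal sketch: allocate a shared service's cost in proportion to usage.
SHARED_KAFKA_BILL = 9_000.00  # USD for the month (hypothetical)

usage_by_team = {  # e.g., GB of messages produced per team (hypothetical metric)
    "payments": 1_200,
    "recommendations": 600,
    "reporting": 200,
}

total_usage = sum(usage_by_team.values())
chargeback = {
    team: round(SHARED_KAFKA_BILL * usage / total_usage, 2)
    for team, usage in usage_by_team.items()
}
print(chargeback)  # {'payments': 5400.0, 'recommendations': 2700.0, 'reporting': 900.0}
```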

          Unlocking Cost Efficiency: Recommendations for Optimization

Cost efficiency is the goal of every data team striving for excellence. In the virtual event, Unravel highlighted its powerful feature of identifying inefficiencies within data jobs and providing actionable recommendations for optimization. By analyzing tasks with its AI-powered Recommendation Engine built for platforms like Databricks, Snowflake, BigQuery, and Amazon EMR, Unravel can pinpoint oversized resources and the lines of code that contribute to performance bottlenecks. Armed with these insights, data teams can collaborate with developers to address these inefficiencies effectively, resulting in improved resource utilization, reduced costs, and accelerated application performance. Unravel's proactive optimization recommendations enable data teams to achieve peak cost efficiency and deliver exceptional results.

          Operationalizing Unravel: Going From Reactive to Proactive

          Unravel’s platform lets data teams go beyond responding to cost and performance inefficiencies after the fact to getting ahead of cost issues beforehand. Unravel empowers ongoing cost governance by enabling team leaders to set up automated guardrails that trigger an alert (or even autonomous “circuit breaker” actions) whenever a predefined threshold is violated—be it projected budget overrun, jobs exceeding a certain size, runtime, cost, etc. Essentially, these automated guardrails take the granular cost allocation information and the AI-powered recommendations and apply them to context-specific workloads to track real-time spending against budgets, prevent runaway jobs and rogue users, identify optimizations to meet SLAs, and nip cost overruns in the bud.

          Conclusion: Unleash Your Cost Optimization Potential with Unravel

          Understanding costs, attributing costs accurately, and achieving cost efficiency are critical components of any successful data analytics strategy. In the FinOps Camp Episode #1, Unravel showcased its ability to address these concerns and empower data teams to optimize costs effectively. By providing in-depth insights, seamless cost attribution, and proactive optimization recommendations, Unravel enables data teams to understand at a deep level exactly where their cloud data spend is going, predict spending with data-driven accuracy, and optimize data applications/pipelines so organizations can run more high-value workloads within existing budgets. Unravel unlocks your cost optimization potential and maximizes the value of your data analytics efforts. Together, we can transform cost optimization from a challenge into a competitive advantage.

           

          Next Steps:

          Unravel Data Named a Sample Vendor in the Gartner® Hype Cycle™ for Emerging Technologies in Finance, 2023 https://www.unraveldata.com/resources/unravel-data-named-a-sample-vendor-in-the-gartner-hype-cycle-for-emerging-technologies-in-finance-2023/ https://www.unraveldata.com/resources/unravel-data-named-a-sample-vendor-in-the-gartner-hype-cycle-for-emerging-technologies-in-finance-2023/#respond Thu, 20 Jul 2023 20:43:51 +0000 https://www.unraveldata.com/?p=13186


          Unravel Data Recognized in Both Augmented FinOps and Data Observability Categories by Gartner

PALO ALTO, CA — July 20, 2023 – Unravel Data, the first data observability and FinOps platform built to address the speed and scale of modern data platforms, today announced it has been included as a Sample Vendor in the Gartner® Hype Cycle™ for Emerging Technologies in Finance, 2023 for both Augmented FinOps and Data Observability. Aimed at helping Chief Financial Officers (CFOs) identify the top emerging technologies shaping the future of finance, this Hype Cycle™ offers a 10-year outlook on the most relevant technology and data trends, and provides recommendations for CFOs looking to increase flexibility and resiliency while also increasing productivity and profitability.

As more data workloads move to the cloud, IT and financial leaders must increasingly measure and optimize efficiency (both cost and performance). Both augmented financial operations (FinOps) and data observability are crucial elements of this process, helping organizations maximize the business value of their data. According to the report, “data observability is a technology that supports an organization’s ability to understand the health of an organization’s data, data pipelines, data landscape, and data infrastructure by continuously monitoring, tracking, alerting and troubleshooting issues to reduce and prevent data errors or system downtime.”

          The report further states that Augmented FinOps, “applies the traditional DevOps concepts of agility, continuous integration and deployment, and end-user feedback to financial governance, budgeting and cost optimization efforts. Augmented FinOps automates this process through the application of artificial intelligence (AI) and machine learning (ML) practices — predominantly in the cloud — to enable environments that automatically optimize cost based on defined business objectives expressed in natural language.”

          “As enterprises continue to move to the cloud and integrate AI into their everyday business practices, there is more demand than ever for visibility into cloud data spend in order to maximize ROI and the impact of AI on their business,” said Kunal Agarwal, CEO and Co-founder of Unravel Data. “Unravel enables companies to visually see how their cloud data spend is trending for accurate forecasting and provides proactive alerts and guardrails to help govern that spend, as well as AI-driven automated suggestions for maximizing TCO of their cloud data platforms — enabling data-forward companies to lead their markets with their AI initiatives.”

          The Unravel Data Platform provides organizations with data observability on data workload spending, and an early warning system for alerting on out-of-control spending, while offering an automated way to pinpoint the source of cost overruns. Unravel Data enables each of the three FinOps phases for modern data platforms, such as Databricks. These include:

          • Inform: Unravel provides granular visibility required to manage allocation, consumption efficiency, and charge/showback at the job, user and workgroup levels
          • Optimize: Unravel maximizes business impact of available budget and resources by optimizing data workloads to perform to required SLAs at most efficient cost
          • Operate: Unravel delivers ongoing, continuous improvement through cost governance and self-service optimization to reliably predict costs and maximize value (ROI)

          At the core of Unravel Data’s platform is its AI-powered Recommendation Engine, which understands all the intricacies and complexities of each modern data platform and the supporting infrastructure to optimize efficiency.

          For more information on how Unravel Data is helping organizations around the world control cloud costs, please visit https://www.unraveldata.com/platform/finops-data-teams/.

          Gartner, Hype Cycle for Emerging Technologies in Finance, 2023, By Mark D. McDonald, Published 11 July 2023

          GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and HYPE CYCLE is a registered trademark of Gartner, Inc. and/or its affiliates and are used herein with permission. All rights reserved.

          Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose. 

          About Unravel Data

          Unravel Data radically transforms the way businesses understand and optimize the performance and cost of their modern data applications – and the complex data pipelines that power those applications. Providing a unified view across the entire data stack, Unravel’s market-leading data observability platform leverages AI, machine learning, and advanced analytics to provide modern data teams with the actionable recommendations they need to turn data into insights. A recent winner of the Best Data Tool & Platform of 2023 as part of the annual SIIA CODiE Awards, some of the world’s most recognized brands like Adobe, Maersk, Mastercard, Equifax, and Deutsche Bank rely on Unravel Data to unlock data-driven insights and deliver new innovations to market. To learn more, visit https://www.unraveldata.com.

          Media Contact

          Blair Moreland

          ZAG Communications for Unravel Data unraveldata@zagcommunications.com

          Unravel Data Recognized by SIIA as Best Data Tool & Platform at 2023 CODiE Awards https://www.unraveldata.com/resources/unravel-data-recognized-by-siia-as-best-data-tool-platform-at-2023-codie-awards/ https://www.unraveldata.com/resources/unravel-data-recognized-by-siia-as-best-data-tool-platform-at-2023-codie-awards/#respond Wed, 21 Jun 2023 21:58:08 +0000 https://www.unraveldata.com/?p=12972


          Unravel Data Observability Platform earns prestigious industry recognition for Best Data Tool & Platform

Palo Alto, CA — June 21, 2023 — Unravel Data, the first data observability platform built to meet the needs of modern data teams, was named Best Data Tool & Platform of 2023 as part of the annual SIIA CODiE Awards. The prestigious CODiE Awards recognize the companies producing the most innovative Business Technology products across the country and around the world.

          “We are deeply honored to win a CODiE Award for Best Data Tool & Platform. Today, as companies put data products and AI/ML innovation front and center of their growth and customer service strategies, the volume of derailed projects and the costs associated are ascending astronomically. Companies need a way to increase performance of their data pipelines and a way to manage costs for effective ROI,” said Kunal Agarwal, CEO and co-founder, Unravel Data. “The Unravel Data Platform brings pipeline performance management and FinOps to the modern data stack. Our AI-driven Insights Engine provides recommendations that allow data teams to make smarter decisions that optimize pipeline performance along with the associated cloud data spend, making innovation more efficient for organizations.”

          Cloud-first companies are seeing cloud data costs exceed 40% of their total cloud spend. These organizations lack the visibility into queries, code, configurations, and infrastructure required to manage data workloads effectively, which in turn, leads to over-provisioned capacity for data jobs, an inability to quickly detect pipeline failures and slowdowns, and wasted cloud data spend.

          “The 2023 Business Technology CODiE Award Winners maintain the vital legacy of the CODiEs in spotlighting the best and most impactful apps, services and products serving the business tech market,” said SIIA President Chris Mohr. “We are so proud to recognize this year’s honorees – the best of the best! Congratulations to all of this year’s CODiE Award winners!”

The Software & Information Industry Association (SIIA), the principal trade association for the software and digital content industries, announced the full slate of CODiE winners during a virtual winner announcement. Awards were given for products and services deployed specifically for businesses and business technology professionals, including the top honor of the Best Overall Business Technology Solution.

          A SIIA CODiE Award win is a prestigious honor, following rigorous reviews by expert judges whose evaluations determined the finalists. SIIA members then vote on the finalist products, and the scores from both rounds are tabulated to select the winners.

          Details about the winning products can be found at https://siia.net/codie/2023-codie-business-technology-winners/.

          Learn more about Unravel’s award-winning data observability platform.

          About the CODiE Awards

          The SIIA CODiE Awards is the only peer-reviewed program to showcase business and education technology’s finest products and services. Since 1986, thousands of products, services and solutions have been recognized for achieving excellence. For more information, visit siia.net/CODiE.

          About Unravel Data

          Unravel Data radically transforms the way businesses understand and optimize the performance and cost of their modern data applications – and the complex data pipelines that power those applications. Providing a unified view across the entire data stack, Unravel’s market-leading data observability platform leverages AI, machine learning, and advanced analytics to provide modern data teams with the actionable recommendations they need to turn data into insights. Some of the world’s most recognized brands like Adobe, Maersk, Mastercard, Equifax, and Deutsche Bank rely on Unravel Data to unlock data-driven insights and deliver new innovations to market. To learn more, visit https://www.unraveldata.com.

          Media Contact

          Blair Moreland
          ZAG Communications for Unravel Data
          unraveldata@zagcommunications.com

          AI-Driven Observability for Snowflake https://www.unraveldata.com/resources/ai-driven-observability-for-snowflake/ https://www.unraveldata.com/resources/ai-driven-observability-for-snowflake/#respond Wed, 21 Jun 2023 19:29:44 +0000 https://www.unraveldata.com/?p=12896 Cache


          AI-DRIVEN DATA OBSERVABILITY + FINOPS FOR SNOWFLAKE

          Performance. Reliability. Cost-effectiveness.

          Unravel’s automated, AI-powered data observability + FinOps platform for Snowflake and other modern data stacks provides 360° visibility to allocate costs with granular precision, accurately predict spend, run 50% more workloads at the same budget, launch new apps 3X faster, and reliably hit greater than 99% of SLAs.

With Unravel Data Observability + FinOps for Snowflake you can:

          • Launch new apps 3X faster: End-to-end observability of data-native applications and pipelines. Automatic improvement of performance, cost efficiency, and reliability.
          • Run 50% more workloads for same budget: Break down spend and forecast accurately. Optimize apps and platforms by eliminating inefficiencies. Set guardrails and automate governance. Unravel’s AI helps you implement observability and FinOps to ensure you achieve efficiency goals.
          • Reduce firefighting time by 99% using AI-enabled troubleshooting: Detect anomalies, drift, skew, missing and incomplete data end-to-end. Integrate with multiple data quality solutions. All in one place.
• Forecast budget with ±10% accuracy: Accurately anticipate cloud data spending for more predictable ROI. Unravel helps you forecast spending accurately with granular cost allocation. Purpose-built AI at the job, user, and workgroup levels enables real-time visibility into ongoing usage.

To see Unravel Data for Snowflake in action, contact us today!

          Logistics giant optimizes cloud data costs up front at speed & scale https://www.unraveldata.com/resources/logistics-giant-optimizes-cloud-data-costs-up-front-at-speed-and-scale/ https://www.unraveldata.com/resources/logistics-giant-optimizes-cloud-data-costs-up-front-at-speed-and-scale/#respond Tue, 20 Jun 2023 16:17:40 +0000 https://www.unraveldata.com/?p=12818 Transitioning big data workloads to the cloud


          One of the world’s largest logistics companies leverages automation and AI to empower every individual data engineer with self-service capability to optimize their jobs for performance and cost. The company was able to cut its cloud data costs by 70% in six months—and keep them down with automated 360° cost visibility, prescriptive guidance, and guardrails for its 3,000 data engineers across the globe. The company pegs the ROI of Unravel at 20X: “for every $1 we invested, we save 20.”

          Key Results

          • 20X ROI from Unravel
          • cut costs by 70% in 6 months
          • 75% time savings via automation
          • proactive guardrails to keep costs within budgets
          • automated AI health checks in CI/CD prevent inefficiencies in production

          Holding individuals accountable for cloud usage/cost

          Like many organizations moving their data workloads to the cloud, the company soon found that its cloud data costs were very rapidly rising to unacceptable levels. Data analytics are core to the business, but the cost of its cloud data workloads was simply getting too unpredictable and expensive. Cloud data expenses had to be brought under control.

          The company chose Unravel to enable a shift-left approach where data engineers become more aware and individually accountable for their cloud usage/spending, and are given the means to make better, more cost-effective decisions when incurring expenses.

          Data is core to the business

          The company is increasingly doing more things with more data for more reasons. Says its Head of Data Platform Optimization, “Data is pervasive in logistics. Data is literally at the center of pretty much everything [we do]. Picking up goods to transport them, following the journeys of those goods, making all the details of those journeys available to customers. Our E Class ships can take 18,000 shipping containers on one journey from, say, China to Europe. One journey on one of those ships moves more goods than was moved in the entire 19th century between continents. One journey. And we’ve got six of them going back and forth all the time.” 


          But the company also uses data to drive innovation in integrated logistics, supply chain resiliency, and corporate social responsibility. “[We’re] a company that doesn’t just use data to figure out how to make money, we use data to better the company, make us more profitable, and at the same time put back into the planet.

          “The data has risen exponentially, and we’re just starting to come to grips with what we can do with it. For example, in tandem with a couple of nature organizations, we worked out that if a ship hits a whale at 12 knots and above, that whale will largely die. Below 12 knots, it will live. We used the data about where the whales were to slow the ships down.”

          Getting visibility into cloud data costs

          The single biggest obstacle to controlling cloud costs for any data-forward organization is having only hazy visibility into cloud usage. The company saw its escalating cloud data platform costs as an efficiency issue—how efficiently the company’s 3,000 “relatively young and inexperienced” data engineers were running their jobs. 

          Says the company’s Head of Data Platform Optimization, “We’ve been moving into the cloud over the past 3-4 years. Everybody knows that [the] cloud isn’t free. There’s not a lot of altruism there from the cloud providers. So that’s the biggest issue we faced. We spent 12 months deploying a leading cloud data platform, and at the end of 12 months, the platform was working fine but the costs were escalating. 

          “The problem with that is, if you don’t have visibility on those costs, you can’t cut those costs. And everybody—no matter what your financial situation—wants to cut costs and keep them down. We had to attain [cost] visibility. Unravel gives us the visibility, the insight, to solve that problem.”

          “The [cloud data] platform was working fine but the costs were escalating. If you don’t have visibility on those costs, you can’t cut those costs.”

          Get costs right in Dev, before going into production

          The logistics company emphasizes that you have to get it right for cost and performance up front, in development. “Don’t ever end up with a cost problem. That’s part of the [shifting] mindset. Get in there early to deal with cost. Go live with fully costed jobs. Don’t go live and then work out what the job cost is and figure out how to cut it. [Determine] what it’s going to cost in Dev/Test, what it’s going to cost in Prod, then check it as soon as it goes live. If the delta’s right, game on.”
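
That Dev-to-Prod delta check can be automated as a simple pre-deployment gate. The sketch below is only an illustration of the idea, not the company's or Unravel's actual mechanism; the job name, cost figures, and 15% tolerance are all hypothetical.

    # Hypothetical CI/CD cost guardrail: fail the pipeline when a job's observed
    # production cost drifts too far from its fully costed Dev/Test estimate.
    def check_cost_delta(job_name: str, dev_cost: float, prod_cost: float,
                         tolerance: float = 0.15) -> bool:
        """Return True if the Prod cost is within `tolerance` of the Dev/Test estimate."""
        if dev_cost <= 0:
            raise ValueError("Jobs should go live with a fully costed Dev/Test estimate")
        delta = (prod_cost - dev_cost) / dev_cost
        if delta > tolerance:
            print(f"FAIL {job_name}: prod ${prod_cost:,.2f} is {delta:.0%} over "
                  f"the ${dev_cost:,.2f} estimate")
            return False
        print(f"PASS {job_name}: delta {delta:+.0%} is within {tolerance:.0%}")
        return True

    if __name__ == "__main__":
        # Illustrative numbers only: the daily job was costed at $120 in Dev/Test.
        ok = check_cost_delta("daily_shipment_rollup", dev_cost=120.0, prod_cost=131.0)
        raise SystemExit(0 if ok else 1)

Wired into CI/CD, a non-zero exit code blocks promotion until the job is re-optimized or the estimate is revisited.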

As the company’s data platform optimization leader points out, “Anybody can spin up a cloud environment.” Quite often their code and resource configurations are not optimized. Individual engineers may request more resources (in size, number, or type) than they actually need to run their jobs successfully, or they may have code issues that lead to inefficient performance—and jobs that cost more than they need to.

          “The way to deal with this [escalating cost] problem is to push it left. Don’t have somebody charging in from Finance waving a giant bill saying, ‘You’re costing a fortune.’ Let’s keep Finance out of the picture. And crucial to this is: Do it up front. Do it in your Dev environment. Don’t go into production, get a giant bill, and only then try to figure out how to cut that.”

          Unravel AI automatically identifies inefficient code, oversized resources, data partitioning problems, and other issues that lead to higher-than-necessary cloud data costs.

          “One of the big problems with optimizing jobs is the sheer scale of what we’re talking about. We have anywhere between 5,000-7,500 data pipelines. You’re not just looking for a needle in a haystack . . . first of all, you have to find the haystack. Then you have to learn how to dig into it. That’s an awful lot of code for human beings to look at, something that machines are perfectly suited to. And Unravel is the best implementation we’ve seen of its kind.”

          The Unravel platform harnesses full-stack visibility, contextual awareness, AI-powered actionable intelligence, and automation to go “beyond observability”—to not only show you what’s going on and why, but guide you with crisp, prescriptive recommendations on exactly how to make things better and then keep them that way proactively. (See the Unravel platform overview page for more detail.)

          “We put Unravel right in the front of our development environment. So nothing goes into production unless we know it’s going to work at the right cost/price. We make sure problems never reach production. We cut them off at the pass, so to speak. Because otherwise, you’ve just invented the world’s best mechanism for closing the stable door after the cost horse has bolted.”

          Empower self-service via immediate feedback loops

          The company used to outsource a huge amount of its data workloads but is now moving to become an open source–first, built-in-house company. A key part of the company’s strategy is to enable strong engineering practices, design tenets (of which cost is one), and culture. For data platform optimization, that means empowering every data engineer with the insights, guidance, and guardrails to optimize their code so that workloads run highly efficiently and cost is not an afterthought.

          “We’ve got approximately 3,000 people churning out Spark code. In a ‘normal environment,’ you can ask the people sitting next to you how they’d do something. We’ve had thousands of engineers working from home for the past two years. So how do you harvest that group knowledge and how do people learn?

“We put Unravel in to look at and analyze every single line of code written, and come up with those micro-suggestions, and indeed macro-suggestions, that you’d miss. We’ve been through everything like code walk-throughs, code dives, all those things that are standard practice. But if you have a couple of thousand engineers writing, say, 10 lines of code a day, you’ll never be able to walk through all that code.”

          That’s where Unravel’s high degree of automation and AI really help. Unravel auto-discovers and captures metadata from every platform, system, and application across the company’s data stack, correlates it all into a meaningful workload-aware context, and automatically analyzes everything to pinpoint inefficiencies and offer up AI-powered recommendations to guide engineers on how to optimize their jobs. 

          “We put Unravel right in the front of our development environment to look at and analyze every single line of code written and come up with suggestions [to improve efficiency].”

          “Data engineers hate fixing live problems. Because it’s boring! And they want to be doing the exciting stuff, keep developing, innovating. So if we can stop those problems at Dev time, make sure they deploy optimal code, it’s a win-win. They never have to fix that production code, and honestly we don’t have to ask them to fix it.”

          The company leverages Unravel’s automated AI analysis to up-level its thousands of developers and engineers worldwide. Optimizing today’s complex data applications/pipelines—for performance, reliability, and cost—requires a deeper level of data engineering.

          “Because Unravel takes data from lots of other organizations, we’re harvesting the benefits of hundreds of thousands of coders and data engineers globally. We’re gaining the insights we couldn’t possibly get by being even the best at self-analysis.

          “The key for me is to be able to go back to an individual data engineer and say, ‘Did you realize that if you did your code this way, you’d be 10 times more efficient?’ And it’s about giving them feedback that allows them to learn themselves. What I love about Unravel is that you get the feedback, but it’s not like they’re getting pulled into an office and having ‘a talk’ about those lines of code. You go into your private workspace, [Unravel] gives you the suggestions, you deal with the suggestions, you learn, you move on and don’t make the mistakes again. And they might not even be mistakes; they might just be things you didn’t know about. What we’re finding with Unravel is that it’s sometimes the nuances that pop up that give you the benefits. It’s pivotal to how we’re going to get the benefits, long term, out of what we’re doing.”

          Efficiency improvements cut cloud data costs by 70%

The company saw almost immediate business value from Unravel’s automated AI-powered analysis and recommendations. “We were up and running within 48 hours. Superb professional services from Unravel, and a really willing team of people from our side. It’s a good mix.”

          The company needed to get cloud data costs under control—fast. More and more mission-critical data workloads were being developed on a near-constant cadence, and these massive jobs were becoming increasingly expensive. Unravel enabled the company to get ahead of its cloud data costs at speed and scale, saving millions.

          20X ROI from Unravel

          “We started in the summer, and by the time Christmas came around, we had cut in excess of 70% of our costs. I’d put the ROI of Unravel at about 20X: every $1 we invested, we save $20.”

          The company has been able to put into individual developers’ and engineers’ hands a tool to make smarter, data-driven decisions about how they incur cloud data expenses.

“What I say to new data engineers is that we will empower them to create the best systems in the world, but only you can empower yourself to make them the most efficient systems in the world. Getting data engineers to actually use Unravel was not a difficult task. We’re very lucky: people on our team are highly motivated to do the right thing, by the company and by themselves. If doing the right thing becomes the default option, people will follow that path.

          “Unravel makes it easy to do the right thing.”

          The post Logistics giant optimizes cloud data costs up front at speed & scale appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/logistics-giant-optimizes-cloud-data-costs-up-front-at-speed-and-scale/feed/ 0
          Demystifying Data Observability https://www.unraveldata.com/resources/demystifying-data-observability/ https://www.unraveldata.com/resources/demystifying-data-observability/#respond Fri, 16 Jun 2023 02:44:21 +0000 https://www.unraveldata.com/?p=12828 Data Graph

          Check out the 2023 Intellyx Analyst Guide for Unravel, Demystifying Data Observability, for an independent discussion on the specific requirements and bottlenecks of data-dependent applications/pipelines that are addressed by data observability. Discover: Why DataOps needs its […]

          The post Demystifying Data Observability appeared first on Unravel.

          ]]>
          Data Graph

          Check out the 2023 Intellyx Analyst Guide for Unravel, Demystifying Data Observability, for an independent discussion on the specific requirements and bottlenecks of data-dependent applications/pipelines that are addressed by data observability.

          Discover:

          • Why DataOps needs its own observability
          • How DevOps and DataOps are similar–and how they’re very different
          • How the emerging discipline of DataFinOps is more than cost governance
          • Unique considerations of DataFinOps
          • DataOps resiliency and tracking down toxic workloads

          The post Demystifying Data Observability appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/demystifying-data-observability/feed/ 0
          Equifax Optimizes GCP Costs at Scale https://www.unraveldata.com/resources/equifax-optimizes-gcp-at-scale/ https://www.unraveldata.com/resources/equifax-optimizes-gcp-at-scale/#respond Mon, 12 Jun 2023 16:49:28 +0000 https://www.unraveldata.com/?p=12812

          The post Equifax Optimizes GCP Costs at Scale appeared first on Unravel.

          ]]>

          The post Equifax Optimizes GCP Costs at Scale appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/equifax-optimizes-gcp-at-scale/feed/ 0
          Managing FinOps at Equifax https://www.unraveldata.com/resources/managing-finops-at-equifax/ https://www.unraveldata.com/resources/managing-finops-at-equifax/#respond Mon, 12 Jun 2023 16:49:11 +0000 https://www.unraveldata.com/?p=12815

          The post Managing FinOps at Equifax appeared first on Unravel.

          ]]>

          The post Managing FinOps at Equifax appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/managing-finops-at-equifax/feed/ 0
          Governing Costs with FinOps for Cloud Analytics https://www.unraveldata.com/resources/governing-costs-with-finops/ https://www.unraveldata.com/resources/governing-costs-with-finops/#respond Tue, 02 May 2023 18:49:05 +0000 https://www.unraveldata.com/?p=11986 circular abstract graphic

          Check out the latest white paper from Eckerson Group: Governing Cost with FinOps for Cloud Analytics: Program Elements, Use Cases, and Principles by VP of Research Kevin Petrie. Discover how to turn cloud usage-based pricing to […]

          The post Governing Costs with FinOps for Cloud Analytics appeared first on Unravel.

          ]]>
          circular abstract graphic

          Check out the latest white paper from Eckerson Group: Governing Cost with FinOps for Cloud Analytics: Program Elements, Use Cases, and Principles by VP of Research Kevin Petrie.

          • Discover how to turn cloud usage-based pricing to your advantage.
          • Learn how data observability tools can help you pay for only what you use, and only use what you need.
• Read about FinOps use cases, including business, data, and IT.
          • Read how cross-functional teams govern cloud costs as they forecast, monitor, and account for resources.
          • Explore the FinOps lifecycle, including design, operation, and optimization.

          Download this actionable white paper today and discover how a well-implemented FinOps program can drive measurable, achievable ROI for cloud-based analytics projects.

          Get your free copy of the white paper.

          The post Governing Costs with FinOps for Cloud Analytics appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/governing-costs-with-finops/feed/ 0
          Data observability’s newest frontiers: DataFinOps and DataBizOps https://www.unraveldata.com/resources/datafinops-and-databizops/ https://www.unraveldata.com/resources/datafinops-and-databizops/#respond Thu, 20 Apr 2023 20:02:20 +0000 https://www.unraveldata.com/?p=11901 Computer Network Background Abstract

          Check out Sanjeev Mohan’s, Principal, SanjMo & Former Gartner Research VP, Big Data and Advanced Analytics, chapter on “Data observability’s newest frontiers: DataFinOps and DataBizOps” in the book, Data Observability, The Reality. What you’ll learn from […]

          The post Data observability’s newest frontiers: DataFinOps and DataBizOps appeared first on Unravel.

          ]]>
          Computer Network Background Abstract

Check out the chapter “Data observability’s newest frontiers: DataFinOps and DataBizOps” by Sanjeev Mohan, Principal at SanjMo and former Gartner Research VP for Big Data and Advanced Analytics, in the book Data Observability, The Reality.

          What you’ll learn from the former VP of Research at Gartner:

          • DataFinOps defined
          • Why you need DataFinOps
          • The challenges of DataFinOps
          • DataFinOps case studies & more!

          Don’t miss out on your chance to read this chapter and gain valuable insights from a top industry leader.

          The post Data observability’s newest frontiers: DataFinOps and DataBizOps appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/datafinops-and-databizops/feed/ 0
          Data Observability, The Reality eBook https://www.unraveldata.com/resources/data-observability-the-reality-ebook/ https://www.unraveldata.com/resources/data-observability-the-reality-ebook/#respond Tue, 18 Apr 2023 14:07:13 +0000 https://www.unraveldata.com/?p=11829 Abstract Chart Background

          Thrilled to announce that Unravel is contributing a chapter in Ravit Jain’s new ebook, Data Observability, The Reality. What you’ll learn from reading this ebook: What data observability is and why it’s important Identify key components […]

          The post Data Observability, The Reality eBook appeared first on Unravel.

          ]]>
          Abstract Chart Background

Thrilled to announce that Unravel is contributing a chapter to Ravit Jain’s new ebook, Data Observability, The Reality.

          What you’ll learn from reading this ebook:

• What data observability is and why it’s important
• The key components of an observability framework
• How to design and implement a data observability strategy
• Real-world use cases and best practices for data observability
• Tools and techniques for monitoring and troubleshooting data pipelines

          Don’t miss out on your chance to read our chapter, “Automation and AI Are a Must for Data Observability,” and gain valuable insights from top industry leaders such as Sanjeev Mohan and others.

          The post Data Observability, The Reality eBook appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/data-observability-the-reality-ebook/feed/ 0
          DataFinOps: Holding individuals accountable for their own cloud data costs https://www.unraveldata.com/resources/datafinops-holding-individuals-accountable-for-their-own-cloud-data-costs/ https://www.unraveldata.com/resources/datafinops-holding-individuals-accountable-for-their-own-cloud-data-costs/#respond Mon, 06 Mar 2023 20:32:16 +0000 https://www.unraveldata.com/?p=11615 Abstract light image

          Most organizations spend at least 37% (sometimes over 50%) more than they need to on their cloud data workloads. A lot of costs are incurred down at the individual job level, and this is usually where […]

          The post DataFinOps: Holding individuals accountable for their own cloud data costs appeared first on Unravel.

          ]]>
          Abstract light image

Most organizations spend at least 37% (sometimes over 50%) more than they need to on their cloud data workloads. A lot of costs are incurred down at the individual job level, and this is usually where the biggest chunk of overspending lies. Two of the biggest culprits are oversized resources and inefficient code.

But for an organization running 10,000s or 100,000s of jobs, finding and fixing bad code or right-sizing resources is shoveling sand against the tide: too many jobs taking too much time and too much expertise. That’s why more and more organizations are progressing to the next logical step, DataFinOps, and leveraging observability, automation, and AI to find, and ideally fix, all those thousands of places where costs could and should be optimized. This is the stuff that, without AI, takes even experts hours or days (even weeks) to figure out. In a nutshell, DataFinOps empowers “self-service” optimization, where AI does the heavy lifting to show people exactly what they need to do to get cost right from the get-go.

          DataFinOps and job-level “spending decisions”

          One of the highly effective core principles of DataFinOps–the overlap and marriage between DataOps and FinOps–is the idea of holding every individual accountable for their own cloud usage and cost. Essentially, shift cost-control responsibility left, to the people who are actually incurring the expenses.

          Now that’s a big change. The teams developing and running all the data applications/pipelines are, of course, still on the hook for delivering reliable results on time, every time–but now cost is right up there with performance and quality as a co-equal SLA. But it’s a smart change: let’s get the cost piece right, everywhere, then keep it that way.

At any given time, your organization probably has hundreds of people running 1,000s of individual data jobs in the cloud–building a Spark job, streaming with Kafka, or doing something in dbt or Databricks. And the meter is always ticking, all the time. How high that meter runs depends a lot on thousands of individual data engineering “spending decisions” about how a particular job is set up to run. In our experience with data-forward organizations over the years, as much as 60% of their cost savings were found by optimizing things at the job level.

          complex data pipeline

          Enterprises have 100,000s of places where cloud data spending decisions need to be made.

          When thinking about optimizing cloud data costs at the job level, what springs to mind immediately is infrastructure. And running oversized/underutilized cloud resources is a big problem–one that the DataFinOps approach is especially good at tackling. Everything runs on a machine, sooner or later, and data engineers have to make all sorts of decisions about the number, size, type of machine they should request, how much memory to call for, and a slew of other configuration considerations. And every decision carries a price tag. 

          There’s a ton of opportunity to eliminate inefficiency and waste here. But the folks making these spending decisions are not experts at making these decisions. Very few people are. Even a Fortune 500 enterprise could probably count on one hand the number of experts who can “right-size” all the configuration details accurately. There just aren’t enough of these people to tackle the problem at scale. 

          But it’s not just “placing the infrastructure order” that drives up the cost of cloud data operations unnecessarily. Bad code costs money, but finding and fixing it takes–again–time and expertise.

          So, we have the theory of FinOps kind of hitting a brick wall when it comes into contact with the reality of today’s data estate operations. We want to hold the individual engineers accountable for their cloud usage and spend, but they don’t have the information at their fingertips to be able to use and spend cloud resources wisely.

          In fact, “getting engineers to take action on cost optimization” remains the #1 challenge according to the FinOps Foundation’s State of FinOps 2023 survey. But that stat masks just how difficult it is. And to get individuals to make the most cost-effective choices about configuring and running their particular jobs–which is where a lot of the money is being spent–it has to be easy for them to do the right thing. That means showing them what the right thing is. Otherwise, even if they knew exactly what they were looking for, sifting through thousands of logs and cloud vendor billing data is time-sucking toil. They don’t need more charts and graphs that, while showing a lot of useful information, still leave it to you to figure out how to fix things. 

          top FinOps challenges 2023

          That’s where DataFinOps comes in. Combining end-to-end, full-stack observability data (at a granular level), a high degree of automation, and AI capabilities, DataFinOps identifies a wide range of cost control opportunities–based on, say, what resources you have allocated vs. what resources you actually need to run the job successfully–then automatically provides a prescriptive AI-generated recommendation on what, where, how to make things more cost-effective.

          Using AI to solve oversized resources and inefficient code at the job level

DataFinOps uses observability, automation, and AI to do a few things. First, it collects all sorts of information that is “hidden in plain sight”–performance metrics, logs, traces, events, and metadata from the dozens of components you have running in your modern data stack, at both the job (Spark, Kafka) and platform (Databricks, Snowflake, BigQuery, Amazon EMR, etc.) level; details from the cloud vendors about what is running, where, and how much it costs; details about the datasets themselves, including lineage and quality–and then stitches it all together into an easily understandable, correlated context. Next, DataFinOps applies math to all that data, running hundreds of well-trained AI algorithms and ML models to analyze usage and cost. AI can do this kind of investigation, or detective work, faster and more accurately than humans, especially at scale, when we’re looking at thousands and thousands of individual spending decisions.
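
As a rough illustration of what “stitching it all together” involves, the sketch below joins cloud billing line items with job-run metadata to get per-job, per-team cost. The column names and numbers are made up; a real implementation would pull this from billing exports and schedulers rather than inline DataFrames.

    import pandas as pd

    # Illustrative cloud billing export: cost per cluster per hour.
    billing = pd.DataFrame({
        "cluster_id": ["c-001", "c-001", "c-002"],
        "usage_hour": ["2024-05-01T01", "2024-05-01T02", "2024-05-01T01"],
        "cost_usd":   [4.20, 4.20, 9.80],
    })

    # Illustrative job-run metadata keyed by the cluster each job ran on.
    job_runs = pd.DataFrame({
        "cluster_id": ["c-001", "c-002"],
        "job_name":   ["churn_model_features", "daily_sales_rollup"],
        "team":       ["data-science", "analytics-eng"],
    })

    # Correlate spend with the workload that incurred it, then roll up per team and job.
    job_costs = (
        billing.merge(job_runs, on="cluster_id", how="left")
               .groupby(["team", "job_name"], as_index=False)["cost_usd"].sum()
               .sort_values("cost_usd", ascending=False)
    )
    print(job_costs)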

          Two areas where AI helps individuals cut to the chase and actually do something about eliminating overspending at the job level–which is where the biggest and longer-term cost savings opportunities can be found–are oversized resources and inefficient code.

Watch our DataFinOps virtual discussion on demand (no form to fill out).

          Oversized resources are simply where somebody (or lots of somebodies) requested more resources from the cloud provider than are actually needed to run the job successfully. What DataFinOps AI does is analyze everything that’s running, pinpoint among the thousands of jobs where the number, size, type of resources you’re using is more than you need, calculate the cost implications, and deliver up a prescriptive recommendation (in plain English) for a more cost-effective configuration.

          For example, in the screenshot below, the AI has identified more cost-effective infrastructure for this particular job at hand, based on real-time data from your environment. The AI recommendation specifies exactly what type of resource, with exactly how much memory and how many cores, would be less expensive and still get the job done. It’s not quite a self-healing system, but it’s pretty close. 

          AI recommendation to downsize instance type
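
Stripped of the AI, the core of a right-sizing check like the one shown above is a comparison of requested versus peak-used resources against a price list. Here is a minimal sketch; the three-entry instance catalog, prices, and headroom factor are illustrative, not a statement of how Unravel’s recommendations are computed.

    # Hypothetical right-sizing check: compare what a job requested with what it
    # actually used at peak, and suggest the cheapest instance type that still fits.
    CATALOG = [  # (instance_type, vCPUs, memory_GiB, hourly_usd) -- illustrative numbers
        ("m5.xlarge",  4,  16, 0.192),
        ("m5.2xlarge", 8,  32, 0.384),
        ("m5.4xlarge", 16, 64, 0.768),
    ]

    def recommend(requested_type: str, peak_cores: float, peak_mem_gib: float,
                  headroom: float = 1.2):
        current = next(c for c in CATALOG if c[0] == requested_type)
        # Smallest instance that covers peak usage plus headroom.
        fits = [c for c in CATALOG
                if c[1] >= peak_cores * headroom and c[2] >= peak_mem_gib * headroom]
        best = min(fits, key=lambda c: c[3]) if fits else current
        if best[3] < current[3]:
            savings = (current[3] - best[3]) / current[3]
            return f"Downsize {requested_type} -> {best[0]} (~{savings:.0%} cheaper per hour)"
        return f"{requested_type} is already the best fit"

    print(recommend("m5.4xlarge", peak_cores=5.5, peak_mem_gib=20))
    # -> Downsize m5.4xlarge -> m5.2xlarge (~50% cheaper per hour)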

          Code problems may be an even more insidious contributor to overspending. Workloads are constantly moving from on-prem Cloudera/Hadoop environments to the cloud. There’s been an explosion of game-changing technologies and platforms. But we’ve kind of created a monster: everybody seems to be using a different system, and it’s often different from the one they used yesterday. Everything’s faster, and there’s more of it. Everybody is trying to do code checks as best they can, but the volume and velocity of today’s data applications/pipelines make it a losing battle. 

          DataFinOps pinpoints where in all that code there are problems causing cloud costs to rise unnecessarily. The ML model recognizes anti-patterns in the code–it’s learned from millions of jobs just like this one–and flags the issue.

          In the example below, there’s a slow SQL stage: Query 7 took almost 6½ minutes. The AI/ML has identified this as taking too long, an anomaly or at least something that needs to be corrected. It performs an automated root cause analysis to point directly to what in the code needs to be optimized. 

          AI recommendation to pinpoint code problems
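
The “flag the slow stage” step can be approximated with a simple statistical check over per-stage durations pulled from the Spark history server or similar. The durations below are made up, and in practice a trained model replaces the crude z-score threshold shown here.

    import statistics

    # Illustrative per-stage durations (seconds) for one Spark SQL query.
    stage_durations = {
        "stage_1_scan":    42,
        "stage_3_join":    55,
        "stage_5_shuffle": 61,
        "stage_7_sort":   389,   # the roughly 6.5-minute outlier described above
    }

    mean = statistics.mean(stage_durations.values())
    stdev = statistics.pstdev(stage_durations.values())

    for stage, seconds in stage_durations.items():
        z = (seconds - mean) / stdev if stdev else 0.0
        if z > 1.5:  # crude anomaly threshold; an ML model does this job in practice
            print(f"{stage}: {seconds}s looks anomalous (z={z:.1f}) -- "
                  "inspect its plan for skewed joins or missing partition pruning")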

          Every company that’s running cloud data workloads is already trying to crack this nut of how to empower individual data engineers with this kind of “self-service” ability to optimize for costs themselves. It’s their spending decisions here at the job level that have such a big impact on overall cloud data costs. But you have to have all the data, and you have to have a lot of automation and AI to make self-service both easy and impactful.

          AI recommendations show the individual engineers what to do, so being held accountable for cloud usage and cost is no longer the showstopper obstacle. From a wider team perspective, the same information can be rolled up, sliced-and-diced, into a dashboard that gives a clear picture on the overall state of cost optimization opportunities. You can see how you’re doing (and what still needs to be done) with controlling costs at the cluster or job or even individual user level.

          dashboard listing all job-level cost optimizations

          Bottom line: The best way to control cloud data costs in a meaningful way, at scale and speed, with real and lasting impact, is the DataFinOps approach of full-stack observability, automation, and AI.

          The post DataFinOps: Holding individuals accountable for their own cloud data costs appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/datafinops-holding-individuals-accountable-for-their-own-cloud-data-costs/feed/ 0
          DataFinOps: More on the menu than data cost governance https://www.unraveldata.com/resources/datafinops-more-on-the-menu-than-data-cost-governance/ https://www.unraveldata.com/resources/datafinops-more-on-the-menu-than-data-cost-governance/#respond Fri, 24 Feb 2023 20:14:25 +0000 https://www.unraveldata.com/?p=11573 Potato Menu

          By Jason English, Principal Analyst, Intellyx Part 3 in the Demystifying Data Observability Series, by Intellyx for Unravel Data IT and data executives find themselves in a quandary about deciding how to wrangle an exponentially increasing […]

          The post DataFinOps: More on the menu than data cost governance appeared first on Unravel.

          ]]>
          Potato Menu

          By Jason English, Principal Analyst, Intellyx
          Part 3 in the Demystifying Data Observability Series, by Intellyx for Unravel Data

          IT and data executives find themselves in a quandary about deciding how to wrangle an exponentially increasing volume of data to support their business requirements – without breaking an increasingly finite IT budget.

          Like an overeager diner at a buffet who’s already loaded their plate with the cheap carbs of potatoes and noodles before they reach the protein-packed entrees, they need to survey all of the data options on the menu before formulating their plans for this trip.

          In our previous chapters of this series, we discussed why DataOps needs its own kind of observability, and then how DataOps is a natural evolution of DevOps practices. Now there’s a whole new set of options in the data observability menu to help DataOps teams track the intersection of value and cost.

          From ROI to FinOps

          Executives can never seem to get their fill of ROI insights from IT projects, so they can measure bottom-line results or increase top-line revenue associated with each budget line item. After all, predictions about ROI can shape the perception of a company for its investors and customers.

          Unfortunately, ROI metrics are often discussed at the start of a major technology product or services contract – and then forgotten as soon as the next initiative gets underway.

          The discipline of FinOps burst onto the scene over the last few years, as a strategy to address the see-saw problem of balancing the CFO’s budget constraints with the CIO’s technology delivery requirements to best meet the current and future needs of customers and employees.

          FinOps focuses on improving technology spending decisions of an enterprise using measurements that go beyond ROI, to assess the value of business outcomes generated through technology investments.

          Some considerations frequently seen on the FinOps menu include:

          • Based on customer demand or volatility in our consumption patterns, should we buy capacity on-demand or reserve more cloud capacity?
          • Which FinOps tools should we buy, and what functionality should we build ourselves, to deliver this important new capability?
          • Which cloud cost models are preferred for capital expenditures (capex) projects and operational expenditures (opex)?
          • What is the potential risk and cost of known and unknown usage spikes, and how much should we reasonably invest in analysts and tools for preventative purposes?

          As a discipline, FinOps has come a long way, building communities of interest among expert practitioners, product, business, and finance teams as well as solution providers through its own FinOps Foundation and instructional courses on the topic.

          FinOps + DataOps = DataFinOps?

          Real-time analytics and AI-based operational intelligence are enabling revolutionary business capabilities, enterprise-wide awareness, and innovative machine learning-driven services. All of this is possible thanks to a smorgasbord of cloud data storage and processing, cloud data lakes, cloud data warehouse, and cloud lakehouse options.

          Unfortunately, the rich streams of data required for such sophisticated functionality bring along the unwanted side effect of elastically expanding budgetary waistbands, due to ungoverned cloud storage and compute consumption costs. Nearly a third of all data science projects go more than 40% over budget on cloud data, according to a recent survey–a huge delta between cost expectations and reality.

          How can better observability into data costs help the organization wring more value from data assets without cutting into results, or risking cost surprises?

          As it turns out, data has its own unique costs, benefits, and value considerations. Combining the disciplines of FinOps and DataOps – which I’ll dub DataFinOps just for convenience here – can yield a unique new set of efficiencies and benefits for the enterprise’s data estate.

Some of the unique considerations of DataFinOps:

          • Which groups within our company are the top spenders on cloud data analytics, and is anything anomalous about their spending patterns versus the expected budgets?
          • What is the value of improving data performance or decreasing the latency of our data estate by region or geography, in order to improve local accuracy, reduce customer and employee attrition and improve retention?
          • If we are moving to a multi-cloud, hybrid approach, what is an appropriate and realistic mix of reserved instances and spot resources for processing data of different service level agreements (SLAs)?
• Where are we paying excessive ingress/egress fees within our data estate? Would it be more cost-effective to process workloads close to where the data lives, or to move the data elsewhere?
          • How much labor do our teams spend building and maintaining data pipelines, and what is that time worth?
          • Are cloud instances being intelligently right-sized and auto-scaled to meet demand?

          Systems-oriented observability platforms such as DataDog and Dynatrace can measure system or service level telemetry, which is useful for a DevOps team looking at application-level cloud capacity and cost/performance ratios. Unfortunately these tools do not dig into enough detail to answer data analytics-specific FinOps questions.

          Taming a market of data options

          Leading American grocery chain Kroger launched its 84.51° customer experience and data analytics startup to provide predictive data insights and precision marketing for its parent company and other retailers, across multiple cloud data warehouses such as Snowflake and Databricks, using data storage in multiple clouds such as Azure and GCP.

          Using the Unravel platform for data observability, they were able to get a grip on data costs and value across multiple data platforms and clouds without having to train up more experts on the gritty details of data job optimization within each system.

          “The end result is giving us tremendous visibility into what is happening within our platforms. Unravel gave recommendations to us that told us what was good and bad. It simply cut to the chase and told us what we really needed to know about the users and sessions that were problematic. It not only identified them, but then made recommendations that we could test and implement.”

          – Jeff Lambert, Vice President Engineering, 84.51°

          It’s still early days for this transformation, but a data cost reduction of up to 50% would go a long way toward extracting value from deep customer analytics, as transaction data volumes continue to increase by 2x or 3x a year as more sources come online.

          The Intellyx Take

          It would be nice if CFOs could just tell CIOs and CDOs to simply stop consuming and storing so much data, and have that reduce their data spend. But just like in real life, crash diets will never produce long-term results, if the ‘all-you-can-eat’ data consumption pattern isn’t changed.

The hybrid IT underpinnings of advanced data-driven applications evolve almost every day. To achieve sustainable improvements in cost/benefit returns on data, analysts and data scientists would have to become experts on the inner workings of each public cloud and data warehousing vendor.

DataFinOps practices should encourage data team accountability for value improvements, but more importantly, they should give teams the data observability, AI-driven recommendations, and governance controls necessary both to contain costs and to stay ahead of the organization’s growing business demand for data across hybrid IT data resources and clouds.

          ©2023 Intellyx LLC. Intellyx is editorially responsible for the content of this document. At the time of writing, Unravel is an Intellyx customer. Image source: crayion.ai

          The post DataFinOps: More on the menu than data cost governance appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/datafinops-more-on-the-menu-than-data-cost-governance/feed/ 0
          Three Companies Driving Better Business Outcomes from Data Analytics https://www.unraveldata.com/resources/three-companies-driving-better-business-outcomes-from-data-analytics/ https://www.unraveldata.com/resources/three-companies-driving-better-business-outcomes-from-data-analytics/#respond Thu, 09 Feb 2023 19:00:08 +0000 https://www.unraveldata.com/?p=11485 Abstract light image

          You’re unlikely to be able to build a business without data. But how can you use it effectively? There are so many ways you can use data in your business, from creating better products and services […]

          The post Three Companies Driving Better Business Outcomes from Data Analytics appeared first on Unravel.

          ]]>
          Abstract light image

          You’re unlikely to be able to build a business without data.

          But how can you use it effectively?

          There are so many ways you can use data in your business, from creating better products and services for customers, to improving efficiency and reducing waste.

Enter data observability. Using agile development practices, companies can create, deliver, and optimize data products quickly and cost-effectively. Organizations can easily identify a problem to solve and then break it down into smaller pieces. Each piece is then assigned to a team that breaks down the work to solve the problem into a defined period of time – usually called a sprint – that includes planning, work, deployment, and review.

          Marion Shaw, Head of Data and Analytics at Chaucer Group, and Unravel’s Ben Cooper presented on transforming data analytics to build better products and services and on making data analytics more efficient, respectively, at a recent Chief Disruptor virtual event.

          Following on from the presentation, members joined a roundtable discussion and took part in a number of polls, in order to share their experiences. Here are just some examples of how companies have used data analytics to drive innovation:

• Improved payments. An influx of customer calls asking “where’s my money or payment?” prompted a company to introduce a “track payments” feature as a way of digitally understanding payment status. As a result, the volume of callers decreased, while the number of users of the new feature actually eclipsed the number of original complaints, which proved there was a group of customers who couldn’t be bothered to report problems but still found the feature useful. “If you make something easy for your customers, they will use it.”
• Cost reduction and sustainability: Moving from plastic to paper cups reduced costs and improved sustainability for one company, showing how companies can use their own data to make business decisions.
• New products: Using AI and data collaboration in drug discovery to explore disease patterns can help pharmaceutical companies find new treatments for diseases, with the potential for high returns, even though discovery with big data sets remains expensive.

          The key takeaways from the discussion were:

          • Make it simple. When you make an action easy for your customers, they will use it.
          • Lean on the data. If there isn’t data behind someone’s viewpoint, then it is simply an opinion.
          • Get buy-in. Data teams need to buy into the usage of data—just because a data person owns the data side of things does not mean that they are responsible for the benefits or failings of it.

Using data analytics effectively with data observability is key. Companies across all industries are using data observability to create better products and services, reduce waste, and improve productivity.

          But data observability is no longer just about data quality or observing the condition of the data itself. Today it encompasses much more, and you can’t “borrow” your software teams’ observability solution. Discover more in our report DataOps Observability: The Missing Link for Data Teams.

          The post Three Companies Driving Better Business Outcomes from Data Analytics appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/three-companies-driving-better-business-outcomes-from-data-analytics/feed/ 0
          Maximize Business Results with FinOps https://www.unraveldata.com/resources/maximize-business-results-with-finops/ https://www.unraveldata.com/resources/maximize-business-results-with-finops/#respond Thu, 09 Feb 2023 17:32:33 +0000 https://www.unraveldata.com/?p=11476

          As organizations run more data applications and pipelines in the cloud, they look for ways to avoid the hidden costs of cloud adoption and migration. Teams seek to maximize business results through cost visibility, forecast accuracy, […]

          The post Maximize Business Results with FinOps appeared first on Unravel.

          ]]>

          As organizations run more data applications and pipelines in the cloud, they look for ways to avoid the hidden costs of cloud adoption and migration. Teams seek to maximize business results through cost visibility, forecast accuracy, and financial predictability.

          Watch the breakout session video from Data Teams Summit and see how organizations apply agile and lean principles using the FinOps framework to boost efficiency, productivity, and innovation. Transcript available below.

          Transcript

          Clinton Ford:

Hi, and welcome to this session, Maximize Business Results with FinOps. I’m Clinton Ford, director of Product Marketing at Unravel, and I’m joined today by Thiago Gil, an ambassador from the FinOps Foundation and a KubeCon + CloudNativeCon 2021/2022 Kubernetes AI Day Program Committee member. Great to have you with us today, Thiago. Thank you.

          Now, if you have any questions during our session, please feel free to put those in the Q&A box, or visit Unravel booth after this session. Happy to answer your questions.
          So today, we’ll talk about some of the challenges that you face with cloud adoption and how FinOps empowers you and your team to harness those investments, maximize business results, and we’ll share some success stories from companies who are applying these principles. Then Thiago is going to share the state of production machine learning.

          So among the challenges that you face, the first is visibility. Simply understanding and deciphering the cloud bill can be an enormous hurdle, and forecasting spend can be really difficult to do accurately.

The second is how to optimize your costs once you get visibility. There are complex dependencies, as you know, within your data pipeline. So making adjustments to resources can have downstream effects, and you don’t want to interrupt the flow of those pipelines.

          Finally, governance. So governing that cloud spending is hard. With over 200 services and over 600 instance types on AWS alone, it’s difficult to define what good looks like. The result is that on average, organizations report their public cloud spend is over budget by 13%, and they expect cloud spending to increase by 29% this year.

Observability is key here, because it unlocks several important benefits. First, visibility: just getting full visibility to understand where spending is going and how teams are tracking toward their budgets.

Second, granularity: seeing the spending details by data team, by pipeline, by data application or product or division. And third, forecasting: seeing those trends and being able to project out accurately to help predict future spending and profitability.

          So data management represents approximately 40% of the typical cloud bill, and data management services are the fastest growing category of cloud service spending. It’s also driving a lot of the incredible revenue growth that we’ve seen, and innovation in products.

          When you combine the best of DataOps and the best of FinOps, you get DataFinOps, and DataFinOps empowers data engineers and business teams to make better choices about your cloud usage. It helps you get the most from your modern data stack investments.

          A FinOps approach, though, isn’t just about slashing costs. Although you’ll almost invariably wind up saving money, it’s about empowering data engineers and business teams to make better choices about their cloud usage, and derive the most value from their modern data stack investments.

Managing costs consists of three iterative phases. The first is getting visibility into where the money is going: measuring what’s happening in your cloud environment and understanding what’s going on in a workload-aware context. Once you have that observability, next you can optimize. You begin to see patterns emerge where you can eliminate waste, remove inefficiencies, and actually make things better. Then you can go from reactive problem solving to proactive problem preventing, sustaining iterative improvements by automating guardrails and enabling self-service optimization.

          So each phase builds upon the previous one to create a virtuous cycle of continuous improvement and empowerment for individual team members, regardless of their expertise, to make better decisions about their cloud usage while still hitting their SLAs and driving results. In essence, this shifts the budget left, pulling accountability for managing costs forward.

          So now let me share a few examples of FinOps success. A global company in the healthcare industry discovered they were spending twice their target spending for Amazon EMR. They could manually reduce their infrastructure spending without using observability tools, but each time they did, they saw job failures happen as a result. They wanted to understand the reason why cost was so far above their expected range.

          Using observability tools, they were able to identify the root cause for the high costs and reduce them without failures.

          Using a FinOps approach, they were able to improve EMR efficiency by 50% to achieve their total cost of ownership goals. Using the FinOps framework, their data analytics environment became much easier to manage. They used the best practices from optimizing their own cloud infrastructure to help onboard other teams, and so they were able to improve the time to value across the entire company.

          A global web analytics company used a FinOps approach to get the level of fidelity that they needed to reduce their infrastructure costs by 30% in just six months. They started by tagging the AWS infrastructure that powered their products, such as EC2 instances, EBS volumes, RDS instances and network traffic.

          The next step was to look by product and understand where they could get the biggest wins. As they looked across roughly 50 different internal projects, they were able to save more than five million per year, and after running some initial analysis, they realized that more than 6,000 data sources were not connected to any destinations at all, or were sent to destinations with expired credentials.

          They were wasting $20,000 per month on unused data infrastructure. The team provided daily reporting on their top cost drivers visualized in a dashboard, and then using this information, they boosted margins by 20% in just 90 days.
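
The tag-then-group reporting described here maps closely to the AWS Cost Explorer API. A minimal sketch, assuming resources carry a “product” cost-allocation tag and the caller has ce:GetCostAndUsage permission (the tag key and dates are illustrative):

    import boto3

    # Cost Explorer is served from us-east-1 regardless of where resources run.
    ce = boto3.client("ce", region_name="us-east-1")

    response = ce.get_cost_and_usage(
        TimePeriod={"Start": "2024-05-01", "End": "2024-06-01"},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "TAG", "Key": "product"}],  # assumes a "product" tag on resources
    )

    for group in response["ResultsByTime"][0]["Groups"]:
        tag_value = group["Keys"][0]          # e.g. "product$analytics-api"
        amount = group["Metrics"]["UnblendedCost"]["Amount"]
        print(f"{tag_value}: ${float(amount):,.2f}")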

          All right, with that, let’s hand it over to Thiago to give us an update on the state of production machine learning. Thiago, over to you.

          Thiago Gil:

Thank you, Clinton. Let’s talk about the state of production ML. This includes understanding the challenges and best practices for deploying, scaling, and managing ML models in production environments, and how FinOps principles and Kubernetes can help organizations optimize and manage the costs associated with their ML workloads, improving the efficiency, scalability, and cost-effectiveness of their models while aligning them with business objectives.

ML is moving to Kubernetes because it provides a flexible and scalable platform for deploying and managing machine learning models… Kubernetes [inaudible 00:07:29] getting resources such as CPU and memory to match the demands of our workloads. Additionally, Kubernetes provides features such as autoscaling, self-healing, and service discovery, which are useful in managing and deploying ML models in a production environment.

The FinOps framework, which includes principles such as team collaboration, ownership of cloud usage, a centralized team for financial operations, real-time reporting, decisions driven by business value, and taking advantage of the variable cost model of the cloud, can relate to Kubernetes in several ways.

Kubernetes can also be used to allocate costs to specific teams or projects, and to track and optimize the performance and cost of workloads in real time. By having a centralized team for financial operations and collaboration among teams, organizations can make better decisions driven by business value, and take advantage of the variable cost model of the cloud by only paying for the resources they use.

          FinOps principles such as optimization, automation, cost allocation, and monitoring and metrics can be applied to ML workloads running on Kubernetes to improve their efficiency, scalability and cost effectiveness.

Kubernetes, by its nature, allows for cloud-agnostic workloads. It means that workloads deployed on Kubernetes can run on any cloud provider or on premises. This allows for more flexibility in terms of where ML workloads are deployed and can help to avoid vendor lock-in.

FinOps can help DataOps teams identify and eliminate unnecessary expenses, such as redundant or underutilized resources. This can include optimizing cloud infrastructure costs, negotiating better pricing for services and licenses, and identifying opportunities to recycle or repurpose existing resources.

FinOps can help DataOps teams develop financial plans that align with business goals and priorities, such as investing in new technologies or expanding data capabilities.
          By setting clear financial objectives and budgets, DataOps teams can make more informed decisions about how to allocate resources and minimize costs.

FinOps can help data teams automate financial processes, such as invoice and payment tracking, to reduce the time and effort needed to manage these tasks. This can free up DataOps team members to focus on more strategic tasks, such as data modeling and analysis. FinOps helps DataOps teams track financial performance and identify areas for improvement. This can include monitoring key financial metrics, such as cost per data unit or return on investment, to identify opportunities to reduce costs and improve efficiency.

A FinOps team, sometimes known as a Cloud Cost Center of Excellence, is a centralized team within an organization that is responsible for managing and optimizing the financial aspects of the organization’s cloud infrastructure. This team typically has a broad remit that includes monitoring and analyzing cloud usage and cost, developing and implementing policies and best practices, collaborating with teams across the organization, providing guidance and support, providing real-time reporting, and continuously monitoring and adapting to changes in cloud pricing and services. The goal of this team is to provide a centralized point of control and expertise for all cloud-related financial matters, ensuring that the organization’s cloud usage is optimized, cost-effective, and aligned with the overall business objectives.

A product mindset focuses on delivering value to the end user and the business, which helps data teams better align their efforts with the organization’s goals and priorities.

Changing the mindset from projects to products can help improve collaboration. When FinOps teams adopt a product mindset, collaboration improves between the teams responsible for creating and maintaining those products. Cost transparency allows an organization to clearly understand and track the costs associated with its operations, including its cloud infrastructure, by providing visibility into cloud usage, costs, and performance metrics; forecasting future costs; making data-driven decisions; allocating costs; improving collaboration; and aligning cloud usage with overall business objectives.

When moving workloads to the cloud, organizations may discover hidden costs related to Kubernetes, such as the cost of managing and scaling the cluster, the cost of running the control plane itself, and the cost of networking and storage. These hidden costs can arise from not fully understanding the pricing models of cloud providers, not properly monitoring or managing usage of cloud resources, or not properly planning for data transfer or storage costs.

          Applications that require different amounts of computational power can be placed on that spectrum. Some applications like training large AI models require a lot of processing power to keep GPUs fully utilized during training processes by batch processing hundreds of data samples in parallel. However, other applications may only require a small amount of processing power, leading to underutilization of the computational power of GPUs.

When it comes to GPU resources, Kubernetes does not have native support for GPU allocation, and must rely on third-party solutions, such as the Kubernetes device plugin, to provide this functionality. These solutions add an extra layer of complexity to resource allocation and scaling, as they require additional configuration and management.
Additionally, GPUs are not as easily shareable as CPU resources, and have more complex life cycles. They have to be allocated and deallocated to specific pods and managed by Kubernetes itself. This can lead to situations where GPU resources are not being fully utilized, or where multiple pods try to access the same GPU resources at the same time, resulting in contention and performance issues.
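
One practical starting point for the GPU visibility Thiago describes is simply inventorying which pods request GPUs, for example with the official Kubernetes Python client. This is a read-only sketch; it assumes the NVIDIA device plugin’s standard nvidia.com/gpu resource name and a reachable kubeconfig.

    from kubernetes import client, config

    config.load_kube_config()          # or config.load_incluster_config() inside a pod
    v1 = client.CoreV1Api()

    GPU_RESOURCE = "nvidia.com/gpu"    # standard name exposed by the NVIDIA device plugin

    for pod in v1.list_pod_for_all_namespaces().items:
        for container in pod.spec.containers:
            requests = container.resources.requests or {}
            gpus = requests.get(GPU_RESOURCE)
            if gpus:
                print(f"{pod.metadata.namespace}/{pod.metadata.name} "
                      f"[{container.name}] requests {gpus} GPU(s)")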

So why do we need real-time observability? Sometimes data teams do not realize that GPU memory, CPU limits, and requests are not treated the same way until it’s too late.
The Prius effect refers to the change in driving behavior observed in some drivers of the Toyota Prius hybrid car, who altered their driving style to reduce fuel consumption after receiving real-time feedback on their gasoline consumption.

Observability by design in ML workloads, which includes collecting and monitoring key metrics, logging, tracing, setting up alerts, and running automated experiments, allows teams to gain insights into the performance, behavior, and impact of their ML models; make data-driven decisions to improve their performance and reliability; and align with FinOps principles such as cost optimization, forecasting, budgeting, cost allocation, and decision making based on cost-benefit analysis, all of which can help organizations optimize and manage the costs associated with their ML workloads.

By providing real-time visibility into the performance and resource usage of AI and ML workloads, organizations can proactively identify and address [inaudible 00:18:05] and make more informed decisions about how to optimize the cost of running these workloads in the cloud, understand how GPU resources are being consumed by different workloads and make informed decisions about scaling and allocating resources to optimize costs, and even find and troubleshoot GPU scheduling issues, such as GPU starvation or GPU oversubscription, that can cause workloads to consume more resources than necessary, and correct them.

          Clinton Ford:

          Fantastic. Thank you so much, Thiago. It’s been great having you here today, and we look forward to answering all of your questions. Feel free to enter them in the chat below, or head over to the Unravel booth. We’d be happy to visit with you. Thanks.

          The post Maximize Business Results with FinOps appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/maximize-business-results-with-finops/feed/ 0
          Why is Cost Governance So Hard for Data Teams? https://www.unraveldata.com/resources/why-is-cost-governance-so-hard-for-data-teams/ https://www.unraveldata.com/resources/why-is-cost-governance-so-hard-for-data-teams/#respond Fri, 03 Feb 2023 17:46:18 +0000 https://www.unraveldata.com/?p=11398

          Chris Santiago, VP Solutions Engineering Unravel Data, shares his perspective on Why Cost governance is so hard for data teams in this 3 min video. Transcript below. Transcript Cost governance – a burning issue for companies […]

          The post Why is Cost Governance So Hard for Data Teams? appeared first on Unravel.

          ]]>

          Chris Santiago, VP Solutions Engineering Unravel Data, shares his perspective on Why Cost governance is so hard for data teams in this 3 min video. Transcript below.

          Transcript

Cost governance – a burning issue for companies that are running mission-critical workloads in production, using large amounts of data, and are at the point where they’re starting to see rising costs but don’t have a handle on them or know what to do next to curtail those costs.

It all starts with the cloud vendors themselves, because that’s really the only way we’re going to get true costs. Cloud companies have loads of customers, so it takes them time to batch up all the information about what resources were consumed by each customer. Then, when it’s time for billing, they send out a report that’s highly aggregated in nature, really just showing how much you were spending, and it comes in as a batch process. What you really need is fine granularity of where costs are going, but because we’re only getting this highly aggregated data from the cloud vendors, that becomes challenging.

Now, what do customers do? What they’ll end up doing is trying to do this themselves by capturing whatever metrics they can. I see a lot of customers do this through manual spreadsheets, trying to track things by job in a very manual fashion. In fact, one customer I was working with had two full-time, very talented developers doing this just so they could get a semblance of where cost is going at the granularity they wanted. The issue is that this spreadsheet was prone to errors. Things would always change on the cloud vendor side, which would make their reports inaccurate. And at the end of the day, they were not getting buy-in from the business.

And so that’s why you can’t use DevOps tools to solve this problem. You need a purpose-built platform for DataOps teams, and that’s where Unravel comes in. We take in all this information and give you the ability to understand costs at whatever granularity you want. Best of all, we capture it as soon as the information comes in. So now you can do things such as accurately forecasting how resources are being spent, setting budgets so we can really understand who’s spending what, providing chargeback visibility so we can see who the top spenders are and actually have that conversation, and, most importantly of all, optimizing at the job level so we can truly reduce costs wherever we want.

Full-stack visibility is just that easy when you utilize Unravel. Get a full demo or try it yourself.

          The post Why is Cost Governance So Hard for Data Teams? appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/why-is-cost-governance-so-hard-for-data-teams/feed/ 0
          Taming Cloud Costs for Data Analytics with FinOps https://www.unraveldata.com/resources/taming-cloud-costs-for-data-analytics-with-finops/ https://www.unraveldata.com/resources/taming-cloud-costs-for-data-analytics-with-finops/#respond Fri, 03 Feb 2023 15:15:11 +0000 https://www.unraveldata.com/?p=11409 Data Pipelines

          Uncontrolled cloud costs pose an enormous risk for any organization. The longer these costs go ungoverned, the greater your risk. Volatile, unforeseen expenses eat into profits. Budgets become unstable. Waste and inefficiency go unchecked. Making strategic […]

          The post Taming Cloud Costs for Data Analytics with FinOps appeared first on Unravel.

          ]]>
          Data Pipelines

          Uncontrolled cloud costs pose an enormous risk for any organization. The longer these costs go ungoverned, the greater your risk. Volatile, unforeseen expenses eat into profits. Budgets become unstable. Waste and inefficiency go unchecked. Making strategic decisions becomes difficult, if not impossible. Uncertainty reigns.

          Everybody’s cloud bill continues to get steeper month over month, and the most rapidly escalating (and often the single largest) slice of the pie is cloud costs for data analytics—usually at least 40%. With virtually every organization asking for ever more data analytics, and more data workloads moving to the cloud, uncontrolled cloud data costs increasingly become a bigger part of the overall problem.

          IDC fastest growing cloud cost category

          Data workloads are the fastest-growing cloud cost category and, if they’re not already, will soon be the #1 cloud expense.

           

          All too often, business leaders don’t even have a clear understanding of where the money is going—or why—for cloud data expenditures, much less a game plan for bringing these costs under control.

          FinOps challenges by persona

          Ungoverned cloud data usage and costs result in multiple, usually coexisting, business vulnerabilities.

           

          Consider these common scenarios:

          • You’re a C-suite executive who is ultimately responsible for cloud costs, but can’t get a good explanation from your data team leaders about why your Azure Databricks or AWS or Google Cloud costs in particular are skyrocketing.
          • You’re a data team leader who’s on the hook to explain to the C-suite these soaring cloud data costs, but can’t. You don’t have a good handle on exactly who’s spending how much on what. This is a universal problem, no matter what platform or cloud provider: 70% of organizations aren’t sure what they spend their cloud budget on.
          • You’re in Finance and are getting ambushed by your AWS (or Databricks or GCP) bill every month. Across companies of every size and sector, usage and costs are wildly variable and unpredictable—one survey has found that cloud costs are higher than expected for 6 of every 10 organizations.
          • You’re a business product owner who needs additional budget to meet the organization’s increasing demand for more data analytics but don’t really know how much more money you’ll need. Forecasting—whether for Snowflake, Databricks, Amazon EMR, BigQuery, or any other cloud service—becomes a guessing game: Forrester has found that 80% of companies have difficulty predicting data-related cloud costs, and it’s becoming ever more problematic to keep asking for more budget every 3-6 months.
          • You’re an Engineering/Operations team lead who knows there’s waste and inefficiency in your cloud usage but don’t really know how much or exactly where, much less what to do about it. Your teams are groping in the dark, playing blind man’s bluff. And it’s getting worse: 75% of organizations report that cloud waste is increasing.
          • Enterprise architecture teams are seeing their cloud migration and modernization initiatives stall out. It’s common for a project’s three-year budget to be blown by Q1 of Year 2. And increasingly companies are pulling the plug. A recent report states that 81% of IT teams have been directed by the C-suite to cut or halt cloud spending. You find yourself paralyzed, unable to move either forward or back—like being stranded in a canoe halfway through a river crossing.
          • Data and Finance VPs don’t know the ROI of their modern data stack investments—or even how to figure that out. But whether you measure it or not, the ROI of your modern data stack investments nosedives as costs soar. You find it harder and harder to justify your decision to move to Databricks or Snowflake or Amazon EMR or BigQuery. PricewaterhouseCoopers has found that over half (53%) of enterprises have yet to realize substantial value from their cloud investments.
          • With seemingly no end in sight to escalating cloud costs, data executives may even be considering the radical step of sacrificing agility and speed-to-market gains and going back to on-prem (repatriation). The inability to control costs is a leading reason why 71% of enterprises expect to move all or some of their workloads to on-prem environments.

          How’d we get into this mess?

          The problem of ungoverned cloud data costs is universal and has been with us for a while: 83% of enterprises cite managing cloud spend as their top cloud challenge, and optimizing cloud usage is the top cloud initiative for the sixth straight year, according to the 2022 State of the Cloud Report.

          Some of that is simply due to the increased volume and velocity of data and analytics. In just a few short years, data analytics has gone from a science project to an integral business-critical function. More data workloads are running in the cloud, often more than anticipated. Gartner has noted that overall cloud usage for data workloads “almost always exceeds initial expectations,” stating that workloads may grow 4X or more in the first year alone.

          If you’re like most data-driven enterprises, you’ve likely invested millions in the innovative data platforms and cloud offerings that make up your modern data stack—Databricks, Snowflake, Amazon EMR, BigQuery, Dataproc, etc.—and have realized greater agility and go-to-market speed. But those benefits have come at the expense of understanding and controlling the associated costs. You’re also seeing your cloud costs lurch unpredictably upwards month over month. And that’s the case no matter which cloud provider(s) you’re on, or which platform(s) you’re running. Your modern data stack consumes ever more budget, and it never seems to be enough. You’re under constant threat of your cloud data costs jumping 30-40% in just six months. It’s become a bit like the Wild West, where everybody is spinning up clusters and incurring costs left and right, but nobody is in control to govern what’s going on.

          cloud spending 2017-23

The only thing predictable about modern data stack costs is that they seem to always go up. Source: Statista

           

Many organizations that have been wrestling with uncontrolled cloud data costs have begun adopting a FinOps approach. Yet they are struggling to put these commonsense FinOps principles into practice for data teams. They find themselves hamstrung by generic FinOps tools and hit a brick wall when it comes to actualizing foundational capabilities.

          So why are organizations having trouble implementing DataFinOps (FinOps for data teams)?

          FinOps for data teams

Just as DevOps and DataOps are “cousin” approaches—bringing agile and lean methodologies to software development and data management, respectively, tackling similar types of challenges, but needing very distinct types of information and analyses to get there—FinOps and DataFinOps are related but different. In much the same way (and for similar reasons) that DevOps observability built for web applications doesn’t work for data pipelines and applications, DataFinOps adapts FinOps best practices to data management, helping data teams measure and improve the cost effectiveness of their data pipelines and data applications.

          DataFinOps

          DataOps + FinOps = DataFinOps

           

          FinOps principles and approach

          As defined by the FinOps Foundation, FinOps is “an evolving cloud financial management discipline and cultural practice that enables organizations to get maximum business value by helping engineering, finance, technology and business teams to collaborate on data-driven spending decisions.”

          It’s important to bear in mind that FinOps isn’t just about lowering your cloud bill—although you will wind up saving money in a lot of areas, running things more cost-effectively, and being able to do more with less (or at least the same). It’s about empowering engineers and business teams to make better choices about their cloud usage and deriving the most value from their cloud investments.

          There are a few underlying “north star” principles that guide all FinOps activities:

          • Cloud cost governance is a team sport. Too often controlling costs devolves into Us vs. Them friction, frustration, and finger-pointing. It can’t be done by just one group alone, working with their own set of data (and their own tools) from their own perspective. Finance, Engineering, Operations, technology teams, and business stakeholders all need to be on the same page, pulling in the same direction to the same destination, working together collaboratively.
          • Spending decisions are driven by business value. Not all cloud usage is created equal. Not everything merits the same priority or same level of investment/expenditure. Value to the business is the guiding criterion for making collaborative and intelligent decisions about trade-offs between performance, quality, and cost.
• Everyone takes ownership of their cloud usage. Holding individuals accountable for their own cloud usage—and costs—essentially shifts budget responsibility left, onto the folks who actually incur the expenses. This is crucial to controlling cloud costs at scale, but to do so, you absolutely must empower engineers and Operations with the self-service optimization capabilities to “do the right thing” themselves quickly and easily.
• Reports are accessible and timely. To make data-driven decisions, you need accurate, real-time data. The various players collaborating on these decisions all bring their own wants and needs to the table, and everybody needs to be working from the same information, seeing the issue(s) the same way—a single pane of glass for a single source of truth. Dashboards and reports must be visible, understandable, and practical for a wide range of people making these day-to-day cost-optimization spending decisions.

          Applying FinOps to data management

          Data teams can put these FinOps principles into practice with a three-phase iterative DataFinOps life cycle:

          • Observability, or visibility, is understanding what is going on in your environment, measuring everything, tracking costs, and identifying exactly where the money is going.
          • Optimization is identifying where you can eliminate waste and inefficiencies, take advantage of less-expensive cloud pricing options, or otherwise run your cloud operations more cost-efficiently—and then empowering individuals to actually make improvements.
          • Governance is all about going from reactive problem-solving to proactive problem-preventing, sustaining iterative improvements, and implementing guardrails and alerts.

          The DataFinOps life cycle: observability, optimization, and governance.

           

          The essential principles and three-phase approach of FinOps are highly relevant to data teams. But just as DataOps requires a different set of information than DevOps, data teams need unique and data-specific details to achieve observability, optimization, and governance in their everyday DataFinOps practice.

          What makes DataFinOps different—and difficult

          The following challenges all manifest themselves in one way or another at each stage of the observability/optimization/governance DataFinOps life cycle. Be aware that while you can’t do anything until you have full visibility/observability into your cloud data costs, the optimization and governance phases are usually the most difficult and most valuable to put into action.

          First off, just capturing and correlating the volume of information across today’s modern data stacks can be overwhelming. The sheer size and complexity of data applications/pipelines—all with multiple sub-jobs and sub-tasks processing in parallel—generates millions of metrics, metadata, events, logs, etc. Then everything has to be stitched together somehow in a way that makes sense to Finance and Data Engineering and DataOps and the business side.

          Even a medium-sized data analytics operation has 100,000s of individual “data-driven spending decisions” to make. That’s the bad news. The good news is that this same complexity means there are myriad opportunities to optimize costs.

          Second, the kinds of details (and the level of granularity) your teams need to make informed, intelligent DataFinOps decisions simply are not captured or visualized by cloud cost-management FinOps tools or platform-specific tools like AWS Cost Explorer, OverWatch, or dashboards and reports from GCP and Microsoft Cost Management. It’s the highly granular details about what’s actually going on in your data estate (performance, cost) that will uncover opportunities for cloud cost optimizations. But those granular details are scattered across dozens of different tools, technologies, and platforms.

          Third, there’s usually no single source of truth. Finance, data engineers, DataOps, and the business product owners all use their own particular tools of choice to manage different aspects of cloud resources and spending. Without a common mechanism (the proverbial single pane of glass) to see and measure cost efficiency, it’s nearly impossible for any of them to make the right call.

          Finally, you need to be able to see the woods for the trees. Seemingly simple, innocuous changes to a single cloud data application/pipeline can have a huge blast radius. They’re all highly connected and interdependent: the output of something upstream is the input to something else downstream or, quite likely, another application/pipeline for somebody somewhere else in the company. Everything must be understood within a holistic business context so that everybody involved in DataFinOps understands how everything is working as a whole.

          Implementing DataFinOps in practice: A 6-step game plan

          At its core, DataFinOps elevates cost as a first-class KPI metric, right alongside performance and quality. Most data team SLAs revolve around performance and quality: delivering reliable results on time, every time. But with cloud spend spiraling out of control, now cost must be added to reliability and speed as a co-equal consideration.

          1. Set yourself up for success

          Before launching DataFinOps into practice, you need to lay some groundwork around organizational alignment and expectations. The collaborative approach and the principle of holding individuals accountable for their cloud usage are a cultural sea-change. DataFinOps won’t work as a top-down mandate without buy-in from team members further down the ladder. Recognize that improvements will grow proportionally over time as your DataFinOps practice gains momentum. It’s best to adopt a crawl-walk-run approach, where you roll out a pilot project to discover best practices (and pitfalls) and get a quick win that demonstrates the benefits. Find a data analyst or engineer with an interest in the financial side who can be the flag-bearer, maybe take some FinOps training (the FinOps Foundation is an excellent place to start), and then dig into one of your more expensive workloads to see how to apply DataFinOps principles in practice. Similarly, get at least one person from the finance and business teams who is willing to get onboard and discover how DataFinOps works from their side.

Check out this 15-minute discussion on how JPMorgan Chase & Co. tackled FinOps at scale.

          2. Get crystal-clear visibility into where the money is going

          Before you can even begin to control your cloud costs, you have to understand—with clarity and precision—where the money is going. Most organizations have only hazy visibility into their overall cloud spend. The 2022 State of Cloud Cost Report states that gaining visibility into cloud usage is the single biggest challenge to controlling costs. The 2022 State of Cloud Cost Intelligence Report further finds that only 3 out of 10 organizations know exactly where their spend is going, with the majority either guesstimating or having no idea. And the larger the company, the more significant the cost visibility problem.

A big part of the problem is the cloud bills themselves. They’re either opaque or inscrutable; either way, they lack the business context that makes sense to your particular organization. Cloud vendor billing consoles (and third-party cost-management tools) give you only an aggregated view of total spend for different services (compute, storage, platform, etc.).

          AWS cloud bill

Cloud bills don’t answer questions like: Who spent on these services? Which departments, teams, and users are the top spenders?

           

          Or you have to dedicate a highly trained expert or two to decode hundreds of thousands (millions, even) of individual billing line items and figure out who submitted which jobs (by department, team, individual user) for what purpose—who’s spending what, when, and why.

          AWS bill

          Decoding 1,000,000s of individual lines of billing data is a full-time job.

           

          The lack of visibility is compounded by today’s heterogeneous multi-cloud, multi-platform reality. Most enterprise-scale organizations use a combination of different cloud providers and different data platforms, often the same platform on different cloud providers. While this is by no means a bad thing from an agility perspective, it does make comprehending overall costs across different providers even more difficult.

          So, the first practical step to implementing DataFinOps is understanding your costs with clarity and precision. You need to be able to understand at a glance, in terms that make sense to you, which projects, applications, departments, teams, and individual users are spending exactly how much, and why.

          You get precise visualization of cloud data costs through a combination of tagging and full-stack observability. Organizations need to apply some sort of cost-allocation tagging taxonomy to every piece of their data estate in order to categorize and track cloud usage. Then you need to capture a wide and deep range of granular performance details, down to the individual job level, along with cost information.

          All this information is already available in different systems, hidden in plain sight. What you need is a way to pull it all together, correlating everything in a data model and mapping resources to the tagged applications, teams, individuals, etc.
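
As a simple illustration of that correlation step, the sketch below joins tagged resources to billing line items and rolls costs up by team and application. The file and column names are assumptions for the example, not an actual cloud billing schema.

```python
# Sketch only: join tagged resources onto billing line items and roll costs
# up by team and application. File and column names are illustrative.
import pandas as pd

billing = pd.read_csv("billing_line_items.csv")  # resource_id, usage_date, cost_usd
tags = pd.read_csv("resource_tags.csv")          # resource_id, team, app, user

allocated = billing.merge(tags, on="resource_id", how="left")
allocated[["team", "app", "user"]] = allocated[["team", "app", "user"]].fillna("untagged")

chargeback = (
    allocated.groupby(["team", "app"])["cost_usd"]
    .sum()
    .sort_values(ascending=False)
    .reset_index()
)
print(chargeback.head(10))  # the "big spender" view, down to the application
```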

          Once you have captured and correlated all performance/cost details and have everything tagged, you can slice and dice the data to do a number of things with a high degree of accuracy:

          • See who the “big spenders” are—which departments, teams, applications, users are consuming the most resources—so you can prioritize where to focus first on cost optimization, for the biggest impact.
          • Track actual spend against budgets—again, by project or group or individual. Know ahead of time whether usage is projected to be on budget, is at risk, or has already gone over. Avoid “sticker shock” surprises when the monthly bill arrives.
          • Forecast with accuracy and confidence. You can run regular reports that analyze historical usage and trends (e.g., peaks and valleys, auto-scaling) to predict future capacity needs. Base projections on data, not guesswork.
          • Allocate costs with pinpoint precision. Generate cost-allocation reports that tell you exactly—down to the penny—who’s spending how much, where, and on what.

          3. Understand why the costs are what they are

Beyond knowing what’s being spent and by whom, there’s the question of how money is being spent—specifically, what resources were allocated for the various workload jobs and how much they cost. Your data engineers and DataOps teams need to be able to drill down into the application- and cluster-level configuration details to identify which individuals are using cloud data resources, the size and number of resources they’re using, which jobs are being run, and what their performance looks like. For example, some data jobs are constrained by network bandwidth, while others are memory- or CPU-bound. Running those jobs on ill-suited clusters and instance types may not maximize the return on your cloud data investments. If you think about cost optimization as a before-and-after exercise, your teams need a clear and comprehensive picture of the “before.”

          With so many different cloud pricing models, instance types, platforms, and technologies available to choose from, you need everybody to have a fully transparent 360° view into the way expenses are being incurred at any given time so that the next step—cost optimization and collaborative decision-making about cloud spend—can happen intelligently.

          4. Identify opportunities to do things more cost-effectively

          If the observability phase gives you visibility into where costs are going, how, and why, the optimization phase is about taking that information and figuring out where you can eliminate waste, remove inefficiencies, and leverage different options that are less expensive but still meet business needs.

Most companies have very limited insight (if any) into where they’re spending more than they need to. Everyone knows on an abstract level that cloud waste is a big problem in their organization, but identifying concrete examples that can be remediated is another story. Various surveys and reports peg the amount of cloud budget going to waste at 30-50%, but the true percentage may well be higher—especially for data teams—because most companies are just guessing and tend to underestimate their waste. For example, after implementing cloud data cost optimization, most Unravel customers have seen their cloud spend lowered by 50-60%.

          An enterprise running 10,000s (or 100,000s) of data jobs every month has literally millions of decisions to make—at the application, pipeline, and cluster level—about where, when, and how to run those jobs. And each individual decision about resources carries a price tag.

          complex data pipeline

          Enterprises have 100,000s of places where cloud data spending decisions need to be made.

           

          Only about 10% of cost-optimization opportunities are easy to see, for example, shutting down idle clusters (deployed but no-longer-in-use resources). The remaining 90% lie below the surface.

          idle cluster savings

          90% of cloud data cost-optimization opportunities are buried deep, out of immediate sight.

           

          You need to go beyond knowing how much you are currently spending, to knowing how much you should be spending in a cost-optimized environment. You need insight into what resources are actually needed to run the different applications/pipelines vs. what resources are being used.

          Data engineers are not wasting money intentionally; they simply don’t have the insights to run their jobs most cost-efficiently. Identifying waste and inefficiencies needs two things in already short supply: time and expertise. Usually you need to pull engineering or DataOps superstars away from what they’re doing—often the most complex or business-critical projects—to do the detective work of analyzing usage vs. actual need. This kind of cost-optimization analysis does get done today, here and there, on an ad hoc basis for a handful of applications, but doing so at enterprise scale is something that remains out of reach for most companies.

          The complexity is overwhelming: for example, AWS alone has more than 200 cloud services and over 600 instance types available, and has changed its prices 107 times since its launch in 2006. For even your top experts, this can be laborious, time-consuming work, with continual trial-and-error, hit-or-miss attempts.

          Spoiler alert: You need AI

          AI is crucial to understanding where cost optimization is needed—at least, at enterprise scale and to make an impact. It can take hours, days, sometimes weeks, for even your best people to tune a single application for cost efficiency. AI can do all this analysis automatically by “throwing some math” at all the observability performance and cost data to uncover where there’s waste and inefficiencies.

• Overprovisioned resources. This is most often why budgets go off the rails, and the primary source of waste. The size and number of instances and containers are greater than necessary, having been allocated based on perceived need rather than actual usage. Right-sizing cloud resources alone can save an organization 50-60% of its cloud bill (a toy right-sizing check is sketched after this list).
          • Instance pricing options. If you understand what resources are needed to run a particular application (how that application interacts with other applications), the DataFinOps team can make informed decisions about when to use on-demand, reserved, or spot instances. Leveraging spot instances can be up to 90% less expensive than on-demand—but you have to have the insights to know which jobs are good candidates.
          • Bad code. AI understands what the application is trying to do and can tell you that it’s been submitted in a way that’s not efficient. We’ve seen how a single bad join on a multi-petabyte table kept a job running all weekend and wound up costing the company over $500,000.
          • Intelligent auto-scaling. The cloud’s elasticity is great but comes at a cost, and isn’t always needed. AI analyzes usage trends to help predict when auto-scaling is appropriate—or when rescheduling the job may be a more cost-effective option.
          • Data tiering. You probably have petabytes of data. But you’re not using all of it all of the time. AI can tell you which datasets are (not) being used, applying cold/warm/hot labels based on age or usage, so you understand which ones haven’t been touched in months yet still sit on expensive storage. Moving cold data to less-expensive options can save 80-90% on storage costs.
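
To make the overprovisioning point above concrete, here is a deliberately simplified right-sizing heuristic. It is a sketch only, assuming job-level metrics exported to a CSV with made-up column names; real recommendations would model each application’s behavior rather than apply one blanket headroom rule.

```python
# Sketch only: a fixed-headroom right-sizing heuristic over job-level metrics.
# Column names are made up for the example.
import pandas as pd

jobs = pd.read_csv("job_metrics.csv")  # job_id, requested_memory_gb,
                                       # peak_memory_gb, cost_usd
HEADROOM = 1.2  # keep 20% above the observed peak as a safety margin

jobs["recommended_memory_gb"] = (jobs["peak_memory_gb"] * HEADROOM).round(1)
overprovisioned = jobs[jobs["requested_memory_gb"] > 2 * jobs["peak_memory_gb"]].copy()
overprovisioned["est_savings_usd"] = overprovisioned["cost_usd"] * (
    1 - overprovisioned["recommended_memory_gb"] / overprovisioned["requested_memory_gb"]
)

print(overprovisioned[["job_id", "requested_memory_gb", "peak_memory_gb",
                       "recommended_memory_gb", "est_savings_usd"]]
      .sort_values("est_savings_usd", ascending=False)
      .head(20))
```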

          5. Empower individuals with self-service optimization

          Holding individual users accountable for their own cloud usage and cost is a bedrock principle of DataFinOps. But to do that, you also have to give them the insights and actionable intelligence for them to make better choices.

You have to make it easy for them to do the right thing. They can’t just be thrown a bunch of charts and graphs and metrics and be expected to figure out what to do. They need quick, easy-to-understand, prescriptive recommendations on exactly how to run their applications reliably, quickly, and cost-efficiently.

          Ideally, this is where AI can also help. Taking the analyses of what’s really needed vs. what’s been requested, AI could generate advice on exactly where and how to change settings or configurations or code to optimize for cost.

          AI cost optimization recommendations

          Leveraging millions of data points and hundreds of cost and performance algorithms, AI offers precise, prescriptive recommendations for optimizing costs throughout the modern data stack.

           

          6. Go from reactive to proactive

          Optimizing costs reactively is of course highly beneficial—and desperately needed when costs are out of control—but even better is actively governing them. DataFinOps is not a straight line, with a beginning and end, but rather a circular life cycle. Observability propels optimization, whose outcomes become baselines for proactive governance and feed back into observability.

Governance is all about getting ahead of cost issues beforehand rather than after the fact. Data team leaders should implement automated guardrails that give a heads-up when thresholds for any business dimension (projected budget overruns, jobs that exceed a certain size, time, or cost) are crossed. Alerts could be triggered whenever a guardrail constraint is violated, notifying the individual user—or sent up the chain of command—that their job will miss its SLA or cost too much money and they need to find less expensive options, rewrite it to be more efficient, reschedule it, etc. Or guardrail violations could trigger preemptive corrective “circuit breaker” actions to kill jobs or applications altogether, request configuration changes, etc., to rein in rogue users, put the brakes on runaway jobs, and nip cost overruns in the bud.
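
As a rough illustration of how such a guardrail could work, here is a minimal sketch. The job-listing, notification, and kill hooks are hypothetical placeholders for whatever your scheduler and alerting stack provide, and the limits and cost-projection rule are illustrative only.

```python
# Sketch only: a scheduled guardrail check. list_running_jobs, notify_owner,
# and kill_job are hypothetical hooks into your scheduler and alerting stack;
# the limits and the cost projection are illustrative.
SOFT_LIMIT_USD = 500    # warn the owner
HARD_LIMIT_USD = 2_000  # "circuit breaker": stop the job

def enforce_guardrails(list_running_jobs, notify_owner, kill_job):
    for job in list_running_jobs():
        # naive projection: cost so far divided by fraction of work completed
        projected = job["cost_so_far_usd"] / max(job["fraction_complete"], 0.01)
        if projected >= HARD_LIMIT_USD:
            kill_job(job["id"])
            notify_owner(job["owner"],
                         f"Job {job['id']} stopped: projected cost ${projected:,.0f}")
        elif projected >= SOFT_LIMIT_USD:
            notify_owner(job["owner"],
                         f"Job {job['id']} projected at ${projected:,.0f}; consider "
                         "rescheduling, cheaper instances, or rewriting the job")
```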

          Controlling particular users, apps, or business units from exceeding certain behaviors has a profound impact on reining in cloud spend.

          Conclusion

          The volatility, obscurity, and lack of governance over rapidly growing cloud data costs introduce a high degree of unpredictability—and risk—into your organization’s data and analytics operations. Expenses must be brought under control, but reducing costs by halting activities can actually increase business risks in the form of lost revenue, SLA breaches, or even brand and reputation damage. Taking the radical step of going back on-prem may restore control over costs but sacrifices agility and speed-to-market, which also introduces the risk of losing competitive edge.

          A better approach, DataFinOps, is to use your cloud data spend more intelligently and effectively. Make sure that your cloud investments are providing business value, that you’re getting the highest possible ROI from your modern data stack. Eliminate inefficiency, stop the rampant waste, and make business-driven decisions—based on real-time data, not guesstimates—about the best way to run your workloads in the cloud.

          That’s the driving force behind FinOps for data teams, DataFinOps. Collaboration between engineering, DataOps, finance, and business teams. Holding individual users accountable for their cloud usage (and costs).

          But putting DataFinOps principles into practice is a big cultural and organizational shift. Without the right DataFinOps tools, you’ll find it tough sledding to understand exactly where your cloud data spend is going (by whom and why) and identify opportunities to operate more cost-efficiently. And you’ll need AI to help empower individuals to optimize for cost themselves.

          Then you can regain control over your cloud data costs, restore stability and predictability to cloud budgets, be able to make strategic investments with confidence, and drastically reduce business risk.

          The post Taming Cloud Costs for Data Analytics with FinOps appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/taming-cloud-costs-for-data-analytics-with-finops/feed/ 0
          3 Takeaways from the 2023 Data Teams Summit https://www.unraveldata.com/resources/3-takeaways-from-the-2023-data-teams-summit/ https://www.unraveldata.com/resources/3-takeaways-from-the-2023-data-teams-summit/#respond Thu, 02 Feb 2023 13:45:19 +0000 https://www.unraveldata.com/?p=11362

          The 2023 Data Teams Summit (formerly DataOps Unleashed) was a smashing success, with over 2,000 participants from 1,600 organizations attending 23 expert-led breakout sessions, panel discussions, case studies, and keynote presentations covering a wide range of […]

          The post 3 Takeaways from the 2023 Data Teams Summit appeared first on Unravel.

          ]]>

          The 2023 Data Teams Summit (formerly DataOps Unleashed) was a smashing success, with over 2,000 participants from 1,600 organizations attending 23 expert-led breakout sessions, panel discussions, case studies, and keynote presentations covering a wide range of thought leadership and best practices.

There were a lot of sessions devoted to different strategies and considerations for building a high-performing data team, how to become a data team leader, where data engineering is heading, and emerging trends in DataOps (asset-based orchestration, data contracts, data mesh, digital twins, data centers of excellence). And winding through almost every presentation as a common theme was that top-of-mind topic: FinOps and how to get control over galloping cloud data costs.

          Some of the highlight sessions are available now on demand (no form to fill out) on our Data Teams Summit 2023 page. More are coming soon. 

          There was a lot to see (and full disclosure: I didn’t get to a couple of sessions), but here are 3 sessions that I found particularly interesting.

          Enabling strong engineering practices at Maersk

          Enabling strong engineering practices at Maersk

          The fireside chat between Unravel CEO and Co-founder Kunal Agarwal and Mark Sear, Head of Data Platform Optimization at Maersk, one of the world’s largest logistics companies, is entertaining and informative. Kunal and Mark cut through the hype to simplify complex issues in commonsensical, no-nonsense language about:

          • The “people problem” that nobody’s talking about
          • How Maersk was able to upskill its data teams at scale
          • Maersk’s approach to rising cloud data costs
          • Best practices for implementing FinOps for data teams

          Check out their talk here

          Maximize business results with FinOps

          Maximize business results with FinOps

          Unravel DataOps Champion and FinOps Certified Practitioner Clinton Ford and FinOps Foundation Ambassador Thiago Gil explain how and why the emerging cloud financial management discipline of FinOps is particularly relevant—and challenging—for data teams. They cover:

          • The hidden costs of cloud adoption
          • Why observability matters
          • How FinOps empowers data teams
          • How to maximize business results 
          • The state of production ML

          See their session here

          Situational awareness in a technology ecosystem

          Situational awareness in a technology ecosystem

          Charles Boicey, Chief Innovation Officer and Co-founder of Clearsense, a healthcare data platform company, explores the various components of a healthcare-centric data ecosystem and how situational awareness in the clinical environment has been transferred to the technical realm. He discusses:

          • What clinical situational awareness looks like
          • The concept of human- and technology-assisted observability
          • The challenges of getting “focused observability” in a complex hybrid, multi-cloud, multi-platform modern data architecture for healthcare
          • How Clearsense leverages observability in practice
          • Observability at the edge

          Watch his presentation here

          How to get more session recordings on demand

          1. To see other session recordings without any registration, visit the Unravel Data Teams Summit 2023 page. 
          2. To see all Data Teams Summit 2023 recordings, register for access here.

          And please share your favorite takeaways, see what resonated with your peers, and join the discussion on LinkedIn.  

          The post 3 Takeaways from the 2023 Data Teams Summit appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/3-takeaways-from-the-2023-data-teams-summit/feed/ 0
          Unravel Data Joins FinOps Foundation https://www.unraveldata.com/resources/unravel-data-joins-finops-foundation/ https://www.unraveldata.com/resources/unravel-data-joins-finops-foundation/#respond Tue, 17 Jan 2023 16:46:34 +0000 https://www.unraveldata.com/?p=11000

          Unravel Data Joins FinOps Foundation to Help Companies Accelerate Cloud Data Cost Optimization Using AI Palo Alto, CA – Jan. 18, 2023 – Unravel Data, the first data observability platform built to meet the needs of […]

          The post Unravel Data Joins FinOps Foundation appeared first on Unravel.

          ]]>

          Unravel Data Joins FinOps Foundation to Help Companies Accelerate Cloud Data Cost Optimization Using AI

          Palo Alto, CA – Jan. 18, 2023Unravel Data, the first data observability platform built to meet the needs of modern data teams, announced today that it has become a General Member of the FinOps Foundation to help organizations optimize their cloud data costs and implement cost governance for their modern data stacks using AI insights.

          “Companies today are faced with a myriad of challenges, not the least of which is controlling and governing their data cloud costs. Many are discovering how quickly cloud spend can spiral out of control, and as a result many data modernization projects are stalling out. Using the FinOps Framework, however, enables organizations to make better choices about their cloud usage and derive the biggest return on their modern data stack investments,” said Kunal Agarwal, CEO of Unravel Data. “Empowering our customers with AI insights to optimize cloud spending and accelerate their digital transformation is paramount, and so we are excited to be joining the FinOps Foundation to further this goal.”

The practice of FinOps, with its collaborative and iterative approach to observability, optimization, and governance, can help data teams increase the return on their modern data stack investments by proactively spotting opportunities for efficiency, adding alerts, and automating guardrails. Moreover, through FinOps, data teams can boost data performance, prioritize areas to improve, enable self-service, and even automate financial processes to streamline the time and effort required to manage these tasks.

          “We are excited to welcome Unravel Data to the foundation. As a leading advocate for tighter integration between data, observability, and FinOps teams, Unravel will bring new ideas and approaches to the practice of managing cloud spend,” said Kevin Emamy, Partner Program Advisor at the FinOps Foundation. “The entire FinOps Foundation community will benefit from their experiences and lessons learned in this new and emerging facet of FinOps.”

          Those interested in learning more about how a FinOps approach empowers DataOps and business teams to collaboratively achieve shared business goals are invited to attend the virtual Data Teams Summit session, Maximize Business Results with FinOps, on Jan. 25, 2023. During this session, Clinton Ford, director of product marketing for Unravel, and Thiago Gil, FinOps Foundation ambassador, will address how organizations can apply agile and lean principles using the FinOps framework to boost efficiency, productivity, and innovation throughout their data stack. Learn more or register for the summit.

          The FinOps Foundation, part of The Linux Foundation’s non-profit technology consortium, is focused on advancing the people and practice of cloud financial management through education, best practices, and standards.

          To learn more about how Unravel is leading the data observability space, visit the new library of self-guided tours.

          About Unravel Data
          Unravel Data radically transforms the way businesses understand and optimize the performance and cost of their modern data applications – and the complex data pipelines that power those applications. Providing a unified view across the entire data stack, Unravel’s market-leading data observability platform leverages AI, machine learning, and advanced analytics to provide modern data teams with the actionable recommendations they need to turn data into insights. Some of the world’s most recognized brands like Adobe, 84.51˚ (a Kroger company), and Deutsche Bank rely on Unravel Data to unlock data-driven insights and deliver new innovations to market. To learn more, visit www.unraveldata.com.

          About the FinOps Foundation
          The FinOps Foundation (F2) is a non-profit trade association made up of FinOps practitioners around the world including Atlassian, Autodesk, Gannett, HERE Technologies, Just Eat, Nationwide and Spotify. Grounded in real world stories, expertise, and inspiration for and by FinOps practitioners, the FinOps Foundation is focused on codifying and promoting cloud financial management best practices and standards to help community members and their teams become better at cloud financial management.

          Media Contact
          Blair Moreland
          ZAG Communications for Unravel Data
          unraveldata@zagcommunications.com

          The post Unravel Data Joins FinOps Foundation appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/unravel-data-joins-finops-foundation/feed/ 0
          Sneak Peek into Data Teams Summit 2023 Agenda https://www.unraveldata.com/resources/sneak-peek-into-data-teams-summit-2023-agenda/ https://www.unraveldata.com/resources/sneak-peek-into-data-teams-summit-2023-agenda/#respond Thu, 05 Jan 2023 22:47:29 +0000 https://www.unraveldata.com/?p=10895

          The Data Teams Summit 2023 is just around the corner! This year, on January 25, 2023, we’re taking the peer-to-peer empowerment of data teams one step further, transforming DatOps Unleashed into Data Teams Summit to better […]

          The post Sneak Peek into Data Teams Summit 2023 Agenda appeared first on Unravel.

          ]]>

          The Data Teams Summit 2023 is just around the corner!

This year, on January 25, 2023, we’re taking the peer-to-peer empowerment of data teams one step further, transforming DataOps Unleashed into Data Teams Summit to better reflect our focus on the people—data teams—who unlock the value of data.

Data Teams Summit is an annual, full-day virtual conference led by data rockstars at future-forward organizations, covering how they’re establishing predictability, increasing reliability, and creating economic efficiencies with their data pipelines.

          Check out full agenda and register
          Get free ticket

          Join us for sessions on:

          • DataOps best practices
          • Data team productivity and self-service
          • DataOps observability
          • FinOps for data teams
          • Data quality and governance
          • Data modernizations and infrastructure

          The peer-built agenda is packed, with over 20 panel discussions and breakout sessions. Here’s a sneak peek at some of the most highly anticipated presentations:

          Keynote Panel: Winning strategies to unleash your data team

          Data Teams Summit 2023 keynote speakers

          Great data outcomes depend on successful data teams. Every single day, data teams deal with hundreds of different problems arising from the volume, velocity, variety—and complexity—of the modern data stack.

          Learn best practices and winning strategies about what works (and what doesn’t) to help data teams tackle the top day-to-day challenges and unleash innovation.

          Breakout Session: Maximize business results with FinOps

          Data Teams Summit 2023 FinOps speakers

          As organizations run more data applications and pipelines in the cloud, they look for ways to avoid the hidden costs of cloud adoption and migration. Teams seek to maximize business results through cost visibility, forecast accuracy, and financial predictability.

          In this session, learn why observability matters and how a FinOps approach empowers DataOps and business teams to collaboratively achieve shared business goals. This approach uses the FinOps Framework, taking advantage of the cloud’s variable cost model, and distributing ownership and decision-making through shared visibility to get the biggest return on their modern data stack investments.

          See how organizations apply agile and lean principles using the FinOps framework to boost efficiency, productivity, and innovation.

          Breakout Session: Going from DevOps to DataOps

          Data Teams Summit 2023 Ali Khalid

          DevOps has had a massive impact on the web services world. Learn how to leverage those lessons and take them further to improve the quality and speed of delivery for analytics solutions.

          Ali’s talk will serve as a blueprint for the fundamentals of implementing DataOps, laying out some principles to follow from the DevOps world and, importantly, adding subject areas required to get to DataOps—which participants can take back and apply to their teams.

          Breakout Session: Becoming a data engineering team leader

          Data Teams Summit 2023 Matthew Weingarten

          As you progress up the career ladder for data engineering, responsibilities shift as you start to become more hands-off and look at the overall picture rather than a project in particular.

          How do you ensure your team’s success? It starts with focusing on the team members themselves.

          In this talk, Matt Weingarten, a lead Data Engineer at Disney Streaming, will walk through some of his suggestions and best practices for how to be a leader in the data engineering world.

          Attendance is free! Sign up here for a free ticket

          The post Sneak Peek into Data Teams Summit 2023 Agenda appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/sneak-peek-into-data-teams-summit-2023-agenda/feed/ 0
          How Unravel Helps FinOps Control Data Cloud Costs https://www.unraveldata.com/resources/how-unravel-helps-finops-control-data-cloud-costs/ https://www.unraveldata.com/resources/how-unravel-helps-finops-control-data-cloud-costs/#respond Mon, 31 Oct 2022 13:41:28 +0000 https://www.unraveldata.com/?p=10573

          As most organizations that have started to run a lot of data applications and pipelines in the cloud have found out, it’s really easy for things to get really expensive, really fast. It’s not unusual to […]

          The post How Unravel Helps FinOps Control Data Cloud Costs appeared first on Unravel.

          ]]>

          As most organizations that have started to run a lot of data applications and pipelines in the cloud have found out, it’s really easy for things to get really expensive, really fast. It’s not unusual to see monthly budget overruns of 40% or more, or for companies to have burned through their three-year data cloud budget by early in year two. Consequently, we’re seeing that cloud migration projects and modernization initiatives are stalling out. Plans to scale up usage of modern platforms (think: Databricks and Snowflake, Amazon EMR, Google BigQuery and Dataproc, Azure) are hitting a wall. 

Cloud bills are usually an organization’s biggest IT expense, and the sheer size of data workloads drives most of the cloud bill. Many companies feel ambushed by their monthly cloud bills, and simply understanding where the money is going is a challenge—let alone figuring out where and how to rein in those costs and keep them under control. Capacity/budget forecasting becomes a guessing game. There’s a general lack of individual accountability for waste/abuse of cloud resources. It feels like the Wild West, where everybody is spinning up instances left and right, without enough governance or control over how cloud resources are being used.

          Enter the emerging discipline of FinOps. Sometimes called cloud cost management or cloud optimization, FinOps is an evolving cloud financial management practice that, in the words of the FinOps Foundation, “enables organizations to get maximum business value by helping engineering, finance, technology and business teams to collaborate on data-driven spending decisions.”

          The principles behind the FinOps lifecycle

          It’s important to bear in mind that a FinOps approach isn’t just about slashing costs—although you almost invariably will wind up saving money. It’s about empowering data engineers and business teams to make better choices about their cloud usage and derive the most value from their modern data stack investments.

          FinOps lifecycle

          Controlling costs consists of three iterative phases along a FinOps lifecycle:

          • Observability: Getting visibility into where the money is going, measuring what’s happening in your cloud environment, understanding what’s going on in a “workload-aware” context 
          • Optimization: Seeing patterns emerge where you can eliminate waste, removing inefficiencies, actually making things better
          • Governance: Going from reactive problem-solving to proactive problem-preventing, sustaining iterative improvements, automating guardrails, enabling self-service optimization

          Each phase builds upon the former, to create a virtuous cycle of continuous improvement and empowerment for individual team members—regardless of expertise—to make better decisions about their cloud usage while still hitting their SLAs. In essence, this shifts the budget left, pulling accountability for controlling costs forward.

          How Unravel puts FinOps lifecycle principles into practice

          Unravel helps turn this conceptual FinOps framework into practical cost governance reality. You need four things to make this FinOps lifecycle work—all of which Unravel is uniquely able to provide:

          1. You need to capture the right kind of details at a highly granular level from all the various systems in your data stack—horizontally and vertically—from the application down to infrastructure and everything in between. 
          2. All this deep information needs to be correlated into a “workload-aware” business context—cost governance is not just an infrastructure issue, and you need to get a holistic understanding of how everything works together: apps, pipelines, users, data, as well as infrastructure resources.
          3. You need to be able to automatically identify opportunities to optimize—oversized resources, inefficient code, data tiering—and then make it easy for engineers to implement those optimizations.
          4. Go from reactive to proactive—not just perpetually scan the horizon for optimization opportunities and respond to them after the fact, but leverage AI to predict capacity needs accurately, implement automated governance guardrails to keep things under control, and even launch corrective actions automatically.

          Observability: Understanding costs in context

          The first step to controlling costs is to understand what’s happening in your cloud environment—who’s spending what, where, why. The key here is to measure everything with precision and in context. Cloud vendor billing consoles (and third-party cost management tools) give you only an aggregated view of total spend on various services (compute, storage, platform), and monthly cloud bills can be pretty opaque with hundreds of individual line items that, again, don’t go any deeper than service type. There’s nothing that tells you which applications are running, who’s submitting them, which datasets are actually touched, and other contextual details.

          AWS bill
          To counter such obscurity, many organizations employ some sort of cost-allocation tagging taxonomy to categorize and track resource usage. If you already have such a classification in place, Unravel can easily adopt it without having to reinvent the wheel. If you don’t, it’s simple to implement such an approach in Unravel.

          Two things to consider when looking at tagging capabilities: 

          • How deep and atomic is the tagging? 
          • What’s the frequency?

          Unravel lets you apply tags at a highly granular level: by department, team, workload, application, even down to the individual job (or sub-part of a job) or specific user. And it happens in real time—you don’t have to wait around for the end of the monthly billing cycle. 

          These twin capabilities of capturing and correlating highly granular details and delivering real-time information are not merely nice-to-haves, but must-haves when it comes to the practical implementation of the FinOps lifecycle framework. They enable Unravel to:

          • Track actual spend vs. budget. Know ahead of time whether workload, user, cluster, etc., usage is projected to be on target, is at risk, or has already gone over budget. Preemptively prevent budget overruns rather than getting hit with sticker shock at the end of the month.

automated budget tracking

Check out the 90-second Automated Budget Tracking interactive demo

          •  Identify the big spenders. Know which projects, teams, applications, tables, queue, users, etc. are consuming the most cloud resources.

          • Understand trends and patterns. Visualize how the cost of clusters changes over time. Understand seasonality-driven peaks and troughs—are you using a ton of resources only on Saturday night? Same time every day? When is there quiet time?—to identify opportunities for improvement.

          trends and patterns

          • Implement precise chargebacks/showbacks. Automatically generate cost-allocation reports down to the penny. Pinpoint who’s spending how much, where, and why. Because Unravel captures all the deep, nitty-gritty details about what’s running across your data stack—horizontally and vertically—and correlates them all in a “workload-aware” business context, you can finally solve the problem of allocating costs of shared resources.

chargeback report
Check out the 90-second Chargeback Report interactive demo

          • Forecast capacity with confidence. Run regularly scheduled reports that analyze historical usage and trends to predict future needs.

          capacity forecast report

          The observability phase is all about empowering data teams with visibility, precise allocation, real-time budgeting information, and accurate forecasting for cost governance. It provides a 360° view of what’s going on in your cloud environment and how much it’s costing you.

          Optimization: Having AI do the heavy lifting

          If the observability phase tells you what’s happening and why, the optimization phase is about taking that information to identify where you can eliminate waste, remove inefficiencies, or leverage different instance types that are less expensive without sacrificing SLAs.

          In theory, that sounds pretty obvious and straightforward. In practice, it’s anything but. First of all, figuring out where there’s waste is no small task. In an enterprise that runs tens (or hundreds) of thousands of jobs every month, there are countless decisions—at the application, pipeline, and cluster level—to make about where, when, and how to run those jobs. And each individual decision carries a price tag. 

And the people making those decisions are not experts in allocating resources; their primary concern and responsibility is to make sure the job runs successfully and meets its SLA. Even an enterprise-scale organization could probably count the number of people with such operational expertise on one hand.

          idle cluster savings

          Most cost management solutions can only identify idle clusters. Deployed but no-longer-in-use resources are an easy target but represent only 10% of potential savings. The other 90% lie below the surface.

          Digging into the weeds to identify waste and inefficiencies is a laborious, time-consuming effort that these already overburdened experts don’t really have time for.

          AI is a must

          Yet generating and sharing timely information about how and where to optimize so that individuals can assume ownership and accountability for cloud usage/cost is a bedrock principle of FinOps. This is where AI is mandatory, and only Unravel has the AI-powered insights and prescriptive recommendations to empower data teams to take self-service optimization action quickly and easily.

          What Unravel does is take all the full-stack observability information and “throw some math at it”—machine learning and statistical algorithms—to understand application profiles and analyze what resources are actually required to run them vs. the resources that they’re currently consuming. 

          This is where budgets most often go off the rails: overprovisioned (or underutilized) clusters and jobs due to instances and containers that are based on perceived need rather than on actual usage. What makes Unravel unique in the marketplace is that its AI not only continuously scours your data environment to identify exactly where you have allocated too many or oversized resources but gives you crisp, prescriptive recommendations on precisely how to right-size the resources in question.
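To make the principle concrete, here is a deliberately simplified, generic sketch of the underlying idea: compare what a container was allocated with what it actually used at peak, and flag the gap. This is illustrative Python only, not Unravel's actual algorithm, and the headroom and threshold values are assumptions.

def rightsizing_candidates(containers, headroom=1.2, min_savings_gib=2):
    """containers: list of dicts with id, allocated_gib, and peak_used_gib."""
    recs = []
    for c in containers:
        # keep ~20% headroom above the observed peak usage
        recommended = round(c["peak_used_gib"] * headroom, 1)
        savings = c["allocated_gib"] - recommended
        if savings >= min_savings_gib:
            recs.append({
                "container": c["id"],
                "allocated_gib": c["allocated_gib"],
                "recommended_gib": recommended,
                "estimated_savings_gib": round(savings, 1),
            })
    return recs

print(rightsizing_candidates([
    {"id": "executor-7", "allocated_gib": 16, "peak_used_gib": 5.2},
    {"id": "executor-9", "allocated_gib": 8, "peak_used_gib": 7.1},
]))

The sketch only looks at peak memory; the point is simply that right-sizing recommendations have to start from actual usage data rather than perceived need.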

AI recommendations to optimize costs
Check out the 90-second AI Recommendations interactive demo

But unoptimized applications are not solely infrastructure issues. Sometimes it’s just bad code. Inefficient or problematic performance wastes money, too. We’ve seen how a single bad join on a multi-petabyte table kept a job running all weekend and wound up costing the company over $500,000. Unravel can prevent this from happening in the first place; the AI understands what your app is trying to do and can tell you that an app submitted in this fashion is not efficient—pointing you to the specific line of code causing problems.

code-level insights
Check out the 90-second Code-Level Insights interactive demo

Every cloud provider has auto-scaling options. But what should you auto-scale, to what, and when? Because Unravel has all this comprehensive, granular data in “workload-aware” context, the AI understands usage trends and access patterns to help you predict the days of the week and times of day when auto-scaling is appropriate (or find better times to run jobs). Workload heatmaps based on actual usage make it easy to visualize when, where, and how to scale resources intelligently.
          auto-scaling heatmaps

          You can save 80-90% on storage costs through data tiering, moving less frequently used data to less expensive options. Most enterprises have petabytes of data, and they’re not using all of it all of the time. Unravel shows which datasets are (not) being used, applying cold/warm/hot labels based on age or usage, so you understand which ones haven’t been touched in months yet still sit on expensive storage.
          data tiering labels
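Acting on those labels often comes down to a storage lifecycle policy. As one hedged illustration, if the cold datasets live in Amazon S3, a boto3 sketch like the following (hypothetical bucket name, prefix, and age thresholds) would tier aging objects down to cheaper storage classes:

import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-data-lake",                      # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [{
            "ID": "tier-down-cold-datasets",
            "Status": "Enabled",
            "Filter": {"Prefix": "warehouse/events/"},   # hypothetical prefix
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},  # warm
                {"Days": 90, "StorageClass": "GLACIER"},      # cold
            ],
        }]
    },
)

The thresholds should come from the actual cold/warm/hot usage labels, not from guesses about age.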

The optimization phase is where you take action. This is the hardest part of the FinOps lifecycle to put into practice and where Unravel shines. Today’s data applications and pipelines are so large and complex that there are hundreds of thousands of places where waste, inefficiencies, or better alternatives exist. Rooting them out and then getting the insights to optimize require the two things no company has enough of: time and expertise. Unravel automatically identifies where you could do better—at the user, job, or cluster level—and tells you the exact parameters to apply that would improve costs.

          Governance: Going from reactive to proactive

          Reducing costs reactively is good. Controlling them proactively is better. Finding and “fixing” waste and inefficiencies too often relies on the same kind of manual firefighting for FinOps that bogs down data engineers who need to troubleshoot failed or slow data applications. In fact, resolving cost issues usually relies on the same handful of scarce experts who have the know-how to resolve performance issues. In many respects, cost and performance are flip sides of the same coin—you’re looking at the same kind of granular details correlated in a holistic workload context, only from a slightly different angle. 

          AI and automation are crucial. Data applications are simply too big, too complex, too dynamic for even the most skilled humans to manage by hand. What makes FinOps for data so difficult is that the thousands upon thousands of cost optimization opportunities are constantly recasting themselves. Unlike software applications, which are relatively static products, data applications are fluid and ever-changing. Data cloud cost optimization is not a straight line with a beginning and end, but rather a circular loop: the AI-powered insights used to gain contextual full-stack observability and actually implement cost optimization are harnessed to give everyone on the data team at-a-glance understanding of their costs, implement automated guardrails and governance policies, and enable non-expert engineers to make expert-level optimizations via self-service. After all, the best way to spend less time firefighting is to avoid there being a fire in the first place.

• With Unravel, you can set up customizable, policy-based automated guardrails that set boundaries for any business dimension (looming budget overruns, jobs that exceed size/time/cost thresholds, etc.). Preventing particular users, apps, or business units from exceeding certain thresholds has a profound impact on reining in cloud spend (see the illustrative sketch after this list).
            automated guardrail
          • Proactive alerts can be triggered whenever a guardrail constraint is violated. Alerts can be sent to the individual user—or sent up the chain of command—to let them know their job will miss its SLA or cost too much money and they need to find less expensive options, rewrite it to be more efficient, reschedule it, etc. 
          • Governance policy violations can even trigger preemptive corrective actions. Unravel can automatically take “circuit breaker” remediation to kill jobs or applications (even clusters), request configuration changes, etc., to put the brakes on rogue users, runaway jobs, overprovisioned resources, and the like.
            AutoAction
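To make the guardrail idea concrete, here is a deliberately simplified, generic sketch of the kind of policy check involved. It is illustrative Python only, not Unravel's actual policy engine or AutoAction API, and the thresholds and job fields are assumptions.

POLICY = {"max_cost_usd": 500, "max_runtime_min": 120}

def evaluate_guardrail(job):
    """job: dict with projected_cost_usd, runtime_min, owner."""
    if job["projected_cost_usd"] > POLICY["max_cost_usd"]:
        return ("kill", f"projected cost ${job['projected_cost_usd']:.0f} "
                        f"exceeds ${POLICY['max_cost_usd']}")
    if job["runtime_min"] > POLICY["max_runtime_min"]:
        return ("alert", f"runtime {job['runtime_min']} min exceeds "
                         f"{POLICY['max_runtime_min']} min")
    return ("ok", "within policy")

action, reason = evaluate_guardrail(
    {"projected_cost_usd": 730, "runtime_min": 95, "owner": "analyst-42"})
print(action, "-", reason)   # kill - projected cost $730 exceeds $500

In practice the policies are defined per user, app, or business unit, and the resulting action is an alert up the chain or an automatic corrective step rather than a print statement.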

The governance phase of the FinOps lifecycle deals with getting ahead of cost issues before the fact rather than afterwards. Establishing control in a highly distributed environment is a challenge for any discipline, but Unravel empowers individual users with self-service capabilities that enable them to take individual accountability for their cloud usage, automatically alerts on potential budget-busting, and can even take preemptive corrective action without any human involvement. It’s not quite self-healing, but it’s the next-best thing—the true spirit of AIOps.

          The post How Unravel Helps FinOps Control Data Cloud Costs appeared first on Unravel.

          Get Ready for the Next Generation of DataOps Observability https://www.unraveldata.com/resources/get-ready-for-the-next-generation-of-dataops-observability/ https://www.unraveldata.com/resources/get-ready-for-the-next-generation-of-dataops-observability/#respond Wed, 05 Oct 2022 00:15:52 +0000 https://www.unraveldata.com/?p=10303 Data Pipelines


          This blog was originally published by Unravel CEO Kunal Agarwal on LinkedIn in September 2022.

          I was chatting with Sanjeev Mohan, Principal and Founder of SanjMo Consulting and former Research Vice President at Gartner, about how the emergence of DataOps is changing people’s idea of what “data observability” means. Not in any semantic sense or a definitional war of words, but in terms of what data teams need to stay on top of an increasingly complex modern data stack. While much ink has been spilled over how data observability is much more than just data profiling and quality monitoring, until only very recently the term has pretty much been restricted to mean observing the condition of the data itself. 

          But now DataOps teams are thinking about data observability more comprehensively as embracing other “flavors” of observability like application and pipeline performance, operational observability into how the entire platform or system is running end-to-end, and business observability aspects such as ROI and—most significantly—FinOps insights to govern and control escalating cloud costs.

          That’s what we at Unravel call DataOps observability.

          5-way crossfire facing DataOps teams

          Data teams are getting bogged down

          Data teams are struggling, overwhelmed by the increased volume, velocity, variety, and complexity of today’s data workloads. These data applications are simultaneously becoming ever more difficult to manage and ever more business-critical. And as more workloads migrate to the cloud, team leaders are finding that costs are getting out of control—often leading to migration initiatives stalling out completely because of budget overruns.

          The way data teams are doing things today isn’t working.

          Data engineers and operations teams spend way too much time firefighting “by hand.” Something like 70-75% of their time is spent tracking down and resolving problems through manual detective work and a lot of trial and error. And with 20x more people creating data applications than fixing them when something goes wrong, the backlog of trouble tickets gets longer, SLAs get missed, friction among teams creeps in, and the finger-pointing and blame game begins.

          This less-than-ideal situation is a natural consequence of inherent process bottlenecks and working in silos. There are only a handful of experts who can untangle the wires to figure out what’s going on, so invariably problems get thrown “over the wall” to them. Self-service remediation and optimization is just a pipe dream. Different team members each use their own point tools, seeing only part of the overall picture, and everybody gets a different answer to the same problem. Communication and collaboration among the team breaks down, and you’re left operating in a Tower of Babel.

          Check out our white paper DataOps Observability: The Missing Link for Data Teams
          Download here

          Accelerating next-gen DataOps observability

          These problems aren’t new. DataOps teams are facing some of the same general challenges as their DevOps counterparts did a decade ago. Just as DevOps united the practice of software development and operations and transformed the application lifecycle, today’s data teams need the same observability but tailored to their unique needs. And while application performance management (APM) vendors have done a good job of collecting, extracting, and correlating details into a single pane of glass for web applications, they’re designed for web applications and give data teams only a fraction of what they need.

          DevOps tools don't work for DataOps

          System point tools and cloud provider tools all provide some of the information data teams need, but not all. Most of this information is hidden in plain sight—it just hasn’t been extracted, correlated, and analyzed by a single system designed specifically for data teams.

          That’s where Unravel comes in.

          Data teams need what Unravel delivers—observability designed to show data application/pipeline performance, cost, and quality coupled with precise, prescriptive fixes that will allow you to quickly and efficiently solve the problem and get on to the real business of analyzing data. Our AI-powered solution helps enterprises realize greater return on their investment in the modern data stack by delivering faster troubleshooting, better performance to meet service level agreements, self-service features that allow applications to get out of development and into production faster and more reliably, and reduced cloud costs.

          I’m excited, therefore, to share that earlier this week, we closed a $50 million Series D round of funding that will allow us to take DataOps observability to the next level and extend the Unravel platform to help connect the dots from every system in the modern data stack—within and across some of the most popular data ecosystems. 

          Unlocking the door to success

By empowering data teams to spend more time on innovation and less time firefighting, Unravel helps them take a page out of their software counterparts’ playbook and tackle their problems with a solution that goes beyond observability to not just show you what’s going on and why, but actually tell you exactly what to do about it. It’s time for true DataOps observability.

          To learn more about how Unravel Data is helping data teams tackle some of today’s most complex modern data stack challenges, visit: www.unraveldata.com. 

          The post Get Ready for the Next Generation of DataOps Observability appeared first on Unravel.

          Reflections on “The Great Data Debate 2022” from Big Data London https://www.unraveldata.com/resources/reflections-on-the-great-data-debate-2022-from-big-data-london/ https://www.unraveldata.com/resources/reflections-on-the-great-data-debate-2022-from-big-data-london/#respond Tue, 04 Oct 2022 13:47:57 +0000 https://www.unraveldata.com/?p=10429 Abstract Blue Light Background


          This year’s Big Data LDN (London) was huge. Over 150 exhibitors, 300 expert speakers across 12 technical and business-led conference theaters. It was like being the proverbial kid in a candy store, and I had to make some tough decisions on which presentations to attend (and which I’d miss out on).

One that looked particularly promising was “The Great Data Debate 2022,” a panel discussion hosted by industry analyst and conference chair Mike Ferguson, with panelists Benoit Dageville, Co-founder and President of Products at Snowflake; Shinji Kim, Select Star Founder and CEO; Chris Gladwin, Ocient CEO; and Tomer Shiran, Co-founder and Chief Product Officer of Dremio. You can watch the one-hour Great Data Debate 2022 recording below.

          Great Data Debate 2022

The panel covered a lot of ground: the rise of component-based development, how the software development approach has gate-crashed the data and analytics world, the challenges around integrating all the new tools, best-of-breed vs. single platform, the future of data mesh, metadata standards, data security and governance, and much more.

          Sometimes the panelists agreed with each other, sometimes not, but the discussion was always lively. The parts I found most interesting revolved around migrating to the cloud and controlling costs once there.

          Moderator Mike Ferguson opened up the debate by asking the panelists how the current economic climate has changed companies’ approach—whether they’re accelerating their move to the cloud, focusing more on cost reduction or customer retention, etc.

          All the panelists agreed that more companies are increasingly migrating workloads to the cloud. Said Benoit Dageville: “We’re seeing an acceleration to moving to the cloud, both because of cost—you can really lower your cost—and because you can do much more in the cloud.” 

          Chris Gladwin added that the biggest challenge among hyperscale companies is that “they want to grow faster and be more efficient.” Shinji Kim echoed this sentiment, though from a different viewpoint, saying that many organisations are looking at how they want to structure the team—focusing more effort on automation or tooling to make everyone more productive in their own role. Tomer Shiran made the point that “a lot of customers now are leveraging data to either save money or increase their revenue. And there’s more focus on people asking if the path of spending they’re on with current data infrastructure is sustainable for the future.”

          We at Unravel are also seeing an increased focus on making data teams more productive and on leveraging automation to break down silos, promote more collaboration, reduce toilsome troubleshooting, and accelerate the DataOps lifecycle. But piggy-backing on Tomer’s point: While the numbers certainly bear out that more workloads are indeed moving to the cloud, we are seeing that among more mature data-driven organisations—those that already have 2-3 years of experience running data workloads in the cloud under their belt—migration initiatives are “hitting a wall” and stalling out. Cloud costs are spiraling out of control, and companies find themselves burning through their budgets with little visibility into where the spend is going or ability to govern expenses.

          As Mike put it: “As an analyst, I get to talk to CFOs and a lot of them have no idea what the invoice is going to be like at the end of the month. So the question really is, how does a CFO get control over this whole data and analytics ecosystem?”

          Chris was first to answer. “In the hyperscale segment, there are a lot of things that are different. Every customer is the size of a cloud, every application is the size of a cloud. Our customers have not been buying on a per usage basis—if you’re hammering away all day on a cluster of clusters, you want a price based on the core. They want to know in advance what it’s going to cost so they can plan for it. They don’t want to be disincented from using the platform more and more because it’ll cost more and more.”

          Benoit offered a different take: “Every organisation wants to become really data-driven, and it pushes a lot of computation to that data. I believe the cloud and its elasticity is the most cost-effective way to do that. And you can do much more at lower costs. We have to help the CFO and the organisation at large understand where the money is spent to really control [costs], to define budget, have a way to charge back to the different business units, and be very transparent to where the cost is going. So you have to have what we call cost governance. And we tell all our customers when they start [using Snowflake] that they have to put in place guardrails. It’s not a free lunch.”

          Added Shinji: “It’s more important than ever to track usage and monitor how things are actually going, not just as a one-time cost reduction initiative but something that actually runs continuously.”

Benoit summed it up by saying, “Providing the data, the monitoring, the governance of costs is a very big focus for all of us [on the panel], at different levels.”

          It’s interesting to hear leaders from modern data stack vendors as diverse as Snowflake, Select Star, and Dremio emphasise the need for automated cost governance guardrails. Because nobody does cloud cost governance for data applications and pipelines better than Unravel.

          Check out the full The Great Data Debate 2022 panel discussion.

          The post Reflections on “The Great Data Debate 2022” from Big Data London appeared first on Unravel.

          Unravel Goes on the Road at These Upcoming Events https://www.unraveldata.com/resources/unravel-upcoming-events/ https://www.unraveldata.com/resources/unravel-upcoming-events/#respond Thu, 01 Sep 2022 19:41:27 +0000 https://www.unraveldata.com/?p=10066 Big Data Survey 2019


          Join us at an event near you or attend virtually to discover our DataOps observability platform, discuss your challenges with one of our DataOps experts, go under the hood to check out platform capabilities, and see what your peers have been able to accomplish with Unravel.

          September 21-22: Big Data LDN (London) 

          Big Data LDN is the UK’s leading free-to-attend data & analytics conference and exhibition, hosting leading data and analytics experts, ready to arm you with the tools to deliver your most effective data-driven strategy. Stop by the Unravel booth (stand #724) to see how Unravel is observability designed specifically for the unique needs of today’s data teams. 

          Register for Big Data LDN here

          And be sure to stop by the Unravel booth at 5PM on Day 1 for the Data Drinks Happy Hour for drinks and snacks (while supplies last!)

          October 5-6: AI & Big Data Expo – North America (Santa Clara) 

          AI & Big Data Expo North America

          The world’s leading AI & Big Data event returns to Santa Clara as a hybrid in-person and virtual event, with more than 5,000 attendees expected to join from across the globe. The expo will showcase the most cutting-edge technologies from 250+ speakers sharing their unparalleled industry knowledge and real-life experiences, in the forms of solo presentations, expert panel discussions and in-depth fireside chats.

          Register for AI & Big Data Expo here

          And don’t miss Unravel Co-Founder and CEO Kunal Agarwal’s feature presentation on the different challenges facing different AI & Big Data team members and how multidimensional observability (performance, cost, quality) designed specifically for the modern data stack can help.

          October 10-12: Chief Data & Analytics Officers (CDAO) Fall (Boston)

          CDAO Fall 2022

          The premier in-person gathering for data & analytics leaders in North America, CDAO Fall offers focus tracks on data infrastructure, data governance, data protection & privacy; analytics, insights, and business intelligence; and data science, artificial intelligence, and machine learning. Exclusive industry summit days for data and analytics professionals in Financial Services, Insurance, Healthcare, and Retail/CPG.

          Register for CDAO Fall here

          October 14: DataOps Observability Conf India 2022 (Bengaluru)

          DataOps Observability Conf India 2022

          India’s first DataOps observability conference, this event brings together data professionals to collaborate and discuss best practices and trends in the modern data stack, analytics, AI, and observability.

          Join leading DataOps observability experts to:

          • Understand what DataOps is and why it’s important
          • Learn why DataOps observability has become a mission-critical need in the modern data stack
          • Discover how AI is transforming DataOps and observability

          Register for DataOps Observability Conf India 2022 here

          November 1-3: ODSC West (San Francisco)

          The Open Data Science Conference (ODSC) is essential for anyone who wants to connect to the data science community and contribute to the open source applications it uses every day. A hybrid in-person/virtual event, ODSC West features 250 speakers, with 300 hours of content, including keynote presentations, breakout talk sessions, hands-on tutorials and workshops, partner demos, and more. 

          Register for ODSC West here

          Sneak peek into what you’ll see from Unravel 

Want a taste of what we’ll be showing? Check out our 2-minute guided-tour interactive demos of our unique capabilities. Explore features like automated budget tracking, chargeback reports, AI recommendations, and code-level insights.

          Explore all our interactive guided tours here.

          The post Unravel Goes on the Road at These Upcoming Events appeared first on Unravel.

          Amazon EMR cost optimization and governance https://www.unraveldata.com/resources/amazon-emr-cost-optimization-and-governance/ https://www.unraveldata.com/resources/amazon-emr-cost-optimization-and-governance/#respond Thu, 04 Aug 2022 15:07:23 +0000 https://www.unraveldata.com/?p=9959 Data Pipelines


Dozens of AWS cost optimization tools exist today. Here’s the one purpose-built for Amazon EMR: begin monitoring immediately to gain control of your AWS EMR costs and continuously optimize resource performance.

          What is Amazon EMR (Elastic MapReduce)?

          Amazon EMR is a cloud big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and machine learning (ML) applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto.

Based on the workload and the application type, EMR can process huge amounts of data using EC2 instances running the Hadoop Distributed File System (HDFS) and EMRFS, which is backed by AWS S3. Depending on the workload type, these EC2 instances can be configured with any instance type and purchased as on-demand and/or spot capacity.

AWS EMR is a great platform, but as more and more workloads get added to it, understanding pricing can become a challenge. It’s hard to govern the cost and easy to lose track of where your monthly spend is going. In this article, we share tips for governing and optimizing your AWS EMR costs and resources.

          Amazon EMR costs

With multiple choices for selecting instance types and configuring the EMR cluster, understanding the pricing of the EMR service can become cumbersome and difficult. And because an EMR cluster inherently utilizes other AWS services (EC2, EBS, and others) in addition to the EMR service itself, the usage costs of all these services get factored into your bill.

          Best practices for optimizing the cost of your AWS EMR cluster

Here is a list of best practices and techniques for monitoring and optimizing the cost of your EMR cluster:

1. Always tag your resources
A tag is a label consisting of a key-value pair that lets you assign metadata to your AWS resources, giving you the ability to easily manage, identify, organize, search for, and filter them. It is therefore important to apply meaningful, purpose-built tags.
For example, create tags to categorize resources by purpose, owner, department, or other criteria, as shown below.

          EMR Cluster View
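If you manage clusters programmatically, tags can be applied with a single API call. A minimal boto3 sketch, assuming a hypothetical cluster ID and tag taxonomy, might look like this:

import boto3

emr = boto3.client("emr", region_name="us-east-1")

emr.add_tags(
    ResourceId="j-1ABCDEFGHIJKL",   # your EMR cluster ID (placeholder)
    Tags=[
        {"Key": "team", "Value": "data-engineering"},
        {"Key": "project", "Value": "clickstream-etl"},
        {"Key": "environment", "Value": "production"},
        {"Key": "cost-center", "Value": "cc-1234"},
    ],
)

Note that tag keys also need to be activated as cost allocation tags in the AWS Billing console before they show up in cost reports.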

          2. Pick the right cluster type
          AWS EMR offers two cluster types – permanent and transient.

          For transient clusters, the compute unit is decoupled from the storage. HDFS on local storage is best used for caching the intermediate results and EMRFS is the final destination for storing persistent data in AWS S3. Once the computation is done and the results are stored safely in AWS S3, the resources on transient clusters can be reclaimed.
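For example, a transient cluster can be launched with a single run_job_flow call that submits the work as steps and releases the hardware when the last step finishes. The sketch below is a minimal illustration; the bucket names, IAM roles, instance counts, and release label are placeholder assumptions.

import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="transient-clickstream-etl",
    ReleaseLabel="emr-6.9.0",
    Applications=[{"Name": "Spark"}],
    LogUri="s3://my-logs-bucket/emr/",
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
    Instances={
        "InstanceGroups": [
            {"Name": "master", "InstanceRole": "MASTER",
             "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"Name": "core", "InstanceRole": "CORE",
             "InstanceType": "m5.xlarge", "InstanceCount": 4},
        ],
        # Transient behavior: release the cluster once all steps finish.
        "KeepJobFlowAliveWhenNoSteps": False,
        "TerminationProtected": False,
    },
    Steps=[{
        "Name": "spark-etl",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "--deploy-mode", "cluster",
                     "s3://my-code-bucket/jobs/etl.py"],
        },
    }],
)
print("Started cluster:", response["JobFlowId"])

Because the cluster lives only as long as its steps, you pay for compute only while the job is actually running, while the persistent results stay in S3 via EMRFS.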

For permanent clusters, the data in HDFS is stored in EBS volumes and cannot easily be shared outside of the clusters. Small-file issues in Hadoop NameNodes will still be present, just as in on-premises Hadoop clusters.

3. Size your cluster appropriately
Undersized or oversized clusters are what you absolutely need to avoid. The EMR platform provides auto-scaling capabilities; however, it is important to right-size your clusters first, so as to avoid higher costs and workload execution inefficiencies. To anticipate these issues, you can calculate the number and type of nodes that will be needed for the workloads.
Master node: As the computational requirements for this node are low, it can be a single node.
Core nodes: These nodes perform the data processing and store data in HDFS, so it is important to right-size them. Per Amazon’s guidance, you can multiply the number of nodes by the EBS storage capacity of each node.

For example, if you define 10 core nodes to process 1 TB of data, and you have selected the m5.xlarge instance type (with 64 GiB of EBS storage), you have 10 nodes * 64 GiB, or 640 GiB of capacity. With an HDFS replication factor of three, your data is replicated three times across the nodes, so 1 TB of data requires 3 TB of capacity.

For the above scenario, you have two options:
a. With only 640 GiB available, you must increase the number of nodes or change the instance type until you have a capacity of 3 TB.
b. Alternatively, switching from the m5.xlarge to the m5.4xlarge instance type and selecting 12 instances provides enough capacity.

          12 instances * 256 GiB = 3072 GiB = 3.29 TB available

Task nodes: These nodes only run tasks and do not store data, so to calculate the number of task nodes you only need to estimate memory usage. Because this memory capacity can be distributed between the core and task nodes, you can calculate the number of task nodes by subtracting the memory the core nodes already provide.

Per Amazon’s best-practice guidelines, you need to multiply the memory needed by three.
For example, suppose that you have 28 processes of 20 GiB each; then your total memory requirement would be as follows:

          3*28 processes*20 GiB of memory = 1680 GiB of memory

          For this example, your core nodes have 64 GiB of memory (m5.4xlarge instances), and your task nodes have 32 GiB of memory (m5.2xlarge instances). Your core nodes provide 64 GiB * 12 nodes = 768 GiB of memory, which is not enough in this example. To find the shortage, subtract this memory from the total memory required: 1680 GiB – 768 GiB = 912 GiB. You can set up the task nodes to provide the remaining 912 GiB of memory. Divide the shortage memory by the instance type memory to obtain the number of task nodes needed.

          912 GiB / 32 GiB = 28.5 task nodes
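If you would rather script this arithmetic than work it out by hand, a small helper like the following reproduces the calculations above. The replication factor, the 3x memory multiplier, and the instance figures simply mirror the worked example.

import math

def core_nodes_for_storage(data_tb, ebs_gib_per_node, replication=3):
    # 1 TB of data, replicated 3 times, expressed in GiB (1 TB ~ 1024 GiB here)
    required_gib = data_tb * 1024 * replication
    return math.ceil(required_gib / ebs_gib_per_node)

def task_nodes_for_memory(processes, gib_per_process, core_node_gib,
                          core_nodes, task_node_gib, multiplier=3):
    total_gib = multiplier * processes * gib_per_process          # 1680 GiB
    shortfall = max(0, total_gib - core_node_gib * core_nodes)    # 912 GiB
    return math.ceil(shortfall / task_node_gib)

print(core_nodes_for_storage(1, 256))             # 12 core nodes (256 GiB EBS each)
print(task_nodes_for_memory(28, 20, 64, 12, 32))  # 29 task nodes (28.5 rounded up)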
          4. Based on your workload size, always pick the right instance type and size

          Network Hardware Summary

The same memory arithmetic from the previous section applies when picking instance types: work out the total memory your workload requires (for example, 3 * 28 processes * 20 GiB = 1680 GiB), compare it with what your chosen core node fleet provides, and cover any shortfall with appropriately sized task nodes.

          5. Use autoscaling as needed
          Based on your workload size, Amazon EMR can programmatically scale out applications like Apache Spark and Apache Hive to utilize additional nodes for increased performance and scale in the number of nodes in your cluster to save costs when utilization is low.

For example, with EMR managed scaling you can set a minimum and maximum capacity limit, an on-demand limit, and a maximum core node count, and the cluster will dynamically scale up and down based on your running workload.

          Cluster Scaling Policy
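If you attach scaling policies programmatically, a minimal boto3 sketch for EMR managed scaling might look like the following; the cluster ID and capacity limits are hypothetical and should be tuned to your workload.

import boto3

emr = boto3.client("emr", region_name="us-east-1")

emr.put_managed_scaling_policy(
    ClusterId="j-1ABCDEFGHIJKL",          # placeholder cluster ID
    ManagedScalingPolicy={
        "ComputeLimits": {
            "UnitType": "Instances",
            "MinimumCapacityUnits": 2,          # never scale below 2 nodes
            "MaximumCapacityUnits": 20,         # hard ceiling for the cluster
            "MaximumOnDemandCapacityUnits": 8,  # remaining capacity can be spot
            "MaximumCoreCapacityUnits": 4,      # cap core (HDFS) nodes
        }
    },
)

Capping the on-demand limit below the overall maximum lets the additional scale-out capacity come from cheaper spot instances.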

          6. Always have a cluster termination policy set
          When you add an auto-termination policy to a cluster, you specify the amount of idle time after which the cluster should automatically shut down. This ability allows you to orchestrate cluster cleanup without the need to monitor and manually terminate unused clusters.

          Auto Termination

          You can attach an auto-termination policy when you create a cluster, or add a policy to an existing cluster. To change or disable auto-termination, you can update or remove the policy.
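Programmatically, the same policy can be attached to an existing cluster with a couple of lines of boto3 (the cluster ID below is hypothetical; IdleTimeout is specified in seconds):

import boto3

emr = boto3.client("emr", region_name="us-east-1")

emr.put_auto_termination_policy(
    ClusterId="j-1ABCDEFGHIJKL",                   # placeholder cluster ID
    AutoTerminationPolicy={"IdleTimeout": 3600},   # shut down after 1 hour idle
)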

7. Monitor costs with Cost Explorer

          AWS Cost Management View

To keep your costs within budget, you need to monitor them diligently. One tool that AWS offers here is AWS Cost Explorer, which allows you to visualize, understand, and manage your AWS costs and usage over time.

With Cost Explorer, you can build custom applications to directly access and query your cost and usage data, and build interactive, ad hoc analytics reports at daily or monthly granularity. You can even create a forecast by selecting a future time range for your reports, estimate your AWS bill, and set alarms and budgets based on predictions.
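If you prefer pulling this data programmatically, the Cost Explorer API exposes the same information. The sketch below is illustrative: the dates and tag key are placeholders, and the SERVICE filter value should be verified against your own billing data (for example with ce.get_dimension_values).

import boto3

ce = boto3.client("ce", region_name="us-east-1")

result = ce.get_cost_and_usage(
    TimePeriod={"Start": "2022-07-01", "End": "2022-08-01"},
    Granularity="DAILY",
    Metrics=["UnblendedCost"],
    Filter={"Dimensions": {"Key": "SERVICE",
                           "Values": ["Amazon Elastic MapReduce"]}},
    GroupBy=[{"Type": "TAG", "Key": "team"}],   # assumes a "team" cost tag
)

# Print daily EMR spend per team tag value
for day in result["ResultsByTime"]:
    for group in day["Groups"]:
        print(day["TimePeriod"]["Start"], group["Keys"][0],
              group["Metrics"]["UnblendedCost"]["Amount"])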

          Unravel can help!

Without doubt, AWS helps you manage your EMR clusters and their costs with the pathways listed above. And Cost Explorer is a great tool for monitoring your monthly bill. However, that all comes with a price: you have to spend your precious time checking and monitoring things manually, or writing custom scripts to fetch the data and then running it by your data science and finance ops teams for detailed analysis.

Further, the data that Cost Explorer provides for your EMR cluster costs is not real-time (there is a turnaround delay of up to 24 hours), and it is difficult to see your EMR cluster costs together with the costs of the other services involved. Not to worry, though: there is a better solution available today. Unravel’s DataOps observability product frees you from worrying about EMR cluster management and costs by giving you a real-time, holistic, and fully automated way to manage your clusters.

          AWS EMR Cost Management is made easy with Unravel Data

Although there are many tools offered by AWS as well as by other companies to manage your EMR cluster costs, Unravel stands out by providing a single pane of glass and ease of use.

Unravel provides automated observability for your modern data stack!

          Unravel’s purpose-built observability for modern data stacks helps you stop firefighting issues, control costs, and run faster data pipelines, all monitored and observed via a single pane of glass.

One unique value that Unravel provides is real-time chargeback detail for EMR clusters, where a detailed cost breakdown is provided for the services involved (EMR, EC2, and EBS volumes) for each configured AWS account. In addition, you get a holistic view of your cluster with respect to resource utilization, chargeback, and instance health, along with automated, AI-based cluster cost-saving recommendations and suggestions.

          AWS EMR Monitoring with Unravel’s DataOps Observability

Unravel 4.7.4 can holistically monitor your EMR cluster. It collects and monitors a range of data points for various KPIs and metrics, from which it builds a knowledge base to derive resource and cost-saving insights and recommendations.

          AWS EMR chargeback and showback

          The image below shows the cost breakdown for EMR, EC2 and EBS services

          Chargeback Report EMR

          Monitoring AWS EMR cost trends

For your EMR cluster usage, it is important to see how costs are trending based on your usage and workload size. Unravel helps you understand your costs via its chargeback page. Our agents constantly fetch all the relevant metrics used for analyzing cluster cost usage and resource utilization, showing you an instant chargeback view in real time. These collected metrics are then fed into our AI engine to give you recommended insights.

          The image below shows the cost trends per cluster type, avg costs and total costs

          Cluster Total Cost Diagrams

          EMR Diagram

Complete AWS EMR monitoring insights

          Unravel Insights Screenshot

As seen in the image above, Unravel has analyzed the resource utilization (both memory and CPU) of the clusters and the instance types configured for your cluster. Further, based on the size of your executed workloads, Unravel has come up with a set of recommendations to help you save costs by downsizing your node instance types.

          Do you want to lower your AWS EMR cost?

Avoid overspending on AWS EMR. If you’re not sure how to lower your AWS EMR costs, or simply don’t have time, Unravel’s DataOps observability platform can help you save.
          Schedule a free consultation.
          Create your free account today.
          Watch video: 5 Best Practices for Optimizing Your Big Data Costs With Amazon EMR

          References
          Amazon EMR | AWS Tags | Amazon EC2 instance Types | AWS Cost Explorer | Unravel | Unravel EMR Chargeback | Unravel EMR Insights

          The post Amazon EMR cost optimization and governance appeared first on Unravel.

          Why Legacy Observability Tools Don’t Work for Modern Data Stacks https://www.unraveldata.com/resources/why-legacy-observability-tools-dont-work-for-modern-datastack/ https://www.unraveldata.com/resources/why-legacy-observability-tools-dont-work-for-modern-datastack/#comments Fri, 13 May 2022 13:13:01 +0000 https://www.unraveldata.com/?p=9406


          Whether they know it or not, every company has become a data company. Data is no longer just a transactional byproduct, but a transformative enabler of business decision-making. In just a few years, modern data analytics has gone from being a science project to becoming the backbone of business operations to generate insights, fuel innovation, improve customer satisfaction, and drive revenue growth. But none of that can happen if data applications and pipelines aren’t running well.

          Yet data-driven organizations find themselves caught in a crossfire: their data applications/pipelines are more important than ever, but managing them is more difficult than ever. As more data is generated, processed, and analyzed in an increasingly complex environment, businesses are finding the tools that served them well in the past or in other parts of their technology stack simply aren’t up to the task.

          Modern data stacks are a different animal

Would you want an auto mechanic (no matter how excellent) to diagnose and fix a jet engine problem before you took flight? Of course not. You’d want an aviation mechanic working on it. Even though the basic mechanical principles and symptoms (engine trouble) are similar, automobiles and airplanes are very different under the hood. The same is true with observability for your data application stacks and your web application stacks. The process is similar, but they are totally different animals.

          At first glance, it may seem that the leading APM monitoring and observability tools like Datadog, New Relic, Dynatrace, AppDynamics, etc., do the same thing as a modern data stack observability platform like Unravel. And in the sense that both capture and correlate telemetry data to help you understand issues, that’s true. But one is designed for web apps, while the other for modern data pipelines and applications.

          Observability for the modern data stack is indeed completely different from observability for web (or mobile) applications. They are built and behave differently, face different types of issues for different reasons, requiring different analyses to resolve problems. To fully understand, troubleshoot, and optimize (for both performance and cost) data applications and pipelines, you need an observability platform that’s built from the ground up to tackle the specific complexities of the modern data stack. Here’s why.

          What’s different about modern data applications?

First and foremost, the whole computing framework is different for data applications. Data workloads get broken down into multiple, smaller, often similar parts, each processed concurrently on a separate node, with the results re-combined upon completion: parallel processing. And this happens at each successive stage of the workflow as a whole. Dependencies within data applications/pipelines are deep and layered. It’s crucial that everything (execution time, scheduling and orchestration, data lineage, and layout) be in sync.

          In contrast, web applications are a tangle of discrete request-response services processed individually. Each service does its own thing and operates relatively independently. What’s most important is the response time of each service request and how that contributes to the overall response time of a user transaction. Dependencies within web apps are not especially deep but are extremely broad.

          web apps vs data apps

          Web apps are request-response; data apps process in parallel.

          Consequently, there’s a totally different class of problems, root causes, and remediation for data apps vs. web apps. When doing your detective work into a slow or failed app, you’re looking at a different kind of culprit for a different type of crime, and need different clues (observability data). You need a whole new set of data points, different KPIs, from distinct technologies, visualized in another way, and correlated in a uniquely modern data stack–specific context.

          The flaw with using traditional APM tools for modern data stacks

What organizations who try to use traditional APM for the modern data stack find is that they wind up getting only a tiny fraction of the information they need from a solution like Dynatrace or Datadog or AppDynamics, such as infrastructure and services-level metrics. But over 90% of the information data teams need is buried in places where web APM simply doesn’t go. You need an observability platform specifically designed to dig through all the systems to get this data and then stitch it together into a unified context.

This is where the complexity of modern data applications and pipelines rears its head. The modern data stack is not a single system, but a collection of systems. You might have Kafka or Kinesis or Data Factory for data ingestion, some sort of data lake to store it, then possibly dozens of different components for different types of processing: Druid for real-time processing, Databricks for AI/ML processing, BigQuery or Snowflake for data warehouse processing, another technology for batch processing; the list goes on. So you need to capture deep information horizontally across all the various systems that make up your data stack. But you also need to capture deep information vertically, from the application down to infrastructure and everything in between (pipeline definitions, data usage/lineage, data types and layout, job-level details, execution settings, container-level information, resource allocation, etc.).

Cobbling it together manually via “swivel chair whiplash,” jumping from screen to screen, is a time-consuming, labor-intensive effort that can take hours, even days, for a single problem. And still there’s a high risk that it won’t be completely accurate. There is simply too much data to make sense of, in too many places. Trying to correlate everything on your own, whether by hand or with a homegrown jury-rigged solution, requires two things that are always in short supply: time and expertise. Even if you know what you’re looking for, trolling through hundreds of log files is like looking for a needle in a stack of needles.

          An observability platform purpose-built for the modern data stack can do all that for you automatically. Trying to make traditional APM observe data stacks is simply using the wrong tool for the job at hand.

          DevOps APM vs. DataOps observability in practice

          With the growing complexity in today’s modern data systems, any enterprise-grade observability solution should do 13 things:

1. Capture full-stack data, both horizontally and vertically, from the various systems that make up the stack, including engines, schedulers, services, and the cloud provider
          2. Capture information about all data applications: pipelines, warehousing, ETL/ELT, machine learning models, etc.
          3. Capture information about datasets, lineage, users, business units, computing resources, infrastructure, and more
          4. Correlate, not just aggregate, data collected into meaningful context
          5. Understand all application/pipeline dependencies on data, resources, and other apps
          6. Visualize data pipelines end to end from source to output
          7. Provide a centralized view of your entire data stack, for governance, usage analytics, etc.
          8. Identify baseline patterns and detect anomalies
          9. Automatically analyze and pinpoint root causes
          10. Proactively alert to prevent problems
          11. Provide recommendations and remedies to solve issues
          12. Automate resolution or self-healing
          13. Show the business impact

          While the principles are the same for data app and web app observability, how to go about this and what it looks like are markedly dissimilar.

          Everything starts with the dataand correlating it

          If you don’t capture the right kind of telemetry data, nothing else matters.

          APM solutions inject agents that run 24×7 to monitor the runtime and behavior of applications written in .NET, Java, Node.js, PHP, Ruby, Go, and dozens of other languages. These agents collect data on all the individual services as they snake through the application ecosystem. Then APM stitches together all the data to understand which services the application calls and how the performance of each discrete service call impacts performance of the overall transaction. The various KPIs revolve around response times, availability (up/down, green/red), and the app users’ digital experience. The volume of data to be captured is incredibly broad, but not especially deep.

          Datadog metrics

          APM is primarily concerned with response times and availability. Here, Datadog shows red/green status and aggregated metrics.

          AppDynamic service map

          Here, AppDynamics shows the individual response times for various interconnected services.

          The telemetry details to be captured and correlated for data applications/pipelines, on the other hand, need to be both broad and extremely deep. A modern data workload comprises hundreds of jobs, each broken down into parallel-processing parts, with each part executing various tasks. And each job feeds into hundreds of other jobs and applications not only in this particular pipeline but all the other pipelines in the system.

Today’s pipelines are built on an assortment of distributed processing engines; each might be able to monitor its own application’s jobs but not show you how everything works as a whole. You need to see details at a highly granular level (for each sub-task within each sub-part of each job) and then marry them together into a single pane of glass that comprises the bigger picture at the application, pipeline, platform, and cluster levels.

          app level metrics

          DataOps observability (here, Unravel) looks at completely different metrics at the app level . .

          pipeline metrics

          . . . as well as the pipeline level.

          Let’s take troubleshooting a slow Spark application as an example. The information you need to investigate the problem lives in a bunch of different places, and the various tools for getting this data give you only some of what you need, not all.

          The Spark UI can tell you about the status of individual jobs but lacks infrastructure and configuration details and other information to connect together a full pipeline view. Spark logs help you retrospectively find out what happened to a given job (and even what was going on with other jobs at the same time) but don’t have complete information about resource usage, data partitioning, container configuration settings, and a host of other factors that can affect performance. And, of course, Spark tools are limited to Spark. But Spark jobs might have data coming in from, say, Kafka and run alongside a dozen other technologies.
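As a small illustration of that manual digging, the sketch below pulls stage-level metrics for a running application from Spark's monitoring REST API. The host, port, and field selection are assumptions; a Spark History Server exposes the same API, typically on port 18080.

import requests

BASE = "http://driver-host:4040/api/v1"   # Spark UI REST endpoint (placeholder host)

apps = requests.get(f"{BASE}/applications").json()
app_id = apps[0]["id"]

# List stages that are still running or have failed tasks
stages = requests.get(f"{BASE}/applications/{app_id}/stages").json()
for s in stages:
    if s["status"] == "ACTIVE" or s.get("numFailedTasks", 0) > 0:
        print(s["stageId"], s["name"][:60],
              "runTime(ms):", s.get("executorRunTime"),
              "shuffleRead:", s.get("shuffleReadBytes"),
              "failedTasks:", s.get("numFailedTasks"))

And that is just one system out of many; you would still have to repeat the exercise against scheduler logs, cluster metrics, and platform consoles, which is exactly the swivel-chair work described above.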

Conversely, platform-specific interfaces (Databricks, Amazon EMR, Dataproc, BigQuery, Snowflake) have the information about resource usage and the status of various services at the cluster level, but not the granular details at the application or job level.

Data pipeline details that APM doesn't capture

Having all the information specific to data apps is a great start, but it isn’t especially helpful if it’s not all put into context. The data needs to be correlated, visualized, and analyzed in a purposeful way that lets you get to the information you need easily and immediately.

          Then there’s how data is visualized and analyzed

          Even the way you need to look at a data application environment is different. A topology map for a web application shows dependencies like a complex spoke-and-wheel diagram. When visualizing web app environments, you need to see the service-to-service interrelationships in a map like this:

          Dynatrace map

          How Dynatrace visualizes service dependencies in a topology map.

          With drill-down details on service flows and response metrics:

          Dynatrace drill down

          Dynatrace drill-down details

          For a modern data environment, you need to see how all the pipelines are interdependent and in what order. The view is more like a complex system of integrated funnels:

          pipeline complexity

          A modern data estate involves many interrelated application and pipeline dependencies (Source: Sandeep Uttamchandani)

          You need full observability into not only how all the pipelines relate to one another, but all the dependencies of multiple applications within each pipeline . . .

          visibility into individual applications within data pipeline

          An observability platform purpose-built for modern data stacks provides visibility into all the individual applications within a particular pipeline

          . . . with granular drill-down details into the various jobs within each application. . .

          visibility into all the jobs within each application in a data pipeline

           . . and the sub-parts of each job processing in parallel . . .

          granular visibility into sub-parts of jobs in an application within a data pipeline

          How things get fixed

          Monitoring and observability tell you what’s going on. But to understand why, you need to go beyond observability and apply AI/ML to correlate patterns, identify anomalies, derive meaningful insights, and perform automated root cause analysis. “Beyond observability” is a continuous and incremental spectrum, from understanding why something happened to knowing what to do about it to automatically fixing the issue. But to make that leap from good to great, you need ML models and AI algorithms purpose-built for the task at hand. And that means you need complete data about everything in the environment.

spectrum of automated observability from correlation to self-healing

The best APM tools have some sort of AI/ML-based engine (some are more sophisticated than others) to analyze millions of data points and dependencies, spot anomalies, and alert on them.

          For data applications/pipelines, the type of problems and their root causes are completely different than web apps. The data points and dependencies needing to be analyzed are completely different. The patterns, anomalous behavior, and root causes are different. Consequently, the ML models and AI algorithms need to be different.

In fact, DataOps observability needs to go even further than APM. The size of modern data pipelines and the complexities of multi-layered dependencies (from clusters to platforms and frameworks, to applications, to jobs within applications, to sub-parts of those jobs, to the various tasks within each sub-part) could lead to a lot of trial-and-error resolution effort even if you know what’s happening and why. What you really need to know is what to do.

An AI-driven recommendation engine like Unravel’s goes beyond the standard idea of observability to tell you how to fix a problem. For example, if one container in one part of one job is improperly sized and is causing the entire pipeline to fail, Unravel not only pinpoints the guilty party but tells you what the proper configuration settings would be. Another example: Unravel can tell you exactly why a pipeline is slow and how to speed it up. This is because Unravel’s AI has been trained over many years to understand the specific intricacies and dependencies of modern data stacks.

          AI can identify exactly where and how to optimize performance

          AI recommendations tell you exactly what to do to optimize for performance.
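To show the shape of such a recommendation (and only the shape; this is not Unravel’s actual logic), the sketch below compares a container’s requested memory with its observed peak usage and proposes a new setting with headroom. The 20% headroom factor and the thresholds are assumptions.

```python
# An illustrative right-sizing check: compare requested memory with observed peak
# usage and suggest a setting with a safety margin. Field names are assumptions.
def recommend_container_memory_mb(requested_mb: int, observed_peak_mb: int,
                                  headroom: float = 1.2) -> dict:
    recommended = int(observed_peak_mb * headroom)
    if observed_peak_mb > requested_mb:
        finding = "under-provisioned: likely OOM kills or spill, raise the limit"
    elif recommended < requested_mb * 0.7:
        finding = "over-provisioned: memory can be reduced without risking the job"
    else:
        finding = "sized reasonably"
    return {
        "requested_mb": requested_mb,
        "observed_peak_mb": observed_peak_mb,
        "recommended_mb": recommended,
        "finding": finding,
    }

# A container requested at 8 GB that never peaks above 2.5 GB is wasting capacity.
print(recommend_container_memory_mb(8192, 2560))
```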

          Business impact

Sluggish or broken web applications cost organizations money in terms of lost revenue and customer dissatisfaction. Good APM tools can put a problem into business context by detailing, for instance, how many customer transactions were affected by an application issue.

          As more and more of an organization’s operations and decision-making revolve around data analytics, data pipelines that miss SLAs or fail outright have an increasingly significant (negative) impact on the company’s revenue, productivity, and agility. Businesses must be able to depend on their data applications, so their applications need to have predictable, reliable behavior.

For example: if a bank’s fraud prevention data pipeline stops working, billions of dollars in fraudulent transactions can go undetected. A slow healthcare analysis pipeline may increase risk for patients by failing to provide timely responses. Measuring and optimizing the performance of data applications and pipelines correlates directly with how well the business performs.

Businesses need proactive alerts when pipelines deviate from their normal behavior. Going “beyond observability” also tells them automatically why the deviation is happening and what they can do to get the application back on track, so performance stays reliable and predictable.

There’s also an immediate bottom-line impact that businesses need to consider: maximizing their return on investment and controlling/optimizing cloud spend. Modern data applications process a lot of data, which usually consumes a large amount of resources, and the meter is always running. This means cloud bills can rack up fast.

          To keep costs from spiraling out of control, businesses need actionable intelligence on how best to optimize their data pipelines. An AI recommendations engine can take all the profile and other key information it has about applications and pipelines and identify where jobs are overprovisioned or could be tuned for improvement. For example: optimizing code to remove inefficiencies, right-sizing containers to avoid wastage, providing the best data partitioning based on goals, and much more.

          AI can identify exactly where and how to optimize cloud costs

          AI recommendations pinpoint exactly where and how to optimize for cost.
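The arithmetic behind spotting overprovisioned jobs can be sketched very simply: price the gap between the capacity a job reserves and what it actually uses. The utilization figures and the per-vCore-hour rate below are placeholders, not real cloud prices or real workloads.

```python
# A rough sketch of the "where are we overspending" arithmetic: price the idle
# capacity a job reserves but does not use. All numbers below are placeholders.
def wasted_spend_per_run(vcores_requested: int, avg_vcores_used: float,
                         runtime_hours: float, price_per_vcore_hour: float) -> float:
    idle_vcores = max(vcores_requested - avg_vcores_used, 0.0)
    return idle_vcores * runtime_hours * price_per_vcore_hour

jobs = [
    {"name": "nightly_etl", "req": 200, "used": 60.0, "hours": 3.0},
    {"name": "feature_build", "req": 64, "used": 58.0, "hours": 1.5},
]
for j in jobs:
    waste = wasted_spend_per_run(j["req"], j["used"], j["hours"], price_per_vcore_hour=0.05)
    print(f'{j["name"]}: ~${waste:.2f} wasted per run')
```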

          AI recommendations and deep insights lay the groundwork for putting in place some automated cost guardrails for governance. Governance is really all about converting the AI recommendations and insights into impact. Automated guardrails (per user, app, business unit, project) would alert operations teams about unapproved spend, potential budget overruns, jobs that run over a certain time/cost threshold, and the like. You can then proactively manage your budget, rather than getting hit with sticker shock after the fact.
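A guardrail of that kind can start out as simply as the sketch below: given month-to-date spend for a user, app, or business unit, project the end-of-month total and raise an alert before the budget is blown. The scope names, budgets, and thresholds are assumptions for illustration.

```python
# A minimal budget guardrail: alert on overruns, projected overruns, and near-misses.
from datetime import date
import calendar

def guardrail_check(scope: str, spend_to_date: float, monthly_budget: float,
                    today: date, warn_at: float = 0.8) -> list:
    alerts = []
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    projected = spend_to_date / today.day * days_in_month  # naive straight-line projection
    if spend_to_date > monthly_budget:
        alerts.append(f"{scope}: budget exceeded (${spend_to_date:,.0f} of ${monthly_budget:,.0f})")
    elif projected > monthly_budget:
        alerts.append(f"{scope}: on track to overrun (projected ${projected:,.0f})")
    elif spend_to_date > warn_at * monthly_budget:
        alerts.append(f"{scope}: past {warn_at:.0%} of monthly budget")
    return alerts

# Halfway through the month, this business unit is already projected to overrun.
print(guardrail_check("data-science BU", 42_000, 60_000, date(2022, 1, 14)))
```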

          In a nutshell

Application monitoring and observability solutions like Datadog, Dynatrace, and AppDynamics are excellent tools for web applications. Their telemetry, correlation, anomaly detection, and root cause analysis capabilities do a good job of helping you understand, troubleshoot, and optimize most areas of your digital ecosystem, the one exception being the modern data stack. They are built, by design, for general-purpose observability of user interactions.

In contrast, an observability platform for the modern data stack like Unravel is more specialized. Its telemetry, correlation, anomaly detection, and root cause analysis capabilities (and, uniquely in Unravel’s case, its AI-powered remediation recommendations, automated guardrails, and automated remediation) are built, by design, specifically to understand, troubleshoot, and optimize modern data workloads.

          Observability is all about context. Traditional APM provides observability in context for web applications, but not for data applications and pipelines. That’s not a knock on these APM solutions. Far from it. They do an excellent job at what they were designed for. They just weren’t built for observability of the modern data stack. That requires another kind of solution designed specifically for a different kind of animal.

          The post Why Legacy Observability Tools Don’t Work for Modern Data Stacks appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/why-legacy-observability-tools-dont-work-for-modern-datastack/feed/ 1
          Building vs. Buying Your Modern Data Stack: A Panel Discussion https://www.unraveldata.com/resources/building-vs-buying-your-modern-data-stack/ https://www.unraveldata.com/resources/building-vs-buying-your-modern-data-stack/#respond Thu, 21 Apr 2022 19:55:09 +0000 https://www.unraveldata.com/?p=9214 Abstract Infinity Loop Background

          One of the highlights of the DataOps Unleashed 2022 virtual conference was a roundtable panel discussion on building versus buying when it comes to your data stack. Build versus buy is a question for all layers […]

          The post Building vs. Buying Your Modern Data Stack: A Panel Discussion appeared first on Unravel.

          ]]>
          Abstract Infinity Loop Background

One of the highlights of the DataOps Unleashed 2022 virtual conference was a roundtable panel discussion on building versus buying when it comes to your data stack. Build versus buy is a question for all layers of the enterprise infrastructure stack. But in the last five years, and even in just the last year alone, it’s hard to think of a part of IT that has seen more dramatic change than the modern data stack.

          DataOps Unleashed build vs buy panel

These transformations shape how today’s businesses engage and work with data. Moderated by Lightspeed Venture Partners’ Nnamdi Iregbulem, the panel’s three conversation partners (Andrei Lopatenko, VP of Engineering at Zillow; Gokul Prabagaren, Software Engineering Manager at Capital One; and Aaron Richter, Data Engineer at Squarespace) weighed in on the build-versus-buy question and walked us through their thoughts:

          • What motivates companies to build instead of buy?
          • How do particular technologies and/or goals affect their decision?

          These issues and other considerations were discussed. A few of the highlights follow, but the entire session is available on demand here.

          What are the key variables to consider when deciding whether to build or buy in the data stack? 

          Gokul: I think the things which we probably consider most are what kind of customization a particular product offers or what we uniquely need. Then there are the cases in which we may need unique data schemas and formats to ingest the data. We must consider how much control we have of the product and also our processing and regulatory needs. We have to ask how we will be able to answer those kinds of questions if we are building in-house or choosing to adopt an outsourced product.

          Gokul Prabagaren quote build vs buy

          Aaron: Thinking from the organizational perspective, there are a few factors that come from just purchasing or choosing to invest in something. Money is always a factor. It’s going to depend on the organization and how much you’re willing to invest. 

          Beyond that a key factor is the expertise of the organization or the team. If a company has only a handful of analysts doing the heavy-lifting data work, to go in and build an orchestration tool would take them away from their focus and their expertise of providing insights to the business. 

          Andrei: Another important thing to consider is the quality of the solution. Not all the data products on the market have high quality from different points of view. So sometimes it makes sense to build something, to narrow the focus of the product. Compatibility with your operations environment is another crucial consideration when choosing build versus buy.

          What’s the more compelling consideration: saving headcount or increasing productivity of the existing headcount? 

          Aaron: In general, everybody’s oversubscribed, right? Everybody always has too much work to do. And we don’t have enough people to accomplish that work. From my perspective, the compelling part is, we’re going to make you more efficient, we’re going to give you fewer headaches, and you’ll have fewer things to manage. 

          Gokul: I probably feel the same. It depends more on where we want to invest and if we’re ready to change where we’re investing: upfront costs or running costs. 

          Andrei: And development costs: do we want to buy this, or invest in building? And again, consider the human equation. It’s not just the number of people in your headcount. Maybe you have a small number of engineers, but then you have to invest more of their time into data science or data engineering or analytics. Saving time is a significant factor when making these choices.

          Andrei Lopatenko quote build vs buy

          How does the decision matrix change when the cloud becomes part of the consideration set in terms of build versus buy? 

          Gokul: I feel like it’s trending towards a place where it’s more managed. That may not be the same question as build or buy. But it skews more towards the manage option, because of that compatibility, where all these things are available within the same ecosystem. 

Aaron: I think about it in terms of three pillars: a cloud data warehouse, some kind of processing tool like dbt, and then some kind of orchestration tool like Airflow or Prefect. There’s probably one pillar there that you would never think to build yourself, and that’s the cloud data warehouse. So you’re now kind of always going to be paying for a cloud vendor, whether it’s Snowflake or BigQuery or something of that nature.

          Aaron Richter quote build vs buy 

          So you already have your foot in the door there, and you’re already buying, right? So then that opens the door now to buying more things, adding things on that integrate really easily. This approach helps the culture shift. If a culture is very build-oriented, this allows them to be more okay with buying things. 

Andrei: Theoretically you want your infrastructure to be cloud-independent, but it never happens, for multiple reasons. First, cloud providers’ tools make integration work much easier. Second, of course, once you have to think about multi-cloud, you must address privacy and security concerns. In principle it’s possible to be independent, but you’ll often run into a lot of technical problems. There are multiple factors when the cloud becomes key in deciding what you will build and what tools to use.

          See the entire Build vs. Buy roundtable discussion on demand
          Watch now

          The post Building vs. Buying Your Modern Data Stack: A Panel Discussion appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/resources/building-vs-buying-your-modern-data-stack/feed/ 0
          Managing Costs for Spark on Amazon EMR https://www.unraveldata.com/resources/managing-costs-for-spark-on-amazon-emr/ https://www.unraveldata.com/resources/managing-costs-for-spark-on-amazon-emr/#respond Tue, 28 Sep 2021 20:43:11 +0000 https://www.unraveldata.com/?p=8102

          The post Managing Costs for Spark on Amazon EMR appeared first on Unravel.

          ]]>

          https://www.unraveldata.com/resources/managing-costs-for-spark-on-amazon-emr/feed/ 0
          Managing Costs for Spark on Databricks https://www.unraveldata.com/resources/managing-costs-for-spark-on-databricks/ https://www.unraveldata.com/resources/managing-costs-for-spark-on-databricks/#respond Fri, 17 Sep 2021 20:51:08 +0000 https://www.unraveldata.com/?p=8105

          The post Managing Costs for Spark on Databricks appeared first on Unravel.

          ]]>

          https://www.unraveldata.com/resources/managing-costs-for-spark-on-databricks/feed/ 0
          Managing Cost & Resources Usage for Spark https://www.unraveldata.com/resources/managing-cost-resources-usage-for-spark/ https://www.unraveldata.com/resources/managing-cost-resources-usage-for-spark/#respond Wed, 08 Sep 2021 20:55:29 +0000 https://www.unraveldata.com/?p=8107

          The post Managing Cost & Resources Usage for Spark appeared first on Unravel.

          ]]>

          https://www.unraveldata.com/resources/managing-cost-resources-usage-for-spark/feed/ 0
          Effective Cost and Performance Management for Amazon EMR https://www.unraveldata.com/resources/effective-cost-and-performance-management-for-amazon-emr/ https://www.unraveldata.com/resources/effective-cost-and-performance-management-for-amazon-emr/#respond Wed, 28 Apr 2021 22:06:57 +0000 https://www.unraveldata.com/?p=8132

          The post Effective Cost and Performance Management for Amazon EMR appeared first on Unravel.

          ]]>

          https://www.unraveldata.com/resources/effective-cost-and-performance-management-for-amazon-emr/feed/ 0
          Moving Big Data and Streaming Data Workloads to Google Cloud Platform https://www.unraveldata.com/resources/moving-big-data-and-streaming-data-workloads-to-google-cloud-platform/ https://www.unraveldata.com/resources/moving-big-data-and-streaming-data-workloads-to-google-cloud-platform/#respond Fri, 05 Feb 2021 22:24:18 +0000 https://www.unraveldata.com/?p=8152

          The post Moving Big Data and Streaming Data Workloads to Google Cloud Platform appeared first on Unravel.

          ]]>

          https://www.unraveldata.com/resources/moving-big-data-and-streaming-data-workloads-to-google-cloud-platform/feed/ 0
          Cost-Effective, High-Performance Move to Cloud https://www.unraveldata.com/resources/cost-effective-high-performance-move-to-cloud/ https://www.unraveldata.com/resources/cost-effective-high-performance-move-to-cloud/#respond Thu, 05 Nov 2020 21:38:49 +0000 https://www.unraveldata.com/?p=5484

          The post Cost-Effective, High-Performance Move to Cloud appeared first on Unravel.

          ]]>

          https://www.unraveldata.com/resources/cost-effective-high-performance-move-to-cloud/feed/ 0
          Optimizing big data costs with Amazon EMR & Unravel https://www.unraveldata.com/resources/amazon-emr-insider-series-optimizing-big-data-costs-with-amazon-emr-unravel/ https://www.unraveldata.com/resources/amazon-emr-insider-series-optimizing-big-data-costs-with-amazon-emr-unravel/#respond Sat, 25 Jul 2020 19:16:35 +0000 https://www.unraveldata.com/?p=5035

          The post Optimizing big data costs with Amazon EMR & Unravel appeared first on Unravel.

          ]]>

          https://www.unraveldata.com/resources/amazon-emr-insider-series-optimizing-big-data-costs-with-amazon-emr-unravel/feed/ 0
          EMR Cost Optimization https://www.unraveldata.com/resources/emr-cost-optimization/ https://www.unraveldata.com/resources/emr-cost-optimization/#respond Wed, 22 Jul 2020 19:17:56 +0000 https://www.unraveldata.com/?p=5038

          The post EMR Cost Optimization appeared first on Unravel.

          ]]>

          https://www.unraveldata.com/resources/emr-cost-optimization/feed/ 0
          Accelerate and Reduce Costs of Migrating Data Workloads to the Cloud https://www.unraveldata.com/accelerate-and-reduce-costs-of-migrating-data-workloads-to-the-cloud/ https://www.unraveldata.com/accelerate-and-reduce-costs-of-migrating-data-workloads-to-the-cloud/#respond Wed, 31 Jul 2019 19:40:29 +0000 https://www.unraveldata.com/?p=3556

          Today, Unravel announced a new cloud migration assessment offer to accelerate the migration of data workloads to Microsoft Azure, Amazon AWS, or Google Cloud Platform. Our latest offer fills a significant gap in the cloud journey, […]

          The post Accelerate and Reduce Costs of Migrating Data Workloads to the Cloud appeared first on Unravel.

          ]]>

Today, Unravel announced a new cloud migration assessment offer to accelerate the migration of data workloads to Microsoft Azure, Amazon AWS, or Google Cloud Platform. Our latest offer fills a significant gap in the cloud journey, equips enterprises with the tools to deliver on their cloud strategy, and provides the best possible transition with insights and guidance before, during, and after migration. Full details on the assessment and its business value are in our announcement below.

          So, why now?

The rapid increase in data volume and variety has driven organizations to rethink enterprise infrastructures and focus on longer-term data growth, flexibility, and cost savings. Current on-prem solutions are too complicated and inflexible, and they are not delivering expected value. Data is not living up to its promise.

As an alternative, organizations are looking to cloud services like Azure, AWS, and Google Cloud to provide the flexibility to accommodate modern capacity requirements and elasticity. Unfortunately, organizations are often challenged by unexpected costs and a lack of data and insights to ensure a successful migration process. Left unaddressed, these challenges leave organizations struggling with complex projects that don’t fulfill expectations and frequently result in significant cost overruns.

          The cloud migration assessment offer provides details of the source environment and applications running on it, identifies workloads suitable for the cloud, and computes the anticipated hourly costs. It offers granular metrics, as well as broader insights, that eliminate transition complexity and deliver migration success.

          Customers can be confident that they’re migrating the right data apps, configuring them properly in the cloud, meeting performance service level agreements, and minimizing costs. Unravel can provide an alternative to what is frequently a manual effort fraught with guesswork and errors.

The two approaches can be characterized per the diagram below:

          Still unsure how the migration assessment will provide value to your business? Drop us a line to learn more about the offer – or download a sample cloud migration assessment report here.

          ————-

          Read on to learn more about today’s news from Unravel.

          Unravel Introduces Cloud Migration Assessment Offer to Reduce Costs and Accelerate the Transition of Data Workloads to Azure, AWS or Google Cloud

          New Offer Builds a Granular Dependency Map of On-Premises Data Workloads and Provides Detailed Insights and Recommendations for the Best Transition to Cloud

PALO ALTO, Calif. – July 31, 2019 – Unravel Data, the only data operations platform providing full-stack visibility and AI-powered recommendations to drive more reliable performance in modern data applications, today announced a new cloud migration assessment offer to help organizations move data workloads to Azure, AWS, or Google Cloud faster and at lower cost. Unravel has built a goal-driven and adaptive solution that uniquely provides comprehensive details of the source environment and the applications running on it, identifies workloads suitable for the cloud, determines the optimal cloud topology based on business strategy, and computes the anticipated hourly costs. The offer also provides actionable recommendations to improve application performance and enables cloud capacity planning and chargeback reporting, as well as other critical insights.

“Managing the modern data stack on-premises is complex and requires expert technical talent to troubleshoot most problems. That’s why more enterprises are moving their data workloads to the cloud, but the migration process isn’t easy, as there’s little visibility into costs and configurations,” said Kunal Agarwal, CEO, Unravel Data. “Unravel’s new cloud migration assessment offer delivers actionable insights and visibility so organizations no longer have to fly blind. No matter where an organization is in its cloud adoption and migration journey, now is the time to accelerate strategic thinking and execution, and this offering ensures the fastest, most cost-effective and valuable transition for the full journey-to-cloud lifecycle.”

          “Companies have major expectations when they embark on a journey to the cloud. Unfortunately, organizations that migrate manually often don’t fulfill these expectations as the process of transitioning to the cloud becomes more difficult and takes longer than anticipated. And then once there, costs rise higher than forecasted and apps are difficult to optimize,” said Enterprise Strategy Group senior analyst Mike Leone. “This all results from the lack of insight into their existing data apps on-premises and how they should map those apps to the cloud. Unravel’s new offer fills a major gap in the cloud journey, equipping enterprises with the tools to deliver on their cloud goals.”

          The journey to cloud is technically complex and aligning business outcomes with a wide array of cloud offerings can be challenging. Unravel’s cloud migration assessment offer takes the guesswork and error-prone manual processes out of the equation to deliver a variety of critical insights. The assessment enables organizations to:

          • Discover current clusters and detailed usage to make an effective and informed move to the cloud
          • Identify and prioritize specific application workloads that will benefit most from cloud-native capabilities, such as elastic scaling and decoupled storage
• Define the optimal cloud topology that matches specific goals and business strategy, minimizing risks or costs. Users get specific instance type recommendations and the amount of storage needed, with the option to choose between locally attached and object storage
          • Obtain the hourly costs expected to incur when moving to the cloud, allowing users to compare and contrast the costs for different cloud providers and services and for different goals
• Compare costs for different cloud options (across IaaS and Managed Hadoop/Spark PaaS services), including the ability to override default on-demand prices to incorporate volume discounts users may have received (a simplified cost-comparison sketch follows this list)
          • Optimize cloud storage tiering choices for hot, warm, and cold data
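As a simplified illustration of the cost comparison mentioned in the list above, the sketch below sums instance counts times hourly prices and applies an optional discount override. The node roles and prices are placeholders, not actual provider rates and not the assessment’s pricing model.

```python
# A simplified hourly-cost comparison: instance counts times hourly prices, with an
# optional volume-discount override. All prices and node roles are placeholders.
def hourly_cost(node_counts: dict, price_per_hour: dict, discount: float = 0.0) -> float:
    base = sum(count * price_per_hour[node] for node, count in node_counts.items())
    return base * (1.0 - discount)

cluster = {"master": 1, "worker": 20}
quotes = {
    "provider_a": {"master": 0.27, "worker": 0.27},
    "provider_b": {"master": 0.29, "worker": 0.25},
}
for provider, prices in quotes.items():
    print(provider, round(hourly_cost(cluster, prices, discount=0.15), 2), "USD/hour")
```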

The Unravel cloud assessment service encompasses four phases. The first phase is a discovery meeting in which the project is scoped, stakeholders are identified, and KPIs are defined. During technical discovery, Unravel works with customers to define use cases, install the product, and begin gathering workload data. Next comes the initial readout, in which enterprises receive a summary of their infrastructure and workloads along with fresh insights and recommendations for cloud migration. Finally comes the completed assessment, including final insights, recommendations, and next steps.

          Unravel is building a rapidly expanding ecosystem of partners to provide a portfolio of data operations and migration services utilizing the Unravel Data Operations Platform and cloud migration assessment offer.

          Enterprises can find a sample cloud migration assessment report here.

          About Unravel Data
          Unravel radically simplifies the way businesses understand and optimize the performance of their modern data applications – and the complex pipelines that power those applications. Providing a unified view across the entire stack, Unravel’s data operations platform leverages AI, machine learning, and advanced analytics to offer actionable recommendations and automation for tuning, troubleshooting, and improving performance – both today and tomorrow. By operationalizing how you do data, Unravel’s solutions support modern data stack leaders, including Kaiser Permanente, Adobe, Deutsche Bank, Wayfair, and Neustar. The company is headquartered in Palo Alto, California, and is backed by Menlo Ventures, GGV Capital, M12, Point72 Ventures, Harmony Partners, Data Elite Ventures, and Jyoti Bansal. To learn more, visit unraveldata.com.

          Copyright Statement
          The name Unravel Data is a trademark of Unravel Data™. Other trade names used in this document are the properties of their respective owners.

          PR Contact
          Jordan Tewell, 10Fold
          unravel@10fold.com
          1-415-666-6066

           

          The post Accelerate and Reduce Costs of Migrating Data Workloads to the Cloud appeared first on Unravel.

          ]]>
          https://www.unraveldata.com/accelerate-and-reduce-costs-of-migrating-data-workloads-to-the-cloud/feed/ 0