Author: Diggibyte_57
-
Handling CDC in Databricks: Custom MERGE vs. DLT APPLY CHANGES
Change data capture (CDC) is crucial for keeping data lakes synchronized with source systems. Databricks supports CDC through two main approaches: a custom MERGE operation (Spark SQL or PySpark) and Delta Live Tables (DLT) APPLY CHANGES, a declarative CDC API. This blog explores both methods and their trade-offs, and demonstrates best practices for production-grade pipelines in Databricks. Custom…
-
Reducing Dataset Size Without Sacrificing Insight
Introduction: Power BI empowers organizations to turn data into actionable insights. However, as datasets grow in complexity and volume, performance bottlenecks, slow report loads, and publishing restrictions can arise. One common misconception is that reducing dataset size means compromising on analysis or losing valuable data. But with smart modeling and design strategies, you can shrink…
-
Why Database Lookups Are Slow—and How Bloom Filters Supercharge Performance
High-scale applications like Twitter, LinkedIn, and Gmail handle millions of users every day. Making sure everything runs smoothly is important, especially for simple tasks like checking if a username is available. But with so many users, even small tasks can slow down the system. This is where Bloom Filters come in. They’re a smart, memory-efficient…
-
Understanding Instant Backoff vs Exponential Backoff
When working with systems and networks, temporary failures are inevitable. Services may go down briefly, servers can become overloaded, or network connections may be disrupted. Instead of giving up on the operation, applications usually retry. But how and when to retry makes a huge difference in performance, stability, and reliability. Two widely used retry strategies…
-
Strategy One: The Power of AI + BI in One Unified Platform
In today’s data-driven world, businesses need tools that not only analyse data but also communicate insights intuitively and intelligently. Strategy One delivers on this need by combining the precision of traditional Business Intelligence (BI) with the cognitive capabilities of Artificial Intelligence (AI), providing a truly unified, scalable, and intelligent platform. Let’s explore how this synergy…
-
Simplifying Flutter State Management with Provider, Riverpod & Bloc
In Flutter, state management plays a vital role in how an app behaves, responds, and scales. As the UI rebuilds based on user actions or data changes, managing that state effectively becomes critical. With the growing popularity of Flutter, several libraries and patterns have emerged to manage state. Among them, Provider, Riverpod, and Bloc…
-
Understanding RELATED and RELATEDTABLE Functions in Power BI
Data modeling is a foundational skill in Power BI, and mastering DAX functions that operate across related tables is essential for creating powerful and efficient reports. Two of the most useful functions for working with relationships in Power BI are RELATED and RELATEDTABLE. In this blog, we will explore what these functions do, when to…
-
Building Modern Angular Apps with PrimeNG 18 and Angular 18
When Angular 18 was released, developers were excited about its performance boosts and stability improvements. But what really takes app development to the next level is PrimeNG 18 — a rich UI component library built specifically for Angular. If you’re tired of reinventing UI elements or piecing together multiple libraries, PrimeNG gives you everything in…
-
Top 8 Benefits of Data Science Services for Businesses
Today, businesses no longer rely on traditional reports or gut feelings to make good decisions. Instead, data science services are transforming how organisations function, innovate and optimise. By leveraging the power of artificial intelligence, statistical modelling and machine learning, businesses get real-time insights that were previously impossible. Partnering with a data science services company offers…
-
End-to-End Ingestion of 400+ MySQL Tables with Databricks Delta Live Tables
Ingesting and managing data from more than 400 MySQL tables on recurring schedules is a complex challenge. Traditional approaches often lead to pipelines that are difficult to scale, hard to maintain, and prone to failure when handling schema changes or scheduling dependencies. To address these challenges, we designed and implemented a configuration-driven ingestion framework using…