Aleksandar Brkljac
How to Optimize Snowflake Costs and Boost Performance with Caching
Want to drive up efficiency and cut down costs in your Snowflake setup? Here’s a secret: use caching. Reusing results from similar queries…
Jun 22, 2023

Aleksandar Brkljac in Towards Dev
Efficient Data Analysis with Snowflake’s Data Sampling Techniques
Data is the new gold, but it can also be overwhelmingly abundant. When you’re dealing with large datasets, extracting meaningful insights…
May 29, 2023

Aleksandar Brkljac
Exploring Snowflake’s Fail-safe Feature for Data Security
In the world of cloud-based data warehousing, Snowflake stands out with its unique features aimed at providing high performance…
May 25, 2023

Aleksandar Brkljac
Solving the NoSuchMethodError Exception in Spark-Minio Integration
When integrating Apache Spark and Minio in a local environment, you may encounter an unusual exception that stops you in your tracks. In…
May 15, 2023

Aleksandar Brkljac
Building a Real-Time Traffic Monitoring Pipeline with Spark Streaming, Kafka, and Time-Series DB
I’ll walk you through a project I recently worked on — a data pipeline for traffic monitoring, incorporating both batch and real-time data…
May 12, 2023

Aleksandar Brkljac
Unlock the Secrets of Big Data: Avro, Parquet, and ORC Exposed!
Let’s explore the world of big data file formats as we dive deep into Apache Avro, Parquet, and ORC, uncovering their unique features…
Apr 11, 2023

Aleksandar Brkljac
Optimizing Data Pipelines with Apache Spark and Delta Lake: A Practical Guide for Data Engineers
As data grows in size and complexity, it becomes increasingly essential for data engineers to optimize their data pipelines to ensure fast…
Mar 19, 2023

Aleksandar Brkljac
Unleashing the Power of Nested JSON: A Guide to Transforming JSON Data into CSV
In this article, I share my recent experience of wrangling a complex nested JSON structure into a neat and tidy CSV format. Join me on a…
Feb 24, 2023

Aleksandar Brkljac
Building a Resilient Data Infrastructure: Best Practices for Fault-Tolerant Systems
In this article, we’ll explore some of the key principles of fault tolerance in data engineering, and show how they can be implemented…
Feb 15, 2023

Aleksandar Brkljac
AWS Glue vs Spark on Kubernetes — Which One is Right for You?
Decide between AWS Glue and Spark on Kubernetes for ETL with ease. I compared their features, benefits, and use cases in this…
Feb 10, 2023