News Aggregator


From HTTP to Kafka: A Custom Source Connector

Aggregated on: 2025-09-10 15:14:52

Recently, I came across an interesting scenario: one application had a cron job constantly polling an API for active offers, just to refresh a Redis cache that powered the offer view. So, I started thinking—isn’t there a better way to handle this? Or at least a way to offload such repetitive tasks outside the application itself? That’s when it hit me: this pattern looks way too similar to the CDC flows we already implement with Kafka Connect JDBC source connectors. So why not apply the same idea to HTTP? After a bit of digging, I found the answer was yes—we can definitely do it. But there’s a catch. The official Confluent HTTP source connector requires a license, and most open-source alternatives are either too complex or don’t quite match the use case.

Vibe Coding: Conversational Software Development — Part 4: Guiding AI Through Iteration

Aggregated on: 2025-09-10 14:14:52

Welcome to the fourth and final post in my Vibe Coding series. In the previous article, I explained how system prompts can steer AI behaviour by setting initial expectations and boundaries. But if you have worked on even a mildly complex application, you will know that the first draft is never the final version. There are always UX tweaks, performance enhancements, or new feature requirements. That is where task-level prompting comes into play. Once your foundation is in place, the next step is to give the AI instructions to build, improve, and polish step by step. In this blog post, I will share several approaches that have worked well for me.

Using Arrow Flight SQL to Improve Data Transfer Performance in Apache Doris

Aggregated on: 2025-09-10 13:14:52

Data analyst Xiao Hua rubs his sore eyes, staring blankly at the computer screen. He can't help but complain, "This data export is so slow!" Indeed, waiting for the MySQL protocol to transfer large volumes of data feels like trying to drink a barrel of water through a straw: when will it ever end?

An Introduction to Artificial Intelligence: Neural Networks, NLP, and Word Embeddings

Aggregated on: 2025-09-10 12:14:52

Part 1: The Very Basics
How Does It Really Work?
In recent years, artificial intelligence (AI) has become a buzzword, especially with the emergence of tools like ChatGPT. However, despite the widespread conversation about AI, not everyone fully understands what it is. AI tools today can process text, generate images and videos, write code, and automate tasks. Some enthusiasts predict a future where AI replaces programmers, creates entire products independently, and potentially leaves many of us without jobs. While these apocalyptic scenarios seem exaggerated to me, I believe AI is best viewed as a powerful tool that, when used wisely, can make us more efficient and productive. For example, AI can generate small pieces of code, understand simple or vague questions, and provide quick answers, almost like a more advanced version of Google. It can feel like magic, but it’s not (or at least, not entirely).

Tuples and Records (Part 3): Potential ECMAScript Proposals

Aggregated on: 2025-09-10 11:14:52

In Part 1, we introduced JavaScript’s Tuples and Records, highlighting their role as immutable data structures that bring predictability, performance, and safety into everyday development. In Part 2, we explored migration strategies, covering how to transition existing codebases, reduce reliance on third-party libraries, and adopt Tuples and Records incrementally. Now, in Part 3, we’ll look beyond today’s capabilities and explore what’s on the horizon. While Tuples and Records already provide strong foundations, their current feature set is intentionally minimal. To make them even more useful in real-world applications, the JavaScript community is actively discussing potential ECMAScript proposals and speculative enhancements.

Using ChartMuseum as a Helm Repository

Aggregated on: 2025-09-09 20:14:51

ChartMuseum is an open-source, self-hosted Helm Chart repository server that enables users to store and manage Helm charts efficiently. Helm is the standard package manager for Kubernetes, allowing developers to deploy applications seamlessly. While Helm provides public repositories like Artifact Hub, organizations often require private and secure repositories for managing their Helm charts internally. ChartMuseum fills this gap by offering a lightweight and flexible solution. ChartMuseum provides a robust API that allows users to interact with it programmatically, making it an essential tool for automated CI/CD pipelines. It is written in Go and can be deployed as a standalone binary, within a container, or as a Kubernetes deployment.

Tutorial: RAG at Scale With Vector Databases vs. Lakehouse Architectures

Aggregated on: 2025-09-09 19:14:51

Retrieval-Augmented Generation (RAG) is quickly becoming the standard enterprise pattern for deploying large language models (LLMs). Instead of relying solely on pretraining, RAG enriches prompts with fresh, domain-specific information. The result? More accurate answers, fewer hallucinations, and outputs that enterprises can trust. But building RAG at enterprise scale is tricky. You’re not embedding a few PDFs anymore—you’re embedding billions of rows from databases, log streams, or knowledge repositories. That leads to a critical architectural question:

Understanding Table Statistics in SQL Server: Importance, Performance Impact, and Practical Examples

Aggregated on: 2025-09-09 18:14:51

In SQL Server, table statistics are metadata objects that store information about the data distribution within one or more columns of a table. These statistics are crucial for the query optimizer, which uses them to estimate the number of rows that a query's predicates will return. This estimation, known as cardinality estimation, is the foundation of a good execution plan. For example, if a query filters on a column with a skewed data distribution (i.e., some values appear much more frequently than others), the optimizer can use statistics to choose a more efficient access method, such as a clustered index scan over a non-clustered index seek, avoiding a costly lookup operation. As a DBA, I believe keeping statistics up to date is paramount for maintaining optimal query performance. Stale statistics, which don't accurately reflect changes in the underlying data (e.g., due to frequent INSERT, UPDATE, or DELETE operations), can lead the optimizer to make poor cardinality estimates. This results in inefficient execution plans that could use more system resources and run longer than necessary. Regularly updating statistics either manually with UPDATE STATISTICS or by allowing the database engine's automatic update feature (AUTO_UPDATE_STATISTICS) to do so ensures the query optimizer has the most accurate information available. This proactive maintenance helps prevent performance degradation and ensures that queries continue to run efficiently as the data in the database evolves.
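The effect of stale statistics on plan choice can be illustrated with a toy model. The sketch below is not SQL Server's estimator or cost model: the per-value frequency snapshot, the function names, and the 20% `scan_threshold` are all assumptions made up for illustration. It only shows how a plan decision made from an old snapshot can differ from one made from current data.

```python
from collections import Counter


def build_statistics(values):
    """Snapshot per-value row counts, a stand-in for a statistics histogram."""
    return Counter(values), len(values)


def estimate_rows(stats, value):
    """Cardinality estimate: how many rows the snapshot expects for value."""
    frequencies, _ = stats
    return frequencies.get(value, 0)


def choose_access_method(stats, value, scan_threshold=0.2):
    """Pick 'scan' for common values and 'seek' for rare ones.

    scan_threshold is an arbitrary illustrative cutoff, not a real cost model.
    """
    _, total_rows = stats
    if total_rows == 0:
        return "seek"
    selectivity = estimate_rows(stats, value) / total_rows
    return "scan" if selectivity >= scan_threshold else "seek"


# Skewed column: 'shipped' dominates, 'pending' is rare.
status_column = ["shipped"] * 950 + ["pending"] * 50
stale_stats = build_statistics(status_column)

# The data changes, but the old snapshot is not refreshed.
status_column += ["pending"] * 900
fresh_stats = build_statistics(status_column)

for label, stats in (("stale", stale_stats), ("fresh", fresh_stats)):
    print(label, estimate_rows(stats, "pending"), choose_access_method(stats, "pending"))
```

With the stale snapshot the model still expects 50 'pending' rows and picks a seek, while the refreshed snapshot expects 950 and picks a scan, which is the kind of mismatch that updating statistics is meant to prevent.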

Blockchain-Based Authentication: The Future of Secure Identity Verification

Aggregated on: 2025-09-09 17:14:51

Traditional authentication methods (passwords, centralized databases, and third-party identity providers) are plagued by security breaches, identity theft, and data privacy concerns. Blockchain-based authentication offers a decentralized, tamper-proof, and more secure alternative. In this deep dive, we’ll explore:

How to Use Jenkins Effectively With ECS/EKS Cluster

Aggregated on: 2025-09-09 16:14:51

In modern DevOps workflows, Jenkins is a cornerstone of continuous integration and continuous deployment (CI/CD) pipelines. Because of its flexibility and wide range of available plugins, it is indispensable for automating build, test, and deployment processes. AWS, for its part, provides Elastic Container Service (ECS) and Elastic Kubernetes Service (EKS) as powerful managed services for deploying and managing containerized applications. This article explores how Jenkins can be effectively integrated with both ECS and EKS clusters to optimize the CI/CD process.

Probably Secure: A Look at the Security Concerns of Deterministic vs Probabilistic Systems

Aggregated on: 2025-09-09 15:14:51

Would you rather have determined that you are in fact secure, or are you willing to accept that you are "probably" doing things securely? This might seem like a silly question on the surface. After all, audits don't work on probability; they work on documented facts. We are in an era of rapidly advancing artificial intelligence-driven applications across our organizations, covering a wide variety of use cases. We all love LLMs. There are numerous benefits to applying AI effectively, but like any technology, if we misapply it, we risk opening the door to new dangers. As we further adopt these tools and apply AI in novel ways to our workflows across all departments, it is likely a good idea to take a step back and think through how these systems work, which boils down to most likely getting a correct answer versus repeatedly getting the same expected results based on the same specific inputs.

Secure Your Spring Boot Apps Using Keycloak and OIDC

Aggregated on: 2025-09-09 14:14:51

In this blog, we will take a closer look at Spring Security, specifically in combination with Keycloak using OpenID Connect, all supported with examples and unit tests. Enjoy!
Introduction
Many applications rely on authentication and authorization. However, these are also topics that software developers find difficult to grasp. This blog introduces Spring Security, which is Spring's solution for adding security to your Spring applications. By means of examples and unit tests, you will learn the annotations and Spring classes. The end goal is to set up an application using OpenID Connect in combination with Keycloak. An introduction to OpenID Connect and Keycloak can be found in a previous blog. It is advised to read that blog if you are not yet familiar with the concepts.

Is Your AI a Psychopath?

Aggregated on: 2025-09-09 13:14:51

The "Whack-A-Mole" Problem: Why We're Losing the AI Bug War
You just added the newest LLM to your main product. The demos were excellent, but now the support tickets are coming in fast. The customer service chatbot's AI is giving strangely passive-aggressive answers. You spend a whole day coming up with a clever meta-prompt to make it more "friendly." Yes! But a week later, you find out that it's now "helpfully" making up product features that don't exist, which confuses many users. You just started the "whack-a-mole" game for fixing AI bugs. Given the structure of the game, the outcome is inevitable.

Cloud Automation Excellence: Terraform, Ansible, and Nomad for Enterprise Architecture

Aggregated on: 2025-09-09 12:14:51

Enterprise cloud architecture demands sophisticated orchestration of infrastructure, configuration, and workload management across diverse computing platforms. The traditional approach of manual provisioning and siloed tool adoption has become a bottleneck for organizations seeking cloud-native agility while maintaining operational excellence. This article explores the strategic integration of three complementary automation technologies: Terraform for infrastructure provisioning, Ansible for configuration management, and HashiCorp Nomad, which serves as a lightweight workload orchestrator, managing application deployment, scaling, and scheduling across diverse infrastructure environments with minimal operational overhead. Unlike monolithic solutions, this ecosystem approach leverages specialized tools that excel in their respective domains while maintaining platform-agnostic capabilities across AWS, Azure, Google Cloud, IBM Cloud, and hybrid environments.

Toward Explainable AI (Part 8): Bridging Theory and Practice—SHAP: Powerful, But Can We Trust It?

Aggregated on: 2025-09-09 11:14:51

Series reminder: This series explores how explainability in AI helps build trust, ensure accountability, and align with real-world needs, from foundational principles to practical use cases. Previously, in Part VII: SHAP: Bringing Clarity to Financial Decision-Making.

The Value Gap After Go-Live: The Agile Advantage in Tech Transformation

Aggregated on: 2025-09-08 19:14:51

Robust change management practice is pivotal for the success and survival of enterprises in the ever-changing technology landscape and technology-driven business world. However, the success of change management practices in modern enterprises is not limited to the successful deployment of technological solutions. Technology implementation is only the first step in building the foundation of the broader change management lifecycle, which encompasses adoption, integration, and the realization of sustained value. Despite investing millions in modernization initiatives, large organizations often experience transformation fatigue, stalled adoption, and disillusioned teams.

How to Do Image Recognition With CNNs on the COCO Dataset — a Practical, Step-By-Step Guide

Aggregated on: 2025-09-08 18:29:51

Short summary: This guide walks you from environment setup to a working PyTorch example that trains a Convolutional Neural Network (a pretrained ResNet) to recognize which object categories are present in a COCO image (multi-label image recognition). You’ll learn how to load COCO annotations, build a multi-label dataset, train with BCEWithLogitsLoss, evaluate average precision, and run inference. Run the code snippets on a machine with Python and PyTorch installed.
Why COCO and What This Tutorial Does
The MS-COCO dataset is a large-scale dataset for object detection, segmentation, and captioning; it contains hundreds of thousands of images and 80 common object categories (people, cars, cups, etc.). It’s a standard benchmark for object-level tasks, and we’ll reuse its annotations to turn detection-style labels into a multi-label image recognition task: for each image, predict which of the 80 categories appear. This is a practical way to use a CNN backbone (ResNet) and practice multi-label learning on a real dataset.

Replacing LEADTOOLS Scanner With AWS Textract (Step-by-Step Migration)

Aggregated on: 2025-09-08 17:29:51

Replacing the LEADTOOLS Scanner with the AWS Textract service is a strategic move if you’re aiming to leverage cloud-native, scalable, AI-powered document processing. By moving from custom or third-party tools to AWS, you can take full advantage of its equivalent managed services. This end-to-end migration offers numerous benefits, including improved efficiency, cost-effectiveness, and advanced capabilities.

Stop Your GenAI From Burning Cash in Production

Aggregated on: 2025-09-08 16:29:51

Every developer who's deployed GenAI to production knows this moment. The feature works great. Users love it. Then the cloud bill arrives. Your harmless chatbot just cost more than your entire infrastructure. That RAG pipeline you built? It's eating tokens like there's no tomorrow. Welcome to the reality of production GenAI, where every API call has a price tag.

Building an AI-Powered Insurance Q and A Assistant With RAG and Snowflake Cortex

Aggregated on: 2025-09-08 15:29:56

In the insurance industry, vast amounts of data are stored in documents such as policies, claim details, and FAQs. Answering customers' queries quickly and accurately is crucial for satisfaction and efficiency. The objective of this project is to develop an AI-powered Q&A assistant using Retrieval-Augmented Generation (RAG) and Snowflake Cortex Search. RAG integrates large language models with external information retrieval. When a user asks a question, the system retrieves candidate documents from a knowledge base. These documents serve as context for the LLM, which uses them to generate an accurate and informative response. This project demonstrates how to build a robust and effective insurance Q&A assistant by combining the strengths of RAG with Snowflake Cortex Search. Using Snowflake's semantic search capability, we can quickly retrieve contextually relevant information from an insurance document knowledge base. The retrieved context is then used as input to a Large Language Model (LLM) to generate accurate and informative answers to user queries.
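The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration only: it ranks documents by plain keyword overlap rather than semantic search, it does not call Snowflake Cortex Search or any LLM, and the function names and sample documents are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=2):
    """Rank documents by keyword overlap with the question (toy retrieval)."""
    question_tokens = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_tokens & tokenize(doc)),
        reverse=True,
    )
    # Keep only documents that share at least one token with the question.
    return [doc for doc in ranked[:top_k] if question_tokens & tokenize(doc)]


def build_prompt(question, context_docs):
    """Assemble the retrieved context and the question into one LLM prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical knowledge base entries.
knowledge_base = [
    "Windshield claims are covered under comprehensive auto policies.",
    "Home policies exclude flood damage unless a rider is added.",
    "Claims must be filed within 30 days of the incident.",
]

question = "Is flood damage covered by my home policy?"
prompt = build_prompt(question, retrieve(question, knowledge_base))
print(prompt)
```

In a real system the `retrieve` step would be replaced by a semantic search service and the prompt would be sent to a model; the shape of the pipeline stays the same.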

Tuples and Records (Part 2): JavaScript Migration Guide

Aggregated on: 2025-09-08 14:14:51

In Part 1 of this series, we explored JavaScript’s Tuples and Records, two immutable data structures designed to improve performance, predictability, and developer experience. We covered their purpose, syntax, key benefits, and where they fit best in modern applications. Now, in Part 2, we’ll shift the focus to migration strategies. Transitioning from traditional objects and arrays to Tuples and Records isn’t just a syntax change. It requires careful planning to avoid unintended side effects and maximize performance gains. In this article, we’ll walk through step-by-step guidance on identifying suitable use cases, incrementally refactoring, ensuring library compatibility, and benchmarking results, so you can confidently adopt these features in production code.

Slimming Down Docker Images: Base Image Choices and The Power of Multi-Stage Builds

Aggregated on: 2025-09-08 13:14:51

Introduction
Let's talk about an uncomfortable truth: most of us are shipping Docker images that are embarrassingly large. If you're deploying ML models, there's a good chance your containers are over 2GB. Mine were pushing 3GB until recently. The thing is, we know better. We've all read the best practices. But when you're trying to get a model into production, it's tempting to just FROM pytorch/pytorch and call it a day. This article walks through the practical reality of optimizing Docker images, including the trade-offs nobody mentions.

API Design First: AsyncAPI in .Net

Aggregated on: 2025-09-08 12:14:51

In modern distributed systems, event-driven architectures have become mainstream. While RESTful APIs have well-established design-first practices with OpenAPI/Swagger, event-driven architectures often lack similar standardization. For any team building event-driven systems (in general) with Kafka, the initial promise of decoupling and resilience can quickly be overshadowed by chaos. Without a contract, producers and consumers drift apart, leading to runtime errors, documentation nightmares, and endless debates over topic names and message schemas. AsyncAPI aims to solve these challenges, but its current tooling ecosystem has gaps, particularly for .NET developers working with Kafka, Schema Registry, and Infrastructure as Code practices. This article shares an opinionated approach to bridging this gap by creating a custom AsyncAPI template that generates production-ready Kafka clients in C#.

Getting Started With ClickHouse for AI/ML in Python

Aggregated on: 2025-09-08 11:14:51

As artificial intelligence (AI) and machine learning (ML) workloads grow in complexity and volume, traditional databases often struggle to meet the performance needs of large-scale, real-time analytics. ClickHouse, a high-performance, column-oriented OLAP (Online Analytical Processing) database designed to handle petabyte-scale data with lightning-fast query execution, offers a compelling solution for data engineers and ML practitioners alike. Its unique columnar storage, vectorized execution, and support for distributed deployments make it highly suitable for processing massive datasets generated by IoT devices, web platforms, and large-scale enterprise applications. By enabling both high-speed querying and efficient storage, ClickHouse allows organizations to analyze and act on data in near real-time, which is critical for dynamic AI/ML pipelines and feature engineering tasks. In this article, we’ll explore how to get started with ClickHouse and Python for building fast, scalable AI/ML pipelines.

Building a Platform Abstraction for AWS Networks Using Crossplane

Aggregated on: 2025-09-05 20:14:49

Crossplane helps platform engineers develop abstractions for developers. It is an open-source, multicloud control plane that handles interactions with cloud providers’ APIs for you. In this post, I’ll show how developers can create an AWS network (VPC, Subnet, etc.) with just a single YAML request to the Kubernetes API.

How to Run Selenium Tests on Selenium Grid 4 With Jenkins and Docker Compose

Aggregated on: 2025-09-05 19:44:49

Selenium WebDriver, Selenium Grid 4, Jenkins, and Docker Compose are popular, well-known tools. Combined, they form a powerful stack for web automation testing: they let us set up on-demand local infrastructure and spin up the environment as needed to run web automation tests at scale. Consider a scenario where we need to run multiple web automation tests on different browsers to verify the functionality and stability of the web application. Combining Selenium Grid 4 with Docker Compose can help set up browsers with a single command, allowing us to perform the required test execution smoothly with Jenkins jobs.

Meta Prompting for Agile Practitioners

Aggregated on: 2025-09-05 18:14:49

TL;DR: Meta Prompting
We’ve all been there: You’re preparing for the next Retrospective, and you turn to ChatGPT for help. “Give me some Retrospective ideas,” you type. What do you get back? Generic templates you’ve seen a hundred times before: Set the Stage, Gather Data, Generate Insights, Decide What to Do, and Close the Retrospective. (Kudos to Esther Derby and Diana Larsen for the format!) The problem isn’t the AI.

Making String Search Easier Across Databases

Aggregated on: 2025-09-05 17:29:49

Searching for information in applications is rarely as simple as matching an exact string. Users don’t always remember the full text; instead, they rely on fragments. When buying a product online, for instance, they might type only the brand (“Samsung”) or only the model (“Galaxy S24”), but rarely both together. In financial systems, the same happens when looking up a transaction by just part of the description. This type of partial search has become crucial for modern systems. In e-commerce, it drives product discovery. In finance, it helps locate records quickly. And in countless other domains, it shapes how people interact with data. To meet this demand, databases have evolved to provide capabilities that go beyond strict equality, allowing queries that can check whether text contains, starts with, or ends with a given fragment.
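The three kinds of partial match mentioned above (contains, starts with, ends with) can be sketched with SQL `LIKE` patterns. The example below is a minimal illustration using Python's built-in `sqlite3` module; the `products` table, its rows, and the `search` and `escape_like` helpers are hypothetical, and other databases may offer different or richer operators.

```python
import sqlite3

# Hypothetical in-memory catalog; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT)")
conn.executemany(
    "INSERT INTO products (name) VALUES (?)",
    [("Samsung Galaxy S24",), ("Samsung Galaxy Tab",), ("Apple iPhone 15",)],
)


def escape_like(fragment: str) -> str:
    """Escape LIKE wildcards so user input is matched literally."""
    return fragment.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def search(fragment: str, mode: str) -> list[str]:
    """Return product names that contain, start with, or end with the fragment."""
    safe = escape_like(fragment)
    patterns = {
        "contains": f"%{safe}%",
        "starts_with": f"{safe}%",
        "ends_with": f"%{safe}",
    }
    rows = conn.execute(
        "SELECT name FROM products WHERE name LIKE ? ESCAPE '\\' ORDER BY name",
        (patterns[mode],),
    )
    return [name for (name,) in rows]


print(search("Galaxy", "contains"))
print(search("Samsung", "starts_with"))
print(search("S24", "ends_with"))
```

Passing the pattern as a bound parameter and escaping `%` and `_` keeps user-typed fragments from being interpreted as wildcards.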

Measuring What Matters: A Strategic Lens on Transformation Metrics

Aggregated on: 2025-09-05 16:14:49

"Only 16% of digital transformations improve performance and sustain gains in the long term." — McKinsey, 2021 Transformation efforts often falter not for lack of ambition but for lack of clarity. Metrics—when used well—serve as navigational tools that align teams, validate progress, and reveal true impact. When misused, they become noise, breeding vanity and confusion.

The Role of Data Governance in Data Strategy: Part 4

Aggregated on: 2025-09-05 15:29:49

In the previous articles of this series, we explored the importance of data governance in managing enterprise data effectively (Part 1), how BigID supports data governance, particularly for data privacy, security, and classification (Part 2), and the role of Data Subject Access Rights (DSAR) in protecting individual privacy (Part 3). Together, these concepts emphasized the importance of visibility, control, and accountability in modern data governance. In this fourth installment, we shift focus to another critical pillar of data governance: Data Retention. While organizations often think of retention simply as storing data for later use, the reality is far more complex. Done right, data retention ensures compliance, cost efficiency, and stronger security. Done poorly, it creates unnecessary risks, ranging from legal exposure and privacy violations to spiraling storage costs and an expanded cybersecurity attack surface.

Change Data Capture for Apache Phoenix Stream

Aggregated on: 2025-09-05 14:29:49

Apache Phoenix is an open-source SQL skin over Apache HBase that enables lightning-fast OLTP (Online Transactional Processing) operations on petabytes of data using standard SQL queries. Phoenix helps combine the scalability of NoSQL with the familiarity and power of SQL, giving the best of both worlds. Apache Phoenix provides Change Data Capture (CDC) with PHOENIX-7001. The CDC design in Phoenix leverages the write-optimized Uncovered Index as well as the Max Lookback feature. The changes are captured as time-ordered events of row-level modifications.

Build a RAG Application With LangChain and Local LLMs Powered by Ollama

Aggregated on: 2025-09-05 13:14:49

Local large language models (LLMs) provide significant advantages for developers and organizations. Key benefits include enhanced data privacy, as sensitive information remains entirely within your own infrastructure, and offline functionality, enabling uninterrupted work even without internet access. While cloud-based LLM services are convenient, running models locally gives you full control over model behavior, performance tuning, and potential cost savings. This makes them ideal for experimentation before running production workloads. The ecosystem for local LLMs has matured significantly, with several excellent options available, such as Ollama, Foundry Local, Docker Model Runner, and more. Most popular AI/agent frameworks, including LangChain and LangGraph, provide integration with these local model runners, making it easier to integrate them into your projects.

DevOps as a Platform: How to Help Developers Ship Faster Without the Chaos

Aggregated on: 2025-09-05 12:29:49

Imagine you're an engineer trying to ship a new feature. You need a pipeline to build, deploy, and test your code. You need infrastructure to run it. You need to check permissions, secrets, and compliance boxes. If your company doesn’t have a standardized DevOps setup, you’re probably setting all that up yourself — or copying it from the last project and hoping for the best. Now multiply that by 50 teams. Welcome to DevOps chaos.

Toward Explainable AI (Part 7): Bridging Theory and Practice—SHAP: Bringing Clarity to Financial Decision-Making

Aggregated on: 2025-09-05 11:14:49

Series reminder: This series explores how explainability in AI helps build trust, ensure accountability, and align with real-world needs, from foundational principles to practical use cases. Previously, in Part VI: What LIME Shows, and What It Leaves Out, Strengths and limits of local explanations.

How to Use AI to Enhance Scrum Ceremonies

Aggregated on: 2025-09-04 19:29:49

Among the Agile methodologies, Scrum is the main tool for software development that advances openness, adaptability, and ongoing learning. Scrum ceremonies include the sprint planning, daily stand-up, sprint review, and sprint retrospective. These ceremonies are structured events that drive collaboration, alignment, and delivery. Per Gartner’s report, among the 80% of organizations performing agile development, 87% use Scrum, which makes it the most popular implementation. Gartner defines artificial intelligence as applying advanced analysis and logic-based techniques, including ML, to interpret events, support and automate decisions, and take actions. As per another Gartner report, forecast assumptions are:

Protecting PII in LLM Applications: A Complete Guide to Data Anonymization

Aggregated on: 2025-09-04 18:14:49

Organizations want to leverage the power of LLMs like GPT or PaLM to solve business problems, but they're rightfully hesitant about sending sensitive data—especially Personally Identifiable Information (PII)—over the internet to third-party hosted models. This article explores a powerful mitigation technique using anonymization and de-anonymization to protect sensitive data while still enabling effective LLM usage in enterprise environments.
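The anonymize-then-restore idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's implementation: the two regular expressions cover only simple email addresses and one phone format, and the `anonymize` and `deanonymize` helpers and the placeholder scheme are hypothetical. Real PII detection needs much broader coverage.

```python
import re

# Illustrative patterns only; real PII detection needs far broader coverage.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def anonymize(text):
    """Replace each PII match with a placeholder and return the mapping."""
    mapping = {}
    for label, pattern in PII_PATTERNS.items():
        def substitute(match, label=label):
            value = match.group(0)
            # Reuse the placeholder if the same value appears again.
            for placeholder, original in mapping.items():
                if original == value:
                    return placeholder
            placeholder = f"<{label}_{len(mapping) + 1}>"
            mapping[placeholder] = value
            return placeholder
        text = pattern.sub(substitute, text)
    return text, mapping


def deanonymize(text, mapping):
    """Restore the original values in text that contains placeholders."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


prompt = "Contact Jane at jane.doe@example.com or 555-123-4567."
masked, mapping = anonymize(prompt)
print(masked)  # only the masked text would be sent to the hosted model

# A model reply that echoes a placeholder can be restored locally.
reply = "I will email <EMAIL_1> tomorrow."
print(deanonymize(reply, mapping))
```

The mapping never leaves the local environment, so the hosted model only ever sees placeholders while the caller can still restore the original values in the response.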

CI/CD in the Age of Supply Chain Attacks: How to Secure Every Commit

Aggregated on: 2025-09-04 17:14:49

The digital infrastructure we've built resembles a house of cards. One compromised dependency, one malicious commit, one overlooked vulnerability and the entire edifice comes tumbling down. In March 2024, security researchers discovered something terrifying: a backdoor lurking within XZ Utils, a compression library so ubiquitous it had infiltrated thousands of Linux distributions worldwide. The attack vector? A meticulously orchestrated supply chain compromise that turned the very foundation of open-source development against itself. This wasn't an anomaly. It was a wake-up call.

Building AI Agents? 5 Critical Questions to Ask Before You Automate

Aggregated on: 2025-09-04 16:29:49

Agentic AI is a game changer and a hot topic in nearly every boardroom conversation today. There’s no doubt that companies that do not think AI-first risk becoming irrelevant. But before you rush to build or deploy an AI agent, it’s important to pause and ask some tough questions. Not every problem requires an AI agent, and without the right foundation, especially around data, agentic AI can quickly become a costly and risky mistake. Rushing headlong into AI adoption without thoughtful planning often leads to wasted resources and missed opportunities. Here are five critical questions every organization should ask before automating with agentic AI.

The Endless Cycle of Manual K8s Cost Optimization Is Costing Organizations More Than They Realize

Aggregated on: 2025-09-04 15:14:49

Developers and DevOps teams working in Kubernetes tend to focus primarily on performance and pay little attention to the cost side of things. When workloads are running smoothly and meeting SLAs, budget considerations often take a backseat until some external force (normally in the form of the finance team) demands that costs be optimized. However, the reality is that ignoring cost until finance steps in only leads to inefficiency and wasted resources, and eventually, quite a lot of work on cost optimization. This cycle, where costs are forgotten and then aggressively optimized under pressure, drains considerable time and energy that could be better spent on other strategic initiatives.

Take AI Out of Your Silos—Why Team Experience Trumps Developer Tools

Aggregated on: 2025-09-04 14:14:49

You wouldn't try to design a product by focusing on one piece at a time. The best products in the world solve problems by reimagining, not by incremental improvements. So why, with AI as the largest technical innovation in recent history, are so many products focusing on improving each step individually, when we complete work in teams? You may be saying “What are you talking about? AI is redefining industries right now!” I hope to convince you that many AI implementations are myopic, local optimizations.

Active Learning and Human-in-the-Loop for NLP Annotation and Model Improvement

Aggregated on: 2025-09-04 13:29:48

Natural language processing (NLP) models depend heavily on data, but obtaining high-quality labeled data at scale is one of the biggest hurdles. It quickly becomes clear that throwing more raw data at an NLP problem doesn't really help much; it’s the labeled data that really drives improvement. This is where active learning and a human-in-the-loop approach become invaluable. They help us prioritize which data to label, involve human expertise at critical points, and continuously improve models in production. In this article, we’ll talk about what active learning is, how to implement a human-in-the-loop workflow for NLP annotation, and why this approach accelerates model improvement.

View more...

Engineering for Uptime: Observability, Testing, and the Road to Rock-Solid Back-End Services

Aggregated on: 2025-09-04 12:14:49

Background

A single mobile tap can trigger a number of events behind the scenes (API calls to microservices, messages/events sent through queues, writes to databases, and retries on transient failures), all before it returns with a success or an error toast. The user doesn’t see this complexity. They don’t know about your autoscaling policy, cache hit ratios, or dependency graphs. They only know whether their ride was hailed, their payment went through, or their food order was confirmed. And when things go wrong, it’s that hidden complexity that determines how gracefully your system recovers. That’s why reliability can’t just be the SRE team’s job anymore. It’s a shared responsibility, one that should be embedded in the day-to-day decisions of every back-end engineer. From the way we design systems to how we write alerts, ship code, and handle incidents, reliability is engineered, not wished into existence.

View more...

CI/CD Is Not Enough: Stop Missing Test Failures With Intelligent Notifications

Aggregated on: 2025-09-04 11:14:48

The Visibility Gap in Enterprise Testing

Modern test automation has matured. CI/CD pipelines are well-orchestrated, test coverage is high, and nightly regressions run like clockwork. But even with all this structure, one subtle problem persists: failures are not noticed fast enough, or by the right people. Here’s how this usually plays out:

View more...

Developing a Nationwide Real-Time Telemetry Analytics Platform Using Google Cloud Platform and Apache Airflow

Aggregated on: 2025-09-03 20:29:48

During my tenure at TELUS, I was assigned a prominent project requiring substantial technical expertise: the development of a telemetry analytics platform that could analyze data in real time from over 100,000 set-top boxes (STBs) deployed throughout Canada. The objective was not just scale; the platform was meant to help teams make quicker operational decisions and enhance the experience for millions of customers. Early on, I recognized the outdated data infrastructure as a bottleneck that kept data from reaching the teams who needed it most. This article describes the methods we used to modernize our infrastructure with Google Cloud Platform (GCP), Apache Airflow, and Infrastructure-as-Code tools to overcome these obstacles and deliver a future-proof solution.

The Predicament: Ancient Bottlenecks and Unseen Black Spots

Prior to this revamp, we predominantly relied on siloed, batch-oriented data pipelines incapable of supporting real-time diagnostics. Key concerns encompassed:

View more...

DSLs vs. Libraries: Evaluating Language Design in the GenAI Era

Aggregated on: 2025-09-03 19:29:48

Programming languages are the fundamental tools used to shape the digital world. Every developer has to choose at some point in their career between general-purpose languages such as Python, Java, and C# and specialized domain-specific languages like SQL, CSS, or XAML. But with the evolution of AI, the lines are getting blurred. We are observing shifts not only in how we write code but also in how we define productivity, maintainability, and innovation. As a result, the conventional trade-offs between DSLs and libraries are changing, and long-standing issues like expressiveness, integration complexity, and learning curves are being approached from new perspectives.

The Traditional DSL vs Library Paradigm

General-Purpose Languages (GPLs) are very versatile. They are packed with extensive libraries that allow developers to tackle problems across multiple domains. But this flexibility comes at the cost of writing more code and the need for significant domain knowledge to implement specialized solutions effectively.

View more...

Observability for the Invisible: Tracing Message Drops in Kafka Pipelines

Aggregated on: 2025-09-03 18:14:48

When an event drops silently in a distributed system, it is not a bug; it is an architectural blind spot. In high-scale messaging platforms, particularly those serving real-time APIs like WhatsApp Business or IoT command chains, telemetry failures are often mistaken for application errors. But the root cause lies deeper: observability gaps in event streams. This article explores how backend engineers and DevOps teams can detect, debug, and prevent message loss in Kafka-based streaming pipelines using tools like OpenTelemetry, Fluent Bit, Jaeger, and dead-letter queues. If your distributed messaging system handles millions of events, this guide outlines how to make those events accountable.

View more...

Simple Efficient Spring/Kafka Datastreams

Aggregated on: 2025-09-03 17:14:48

I had the opportunity to work with Spring Cloud Data Flow streams and batches. The streams work in production and perform well. The main streams used Debezium to send the database deltas to SOAP endpoints or provided SOAP endpoints to write into the database. The events were sent via Kafka. Spring Cloud Data Flow also provides an application to manage the streams and jobs. The streams are built with a data source and a data sink that are separate applications and are decoupled by the events sent via Kafka. Stream 1 has a Debezium source and sends the database deltas via Kafka to the sink, which transforms the event into a SOAP request to the application. Stream 2 receives a SOAP request from the application and sends an event to Kafka. The sink receives the event and creates the database entries for the event.

View more...

Understanding Zero-Copy

Aggregated on: 2025-09-03 16:14:48

In the realm of high-performance computing and network applications, efficient data handling is important. Traditional Input/Output (I/O) operations often involve redundant data copies, creating performance bottlenecks that can limit throughput and increase latency. Zero-copy is a powerful optimization technique that minimizes or eliminates these unnecessary data movements, leading to significant performance gains.

Traditional Input/Output Path

Consider a common scenario: an application needs to read a file from disk and transmit it over a network. In a traditional I/O model, this seemingly straightforward operation entails a series of data copies:

View more...

Understanding Apache Spark Join Types

Aggregated on: 2025-09-03 15:29:48

In this article, we are going to discuss three essential joins of Apache Spark. Joining data frames or tables is one of the most common data transformation operations in Apache Spark. With Apache Spark, a developer can use joins to merge two or more data frames according to specific (sortable) keys. The syntax for writing a join is straightforward, but the inner workings are often obscured. Internally, Spark has several join algorithms available and selects one of them. A basic join operation can become costly if you do not know what these core algorithms are or which one Spark uses.
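
The teaser does not name the algorithms, so as an assumption the sketch below contrasts two classic strategies, a hash join and a sort-merge join, in plain Python. It is an illustration of why the choice of algorithm affects cost, not Spark's implementation; the function names and sample rows are hypothetical.

```python
def hash_join(left, right):
    """Inner join: build a hash table on one side, then probe it with the other."""
    table = {}
    for key, value in right:
        table.setdefault(key, []).append(value)
    return [(key, lv, rv) for key, lv in left for rv in table.get(key, [])]


def sort_merge_join(left, right):
    """Inner join: sort both sides by key, then merge them with two cursors."""
    left = sorted(left, key=lambda row: row[0])
    right = sorted(right, key=lambda row: row[0])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Emit every pairing of the two runs, then skip past them.
            for _, lv in left[i:i_end]:
                for _, rv in right[j:j_end]:
                    out.append((lk, lv, rv))
            i, j = i_end, j_end
    return out


# Hypothetical rows: (customer_id, value)
orders = [(1, "book"), (2, "pen"), (2, "ink"), (4, "lamp")]
customers = [(2, "Ana"), (1, "Bo"), (3, "Cy")]

print(hash_join(orders, customers))
print(sort_merge_join(orders, customers))
```

Both functions return the same rows. The hash join needs one side to fit in a lookup table, while the sort-merge join needs sortable keys and pays for sorting both inputs, which is the kind of trade-off a join planner has to weigh.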

View more...

Container Security Essentials: From Images to Runtime Protection

Aggregated on: 2025-09-03 14:29:48

Container security is all about making sure the images you run contain as few vulnerabilities and as little malware as possible. I would love to say zero vulnerabilities, but that is rarely possible in the real world. At a minimum, you want to address critical through medium vulnerabilities so you can get a good night's sleep and avoid potential compromise by bad actors. You could also think of container security like peeling an onion, where each layer adds resilience against potential threats. In this article, we will look at the steps we can take to increase the overall safety of the container infrastructure.

View more...