News Aggregator


How to Gracefully Deal With Contention

Aggregated on: 2025-11-28 20:11:14

The Problem Statement

When multiple clients, processes, or threads compete simultaneously for a limited number of resources, degrading turnaround time and performance, the system enters a state called contention. This is one of the most common problems in systems that handle high traffic volumes. Without graceful handling, contention leads to race conditions and inconsistent state.

Example Scenario

Consider buying flight tickets online. There is only one seat available on the flight. Alice and Bob both want this seat and click "Book Now" at exactly the same time.
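The scenario above can be sketched in a few lines. This is a minimal illustration, not the article's own solution: the `Flight` class, its `book` method, and the `simulate` helper are hypothetical names, and a single in-process lock stands in for whatever mechanism (database row lock, optimistic versioning, a queue) a real booking system would use.

```python
import threading


class Flight:
    """Hypothetical in-memory flight with a fixed number of seats."""

    def __init__(self, seats: int) -> None:
        self._seats = seats
        self._lock = threading.Lock()
        self.passengers: list[str] = []

    def book(self, name: str) -> bool:
        # The check and the decrement run as one critical section, so two
        # callers can never both observe the last free seat.
        with self._lock:
            if self._seats == 0:
                return False
            self._seats -= 1
            self.passengers.append(name)
            return True


def simulate() -> dict[str, bool]:
    """Alice and Bob try to book the single remaining seat at the same time."""
    flight = Flight(seats=1)
    results: dict[str, bool] = {}
    barrier = threading.Barrier(2)  # release both threads together

    def attempt(name: str) -> None:
        barrier.wait()
        results[name] = flight.book(name)

    threads = [threading.Thread(target=attempt, args=(n,)) for n in ("Alice", "Bob")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `simulate()` returns a mapping in which exactly one of the two names is `True`. Which one wins depends on thread scheduling, but the seat is never sold twice.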

The Illusion of Deep Learning: Why "Stacking Layers" Is No Longer Enough

Aggregated on: 2025-11-28 19:11:14

Have we reached the limit of what we can achieve with our current AI models? At the very heart of the race for parameters and power conducted by Big Tech players, a fundamental question emerges: Do our AIs truly understand the changing world, or are they simply reciting a frozen past? In the study shared by the Google Research team in their paper "Nested Learning: The Illusion of Deep Learning Architectures" (1), the finding is unequivocal. According to them, our large language models (LLMs) suffer from "anterograde amnesia syndrome." Like patient Henry Molaison, a famous clinical case (2), who was incapable of forming new memories after his operation, our models, once their training is complete, are frozen.

RAG Applications with Vertex AI

Aggregated on: 2025-11-28 18:11:14

Most organizations experimenting with generative AI face a common bottleneck: their LLMs can chat nicely, but they do not consistently know the company’s own data. A customer wants to know a policy clause, or an engineer asks a question about a system diagram, and the model makes something up or simply provides an ambiguous, incomplete response. This won’t work in industries such as healthcare, financial services, or insurance, where accuracy is critical. What we want is the creative power of LLMs combined with reliable knowledge of our organization’s data. Here, we will explore how Retrieval-Augmented Generation (RAG) provides that.

Is TOON the Next Lightweight Hero in Event Stream Processing With Apache Kafka?

Aggregated on: 2025-11-28 17:11:14

The data serialization format is a key factor in stream processing: it determines how efficiently data is sent over the wire and how it is stored, interpreted, and processed by a distributed system. It directly influences the speed, reliability, scalability, and maintainability of the entire pipeline. Choosing the right one can prevent expensive lock-in and help keep our streaming infrastructure stable as data volume and complexity grow. In a stream-processing platform where ingestion systems such as Apache Kafka and processing engines like Flink or Spark must handle millions of events per second with low latency, reducing CPU usage is important, and that depends on efficient data formats.

Next-Gen AI-Based QA: Why Data Integrity Matters More Than Ever

Aggregated on: 2025-11-28 16:11:14

Artificial intelligence has changed the way we work across different industries. From chatbots that quickly resolve customer issues to systems that detect equipment failures before they occur, automation is now a standard practice. As these smart systems become more independent, one question keeps emerging: how much can we trust the data behind them? Data integrity may not make the news often, but it supports every AI-driven process. When data is inconsistent, incomplete, or biased, even the best algorithms can fail. In an automated setup, those failures don’t just stay small; they grow, causing flawed predictions, distorted insights, or even unethical results. Bias, safety, disinformation, copyright, and alignment are big problems with AI, so robust data quality matters more than ever.

From Repetition to Reusability: How Maven Archetypes Save Time

Aggregated on: 2025-11-28 15:11:14

Within the discipline of software engineering, practitioners are frequently encumbered by the monotonous ritual of initializing identical project scaffolds — configuring dependencies, establishing directory hierarchies, and reproducing boilerplate code prior to engaging in substantive problem‑solving. Although indispensable, such preliminary tasks are inherently repetitive, susceptible to human error, and inimical to efficiency.  Maven, a cornerstone of the Java build ecosystem, furnishes an elegant mechanism to mitigate this redundancy through the construct of archetypes. An archetype functions as a canonical blueprint, enabling the instantaneous generation of standardized project structures aligned with organizational conventions. By engineering bespoke archetypes, development teams can institutionalize consistency, accelerate delivery, and reallocate intellectual effort toward innovation rather than procedural repetition.

Level Up Your API Design: 8 Principles for World-Class REST APIs

Aggregated on: 2025-11-28 14:11:14

You’ve probably built a “REST API” before. But what does “RESTful” truly mean? It’s not just about using JSON and HTTP. It’s a spectrum, best described by the Richardson Maturity Model (RMM).

Level 0 (The Swamp): Using HTTP as a transport system for remote procedure calls (RPC). Think of a single /api endpoint where all operations are POST requests.
Level 1 (Resources): Introducing the concept of resources. Instead of one endpoint, you have multiple URIs like /users and /orders.
Level 2 (HTTP Verbs): Using HTTP methods (GET, POST, PUT, DELETE) and status codes (2xx, 4xx) to operate on those resources. This is where most “REST” APIs live.
Level 3 (Hypermedia, or HATEOAS): The “holy grail” of REST. The API’s responses include links (hypermedia) that tell the client what they can do next. The client navigates your API by discovering these links, not by hard-coding URLs.

The eight principles I’m sharing today are a blend of my own production experience and the pragmatic wisdom from industry-leading guides like Zalando’s. These should help you move your APIs up this maturity ladder, creating designs that are more robust, scalable, and easier to use.
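To make Level 3 concrete, here is a rough sketch of a response body that advertises its own next steps. Everything in it is an assumption for illustration: the `order_resource` function, the `/orders/{id}` and `/orders/{id}/payment` paths, the `pending` status, and the `_links` key (borrowed from the HAL convention) do not come from the article.

```python
def order_resource(order_id: int, status: str) -> dict:
    """Build a hypothetical order representation with hypermedia links."""
    # Every representation links to itself.
    links = {"self": {"href": f"/orders/{order_id}", "method": "GET"}}

    # Only advertise actions that are valid in the current state, so the
    # client discovers what it may do instead of hard-coding URLs.
    if status == "pending":
        links["pay"] = {"href": f"/orders/{order_id}/payment", "method": "POST"}
        links["cancel"] = {"href": f"/orders/{order_id}", "method": "DELETE"}

    return {"id": order_id, "status": status, "_links": links}
```

A pending order exposes `pay` and `cancel` links, while an order in any other state exposes only `self`. The client decides what to offer the user by inspecting `_links` rather than by knowing the server's URL layout.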

Five Nonprofit & Charity APIs That Make Due Diligence Way Less Painful for Developers

Aggregated on: 2025-11-28 13:11:13

I learned this lesson the hard way. A few years back, I built a donation platform I thought was bulletproof. The design? Slick. Payments? Smooth. I figured, “Alright, I’ve nailed it.”

Running Istio in Production: Five Hard-Won Lessons From Cloud-Native Teams

Aggregated on: 2025-11-28 12:11:13

Istio has established itself as a popular, trusted, and powerful service mesh platform. It complements Kubernetes with features such as security, observability, and traffic management, with no code changes. Several of Istio’s key features strengthen cloud-native and distributed systems, ensuring consistency, security, and resilience across diverse environments. Istio has also recently graduated under the Cloud Native Computing Foundation (CNCF), joining other graduated projects like Kubernetes. In this article, we will cover Istio's best practices for building a production-grade service mesh layer that offers secure, resilient, and durable performance.

Building a Simple MCP Server and Client: An In-Memory Database

Aggregated on: 2025-11-27 20:11:13

If you've been diving into the world of AI-assisted programming or tool-calling protocols, you might have come across Model Context Protocol (MCP). MCP is an open-source standard for connecting AI applications to external systems. It is a lightweight framework that lets you expose functions as "tools" to language models, enabling seamless interaction between AI agents and your code. Think of it as a bridge that turns your functions into callable endpoints for models. In this post, we’ll build a basic in-memory database server using MCP, with code samples to extend and learn from. We'll dissect the code step by step, and by the end, you'll have a working prototype. Plus, I'll ask you to extend it with update, delete, and drop functionalities. Let's turn your terminal into a mini SQL playground!

How To Restore a Deleted Branch In Azure DevOps

Aggregated on: 2025-11-27 19:11:13

Human error is one of the most common causes of data loss and breaches. An ITIC report states that 64% of downtime incidents have their roots in human error. If you think that all your data is safe in SaaS environments, think again. All SaaS providers, including Microsoft, follow the shared responsibility model, which states that the service provider is responsible for the accessibility of its infrastructure and services, while the user is responsible for the availability of their data, including backup and disaster recovery.

Mastering Fluent Bit: Controlling Logs with Fluent Bit on Kubernetes (update to Part 4)

Aggregated on: 2025-11-27 18:11:13

NOTE: This is a special update to the original Controlling Logs with Fluent Bit on Kubernetes (Part 4) article published previously. The issue requiring this update arose over the weekend when I discovered that Broadcom, which acquired VMware, the custodian of the Bitnami catalog, did something not so nice to all of us.

Solving Real-Time Event Correlation in Distributed Systems

Aggregated on: 2025-11-27 17:11:13

Modern digital platforms operate as distributed ecosystems — microservices emitting events, APIs exchanging data, and asynchronous communication becoming the norm. In such environments, correlating events across multiple sources in real time becomes a critical requirement. Think of payments, orders, customer metadata, IoT sensors, logistics tracking — all flowing continuously.

Run LLMs Locally Using Ollama

Aggregated on: 2025-11-27 16:56:13

Over the past few months, I’ve increasingly shifted my LLM experimentation from cloud APIs to running models directly on my laptop. The reason is simple: local inference has matured to the point where it’s fast, private, offline-friendly, and surprisingly easy to set up. Tools like Ollama have lowered the barrier dramatically. Instead of wrestling with GPU drivers, manually downloading weights, or wiring up custom runtimes, you get a single lightweight tool that can run models such as Llama 3.1, Mistral, Phi-3, DeepSeek R1, Gemma, and many others, all with minimal configuration.

AWS Airflow vs Step Functions: The Data Engineering Orchestration Dilemma

Aggregated on: 2025-11-27 16:11:13

There's a moment in every data engineering project when you realize your growing collection of batch jobs, data transformations, and scheduled tasks needs proper orchestration. You've probably duct-taped together some Lambda functions with CloudWatch Events, maybe written a few shell scripts with cron jobs, and now you're looking at AWS, wondering: should I go with Managed Airflow (MWAA) or Step Functions? I've seen teams make both choices, and here's the truth: neither is universally "better." The right answer depends on what you're actually building, who's maintaining it, and how your data engineering team thinks about workflows.

Automating FastAPI Deployments With a GitHub Actions Pipeline

Aggregated on: 2025-11-27 15:56:13

Deploying FastAPI apps manually gets old fast. You SSH into a server, pull the latest code, restart the service, and hope nothing breaks. Maybe you remember to run tests first. Maybe you don't. One forgotten environment variable or skipped test, and your API is down. Users get 500 errors. You're frantically SSHing back in to fix it.

Optimizing Trino Performance With Materialized Views in a Data Lake

Aggregated on: 2025-11-27 15:11:13

In this article, I share how we improved the performance of our Trino-based data lake by using materialized views. Our service evolved from a dual-storage system built on HBase and Elasticsearch to a simplified, cost-efficient data lake architecture powered by Iceberg, Spark Streaming, and Trino. The transition brought significant advantages but also unexpected performance challenges that we solved through careful use of Trino’s materialized views.

Business Description

Our service receives data from a Kafka source on three different topics and inserts it into HBase and Elasticsearch. HBase was used for get-by-ID operations, while Elasticsearch handled GraphQL-style search queries. HBase is known for excellent insert performance and fast get-by-ID operations, and Elasticsearch provides powerful full-text search capabilities. Over time, however, we realized that we were not using most of Elasticsearch’s advanced search features. Maintaining both systems was costly, and the operational complexity of supporting two clusters, HBase and Elasticsearch, was high. We decided to migrate to a modern data lake architecture to improve scalability and cost efficiency.

The Fake "Multi" in Multi-Tenant: When SaaS Tenancy Models Backfire at Scale

Aggregated on: 2025-11-27 13:11:13

One SaaS, Many Users, One Big Lie

Your “multi-tenant” SaaS architecture is probably a single-tenant app with commitment issues. That sounds harsh until you look at the actual implementation. Customer A gets one deployment with hardcoded settings. Customer B gets the same codebase, but now wrapped in a flag-laden logic bomb. By the time you reach customer C, your team has a 60-page Confluence doc titled “How to onboard a new tenant without waking the VP of Engineering.”

How to Push Docker Images to AWS Elastic Container Repository Using GitHub Actions

Aggregated on: 2025-11-26 20:11:13

GitHub Actions supports CI/CD, short for continuous integration and continuous deployment, by letting workflows build, test, and deploy code from within the same GitHub repository. GitHub Actions can build images and push them to registries such as AWS ECR and Docker Hub. We can choose the OS platform, such as Windows or Linux, on which to run the workflows. In this article, we will demonstrate how to streamline the build and deploy process by pushing Docker images to AWS ECR, short for Elastic Container Registry, using GitHub Actions.

Top 5 Best Practices for Building Dockerized MCP Servers

Aggregated on: 2025-11-26 19:11:13

The Model Context Protocol (MCP) is changing how we build software. It provides the "API" for large language models (LLMs) to interact with the real world. This lets an AI agent query a database, read a file, or call a third-party service. This new capability brings new challenges. MCP servers, the back-end tools the AI uses, are not traditional microservices. Their user is a non-deterministic AI, and they often need access to sensitive systems.  How do we build, deploy, and secure these servers reliably? The clear answer is Docker. The entire MCP ecosystem, including Docker's own MCP Toolkit and Catalog, is built around containerization. Running your MCP servers in Docker is not just a good idea; it is a necessary best practice. This article covers five key principles for building production-ready, Dockerized MCP servers.

Overview Of Observability As Code

Aggregated on: 2025-11-26 18:11:13

Observability as Code

Observability as Code is a practice where monitoring, logging, alerting, and observability configurations are defined, managed, and deployed using code-based approaches rather than manual configuration through dashboards or UIs.

Core Concept

Typically, engineers manually set up monitoring alerts in a web console. With Observability as Code, however, engineers write code (typically YAML, JSON, or domain-specific languages) that declaratively defines:

Rethinking the Software Supply Chain for Agents

Aggregated on: 2025-11-26 17:56:13

A recent MIT study reported that only about 5% of GenAI applications are creating real, measurable business value. In my opinion, that’s not a failure of ambition. If anything, most teams are experimenting aggressively. The issue is that the underlying systems we use to deliver software haven’t adapted to what AI actually is. It has become incredibly easy to build a prototype or demo. A few prompt tweaks, an API call, and you can show something impressive. But turning that prototype into something you can trust in production is a different challenge. That part requires real engineering: reliability, consistency, versioning, monitoring, and guardrails. The problem is that the tools and workflows we’ve relied on for years were never designed to support systems that change their behavior over time.

Building a Local RAG App With a UI, No Vector DB Required

Aggregated on: 2025-11-26 17:11:13

Generative AI, LLMs, and RAG have been at the forefront of technological innovation and discussion. Retrieval-augmented generation (RAG) has emerged as a powerful pattern for building LLM applications that can reason over your data, reducing hallucinations and providing up-to-date, contextually relevant answers. Most RAG tutorials I have found involve a dedicated vector database like Pinecone, Weaviate, or Chroma. These are fantastic for production systems, but what should I use for local development, rapid prototyping, or smaller-scale applications? The overhead of setting up, managing, and paying for a database service is hard to justify when you just want to build something.

Building AI Agents With Semantic Kernel: A Practical 101 Guide

Aggregated on: 2025-11-26 16:56:13

AI agents are evolving beyond traditional chatbots, taking on complex problem-solving tasks that demand deep contextual understanding and intelligent reasoning. Developers today aim to build systems that can not only respond intelligently but also act autonomously — combining domain knowledge, business logic, and specialized tools to create decision-making agents tailored to specific problems. Achieving this requires a powerful orchestrator capable of coordinating models, tools, and workflows seamlessly.  Microsoft’s Semantic Kernel (SK) provides exactly that: a lightweight framework that bridges large language models (LLMs) with your own code, data, and APIs. In this article, we'll build a simple AI agent and explore the key components that make it work.

Scaling Identity Governance Without Connectors: The LDAP Directory IGA Integration Pattern

Aggregated on: 2025-11-26 16:11:13

In Identity Governance and Administration (IGA), connectors help keep user accounts, roles, and access permissions in sync across your applications.   What if you don’t deploy a connector? What about legacy and cloud applications that don’t support SCIM, or systems handled by third-party vendors that don’t allow inbound connections?

LLMOps Under the Hood: Docker Practices for Large Language Model Deployment

Aggregated on: 2025-11-26 15:56:13

Large language models (LLMs) are everywhere — powering chatbots, copilots, and AI-driven apps across industries. But if you’ve ever tried to run one outside of a managed service, you know the pain: gigabytes of model weights, conflicting Python dependencies, fragile CUDA versions, and a GPU setup that only seems to work on your machine. This is where Docker shines. By packaging the entire environment — code, libraries, and drivers — into a container, you can run an LLM anywhere, whether it’s your laptop, a cloud GPU node, or a Kubernetes cluster. Containers give you reproducibility, portability, and isolation: exactly what’s needed for the messy world of LLMOps.

Securing Converged AI-Blockchain Systems: Introducing the MAESTRO 7-Layer Framework

Aggregated on: 2025-11-26 12:11:13

Introduction

When an AI trading agent exploits a smart contract vulnerability, financial firms can lose millions in seconds. In 2024 alone, more than $1.42 billion vanished through smart contract exploits, with AI-enhanced systems showing particularly troubling weaknesses that traditional security frameworks simply cannot address. As blockchain and AI technologies converge, they create entirely new attack surfaces that existing methodologies like STRIDE and MITRE ATT&CK weren’t designed to handle. Through my experience securing enterprise systems processing trillions in assets, I developed the MAESTRO framework (Multi-Agent Environment, Security, Threat, Risk, and Outcome) as a practical, seven-layer approach specifically designed for AI-blockchain convergence.

Breaking the Chains of the GIL in Python 3.14

Aggregated on: 2025-11-25 20:26:12

For years, developers working in Python have wrestled with a strange paradox: great productivity and ecosystem breadth, but limited multicore throughput in many scenarios. The culprit? The Global Interpreter Lock (GIL). Put simply: in CPython, only one native thread may execute Python bytecode at a time. For IO-bound tasks, this is often fine, but for CPU-bound or highly concurrent workflows, this constraint has been a persistent bottleneck.  I have experienced this frustration many times - you design a multithreaded service, spin up 16 threads on a 32-core machine expecting massive throughput, and then watch in horror as CPU utilization flatlines at 100% (effectively one core). You are then forced to switch to multiprocessing, pay the heavy overhead of inter-process communication, or rewrite critical paths in Rust or C++. All this complexity just to get true parallelism.
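The pattern described above, CPU-bound work fanned out across threads, looks roughly like the sketch below. It is only an illustration under stated assumptions: `count_primes`, `parallel_count`, and `gil_enabled` are hypothetical helper names, and the GIL check relies on `sys._is_gil_enabled()`, which exists only on Python 3.13 and later, so the code falls back to assuming the GIL is on when the function is missing.

```python
import sys
from concurrent.futures import ThreadPoolExecutor


def gil_enabled() -> bool:
    """Report whether this interpreter is running with the GIL."""
    check = getattr(sys, "_is_gil_enabled", None)  # absent before Python 3.13
    return True if check is None else check()


def count_primes(limit: int) -> int:
    """CPU-bound work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def parallel_count(limits: list[int], workers: int = 4) -> list[int]:
    """Run count_primes for each limit on a thread pool, preserving order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))
```

The results are identical on any build. What changes is the wall-clock time: with the GIL enabled the threads take turns executing bytecode, whereas on a free-threaded build they can run on separate cores.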

Vector Databases in Action: Building a RAG Pipeline for Code Search and Documentation

Aggregated on: 2025-11-25 19:26:12

Imagine typing "authentication with JWT tokens" and instantly finding every relevant code snippet across your entire codebase, regardless of variable names or exact phrasing. That's the promise of vector databases combined with retrieval-augmented generation (RAG). After implementing this architecture across multiple production systems, I've learned that the real challenge isn't the theory; it's the practical decisions that make or break your implementation. Traditional keyword search fails spectacularly with code. A developer searching for "validate user input" won't find functions named sanitize_request_data() or check_payload_integrity(), even though they're semantically identical. Vector databases solve this by understanding meaning, not just matching strings. When combined with RAG, they transform how development teams interact with their codebases.
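The difference between matching strings and matching meaning can be shown with a toy example. The three-dimensional vectors below are made up by hand purely for illustration. A real pipeline would obtain embeddings from a model and store them in a vector database, and `render_invoice_pdf`, `INDEX`, `cosine`, and `search` are hypothetical names.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made stand-ins for embeddings: the first axis loosely means
# "input validation", the last one "document rendering".
INDEX = {
    "sanitize_request_data": [0.9, 0.1, 0.0],
    "check_payload_integrity": [0.8, 0.2, 0.1],
    "render_invoice_pdf": [0.0, 0.1, 0.9],
}


def search(query_vec: list[float], k: int = 2) -> list[str]:
    """Return the k function names whose vectors are closest to the query."""
    ranked = sorted(INDEX, key=lambda name: cosine(query_vec, INDEX[name]), reverse=True)
    return ranked[:k]


# Pretend embedding of the query "validate user input".
QUERY = [0.85, 0.15, 0.05]
```

`search(QUERY)` returns the two validation-related functions even though neither name shares a word with the query, which is the behavior keyword search cannot provide.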

Revamping Real-Time Data Ingestion for Scalable Media Intelligence

Aggregated on: 2025-11-25 18:26:12

In the era of 24/7 media and constant digital noise, the ability to process and act on real-time information is crucial. For any system designed to monitor, classify, and enhance media content, scalable ingestion pipelines are the backbone. This blog outlines a re-engineered real-time ingestion pipeline that successfully scaled to handle over 8 million articles per day, demonstrating a shift from traditional ETL models to AI-augmented streaming architectures.

The Problem Space: High-Velocity Media Streams

Media monitoring platforms must absorb diverse content formats from countless providers and categorize them in near real time. Traditional monolithic systems or batch ETL jobs fail to meet such latency and reliability demands.

Integrating Lakeflow Connect With PostgreSQL: A Developer’s Complete Hands-On Guide From the Field

Aggregated on: 2025-11-25 17:26:12

Modern data teams want reliable, incremental, near real-time ingestion from PostgreSQL into Databricks Unity Catalog without building costly and fragile CDC jobs, custom pipelines, or manual ETL orchestration. That’s where Lakeflow Connect solves the issue by providing developers with a unified, low-overhead ingestion framework that handles extraction, CDC, schema syncing, and table creation inside Unity Catalog automatically. This post walks through how I have set up Lakeflow Connect with PostgreSQL, including:

How to Test POST Requests With REST Assured Java for API Testing: Part I

Aggregated on: 2025-11-25 16:26:12

REST Assured is a popular API test automation framework in Java. Software teams widely use it for efficiently validating RESTful web services with minimal setup. It simplifies the process of sending requests, verifying responses, and handling JSON or XML payloads. With its rich syntax and integration support for tools like TestNG and Maven, REST Assured enables robust, maintainable, and scalable API testing.

Integrating Node.js Applications With MCP Servers

Aggregated on: 2025-11-25 15:26:12

Modern applications are rarely standalone. Most need to talk to databases, APIs, or even AI-driven agents that make real-time decisions. As this landscape grows, developers need a common way to connect apps, services, and data without constantly reinventing the wheel. That’s where MCP (Model Context Protocol) comes in. In this post, we will look at what MCP is, why it matters, and how you can integrate a simple Node.js service with an MCP server. We will also walk through a working example, i.e., a user info microservice, and see what the setup and output look like.

Dark Deployments and Feature Flags: 2025's DevOps Superpower

Aggregated on: 2025-11-25 14:26:12

Speed kills. That's what they used to say about reckless driving, but in 2025's cutthroat software world, it's become the unofficial motto of DevOps teams everywhere. The difference? Now it's uncontrolled speed that's doing the killing — along with careers, user trust, and quarterly revenue targets. We've all been there. That sinking feeling when your "quick hotfix" brings down production at 3 AM. The awkward silence in the war room as error rates spike and customer support tickets flood in. The post-mortem meetings where everyone tries to figure out how a one-line change managed to crater the entire checkout flow.

Effective Strategies for AWS Cost Optimization

Aggregated on: 2025-11-25 13:26:12

Amazon Web Services (AWS) provides a robust and flexible cloud platform that delivers significant cost optimization for its customers. It's crucial to manage and optimize costs effectively to maximize the value of your investment. This article provides proven tips and techniques for optimizing AWS costs, including monitoring usage, setting budgets, and leveraging cost-effective services. 

Taming Async Chaos: Architecture Patterns for Reliable Event-Driven Systems

Aggregated on: 2025-11-25 12:26:12

Why Go Event-Driven?

In a world where every user click, IoT sensor ping, and AI model request or update demands a near-instantaneous response, traditional synchronous request/response patterns begin to break down. Event-driven architectures (EDA) offer a compelling solution:

DevSecConflict: How Google Project Zero and FFmpeg Went Viral For All the Wrong Reasons

Aggregated on: 2025-11-24 21:41:12

Security research isn’t a stranger to controversy. The small community of dedicated niche security teams, independent researchers, and security vendors working on new products finds vulnerabilities in software and occasionally has permission to find and exploit them. This security industry has always had a fraught relationship with the law and the terms of service of the organisations they target, as notoriety is prioritized over legalities. Regardless of the true motives of security researchers, it is difficult to argue that this vulnerability hunting is done with no genuine desire to improve security, in addition to producing a conference talk or two.  To avoid legal threats, many researchers opt to avoid commercial software, products, and applications and instead turn their attention to open source. Open-source teams welcome contributions to improve security, offer transparency through pull requests, and are used throughout the industry. Where closed-source software may respond with a legal threat, open source responds with an enthusiastic thank-you, allowing security researchers to make an impact and talk about their work.

When Chatbots Go Rogue: Securing Conversational AI in Cyber Defense

Aggregated on: 2025-11-24 20:41:12

The evolution of conversational AI has introduced another dimension of interaction between businesses and users on the internet. AI chatbots have become an inseparable part of the digital ecosystem, which is no longer restricted to customer service or personalized suggestions. Chatbots have the potential to share sensitive data, break user trust, and even create an entry point to cyberattacks. This renders the security of conversational AI a matter of urgent concern to enterprises that embrace AI chatbot development services for websites.

How to Build Real-Time Transaction Monitoring Systems With Streaming Data

Aggregated on: 2025-11-24 19:41:12

The rate at which financial transactions are conducted in the United States has increased tremendously over the last few years. Billions of dollars flow through networks in seconds as digital wallets, instant peer-to-peer payments, and online banking emerge. This presents opportunity as well as risk. Real-time transaction monitoring systems built on streaming data have become a security and compliance priority for financial institutions, because suspicious behavior must be detected as it happens. Traditional monitoring typically relies on batch processes that analyze data hours after it was captured, but the current threat landscape does not tolerate any delay. In 2023, the Federal Trade Commission reported that Americans had lost over $10 billion to fraud, the highest amount recorded. Even one day of delay can translate into millions in avoidable losses. Streaming data technologies offer a way forward, allowing financial systems to process and assess transactions in real time.

Building a Retrieval-Augmented Generation (RAG) System in Java With Spring AI, Vertex AI, and BigQuery

Aggregated on: 2025-11-24 18:41:11

Retrieval-augmented generation (RAG) is quickly becoming one of the most powerful design patterns for AI applications. It bridges the gap between general-purpose large language models (LLMs) and your specific enterprise data. In this article, we’ll walk through how to build a complete RAG pipeline in Java using Spring Boot, Vertex AI’s Gemini embeddings, Apache PDFBox, and BigQuery Vector Search. You will see how to do the following, wrapped in a Spring Boot app with a simple web UI:

Creating an MCP Client With Spring AI

Aggregated on: 2025-11-24 17:41:11

MCP servers extend the functionality of a large language model (LLM). Inference engines allow you to define the MCP servers, but often you will need to write an MCP client yourself. In this blog, you will learn how to do so using Spring AI. Enjoy!

Introduction

In a previous post, you learnt how to create an MCP server using Spring Boot and Spring AI. The MCP server provides four tools:

The Right to Be Forgotten in Event-Driven Data Products

Aggregated on: 2025-11-24 16:41:11

Companies operating under regulatory frameworks such as GDPR, CCPA, and other privacy laws increasingly find themselves under pressure to enforce strict and timely retention and deletion policies for customer data. These regulations not only require data to be stored securely but also mandate that organizations delete customer information wherever it exists when the customer requests deletion, upon customer exit, or when retention windows expire. In practice, this "right to be forgotten" means that every copy, transformation, and analytical artifact referencing that individual must be identified and purged within defined timeframes. In many organizations, especially those with complex analytical ecosystems, this is far more difficult than it sounds. Operational systems often have clear deletion semantics, but analytical platforms (data lakes, warehouses, dashboards, machine learning pipelines) tend to accumulate replicated and transformed copies of customer data that sit outside the operational flow. When a deletion occurs upstream, downstream analytical systems may not hear about it, may not react in time, or may not have a reliable mechanism to trace lineage and apply deletions consistently. This gap creates compliance risk, audit failures, and fragmentation across analytical domains.

View more...

Architectural Evidence in Enterprise Java: Making Domain-Driven Design Visible

Aggregated on: 2025-11-24 15:41:11

One subtle challenge in software architecture is that architectural thinking can feel detached from the codebase. We draw diagrams, define layers, identify responsibilities, and craft a coherent structure — yet the moment implementation begins, those architectural ideas fade into the background. Over time, systems drift not because developers ignore design, but because the code itself provides almost no way to express that design. This tension is well documented. In Just Enough Software Architecture, George Fairbanks argues that programming languages lack constructs for directly representing architectural concepts. Java lets us model types, fields, methods, and packages, but offers no native way to encode ideas such as “presentation layer,” “domain logic,” “aggregate root,” or “infrastructure boundary.” Without these cues in the code, architecture becomes optional, verbal, and fragile.

View more...

When Leadership Blocks Your Pre-Mortem

Aggregated on: 2025-11-24 14:41:11

TL;DR: The Pre-Mortem Leadership resistance to your pre-mortem reveals whether your organization’s operating model prioritizes comfortable narratives over preventing failure. This article shows you how to diagnose cultural dysfunction and decide which battles to fight. The Magic Of Risk Mitigation Without Passing Blame There’s a risk technique that takes 60 minutes, costs nothing, and surfaces problems other planning methods miss. It’s been field-tested for nearly two decades. Teams that use it catch catastrophic issues while there’s still time to act.

View more...

Why the MITRE ATT&CK Framework Actually Works

Aggregated on: 2025-11-21 21:11:10

The alert goes off at 2:17 p.m. You count yourself lucky that this one’s in the afternoon, not morning. You drop what you’re doing, open the console, and start digging in.

View more...

Building Multimodal Agents with Google ADK — Practical Insights from My Implementation Journey

Aggregated on: 2025-11-21 14:11:10

The landscape of AI agents is rapidly evolving, moving far beyond the early days of simple language models and retrieval-augmented generation (RAG). The diagram below depicts the evolution over the last two years:

[Figure: AI Agents Evolution]

View more...

Revolutionizing Supply Chain Optimization with AI-Driven Constraint Programming

Aggregated on: 2025-11-21 13:11:10

Optimizing the supply chain has never been an easy task. Success entails making decisions across the firm's supply, production, inventory, and logistics functions, all in a real world where demand is not fixed, suppliers are not infinite, and disruptions are not rare. Conventional optimization approaches suit steady-state scenarios but fail in dynamic and complex environments. AI and constraint programming (CP) are changing how supply chains are managed, enabling organizations to build adaptive, efficient, and resilient supply networks. In this article, I will explain how AI-driven constraint programming works and how AI professionals can use it to develop high-level applications.

View more...

Software Testing in the AI Era - Evolving Beyond the Pyramid

Aggregated on: 2025-11-21 12:11:10

If the 1990s were the internet era and the 2010s the smartphone era, then it's clear that the 2020s will be defined by large language models (LLMs) and AI tools. In a decade where nearly every field is being reshaped by advancements in AI, software testing is no exception. Software testing is a fundamental building block of software quality assurance. The development of testing techniques has a long and storied history stretching back to the early days of software development itself.

View more...

Deploying a Serverless Application on Google Cloud

Aggregated on: 2025-11-20 20:11:10

Deploying a serverless application is a modern approach to building scalable and cost-efficient software without managing the underlying infrastructure. This blog will walk you through the process with a practical Python example deployed on Cloud Functions from Google Cloud, one of the major cloud providers offering serverless capabilities. What Is Serverless Deployment? Serverless deployment means that developers write code without worrying about servers or infrastructure. The cloud provider dynamically manages the resource allocation, scaling, and availability of the functions. You are billed only for the actual execution time of your code, which makes it highly cost-effective and efficient. Serverless architectures promote modular, event-driven development, well suited to microservices or APIs.

View more...

Advanced Usage of Decodable in Swift: Handling Dynamic Keys

Aggregated on: 2025-11-20 19:11:10

When your backend sends responses that don't follow a consistent structure, Swift's Decodable system can begin to reveal its limitations. It expects structure. Predictability. Stability. However, real-world APIs — especially those powering social feeds, content backends, or any CMS-driven application — rarely fit that mold. This article takes a look under the hood of Swift's decoding system. The goal isn't to memorize recipes, but to understand what's really happening so you can build decoding logic that scales with the unpredictable nature of your APIs.

View more...