Top Articles of the Week
Cloud Native
Cloud native has been deeply entrenched in organizations for years now, yet it remains an evolving and innovative solution across the software development industry. Organizations rely on a cloud-centric approach to development that allows their applications to remain resilient and scalable in this ever-changing landscape. Amidst market concerns, tool sprawl, and the increased need for cost optimization, there are few conversations more important today than those around cloud-native efficacy at organizations.

Google Cloud breaks down "cloud native" into four primary pillars: containers and orchestration, microservices, DevOps, and CI/CD. For DZone's 2024 Cloud Native Trend Report, we further explored these pillars, focusing our research on learning how nuanced technology and methodologies are driving the vision for what cloud native means and entails today. The articles, contributed by experts in the DZone Community, bring the pillars into conversation via topics such as automating the cloud through orchestration and AI, using shift left to improve delivery and strengthen security, surviving observability challenges, and strategizing cost optimizations.
As part of our FinTech mobile application deployment, we often faced challenges coordinating releases across backend APIs and the iOS and Android mobile applications whenever we shipped significant new features. Typically, we deploy the backend APIs first, as they are quick to deploy along with the database changes. Once the backend APIs are deployed, we publish the mobile applications for both iOS and Android. The publishing process often takes time: the mobile application may get approved within a few hours, but sometimes it takes a few days. If we raise tickets with the stores, the SLA (Service Level Agreement) for those tickets can span multiple days. Over the last year or so, the delays we saw were predominantly with Android.

Once the backend APIs are deployed, they initiate new workflows related to the new features, such as new screens or new sets of data. However, the mobile application version available at that time on both platforms is not ready to accept these new screens, because the newer app version has not been approved yet and is still in the store review process. This inconsistency can lead to a poor user experience, which can manifest in various ways, such as the app not functioning correctly, crashing, or displaying an "oops" page or internal errors. This can be avoided by implementing feature flags on the backend APIs.

Feature Flags

Feature flags are configurations stored in the database that help us turn specific features on or off in an application without requiring code changes. By wrapping new functionality behind feature flags, we can deploy the code to production environments while keeping the features hidden from end users until they are ready to be released. Once the newer versions of the mobile apps are available, we enable these new features from the database so that the backend APIs can orchestrate the new workflows or data for the new set of features. Additionally, because the iOS and Android apps are published at different times, we need to ensure that we have platform-specific feature flags. In our experience, iOS apps get approved in minutes or hours, while Android apps sometimes take a day or more.

In summary, backend APIs should orchestrate new workflows and data only when the corresponding platform's latest application version is available in the app store. For existing users who already have the app installed, we force a version upgrade at app launch. To avoid version-discrepancy issues during a new feature rollout, we follow a coordinated release strategy using feature flags, as explained below.

Coordinated Release

Backend APIs Release With Feature Flags Off

We first deploy the backend APIs with the feature flags set to Off for all platforms. Typically, when we create feature flags, we keep the default value as Off (0).

Mobile Application Publishing

The mobile application teams for iOS and Android submit the latest validated version to the App Store and Play Store, respectively. The respective teams monitor the publishing process for rejections or requests for clarification during review.

Enable New Feature

Once the respective mobile application team confirms that the app has been published, we enable the new feature for that platform.

Monitoring

After the new feature has been enabled across the platforms, we monitor the production environment: the backend APIs for errors and the mobile applications for crashes.
If a significant issue is identified, we turn the feature off entirely, across all platforms or for specific platforms, depending on the type of issue. This allows us to instantaneously roll back new feature functionality, minimizing the impact on the user experience.

Feature Flags Implementation in a Spring Boot Application

Feature Service

Below is an example of a FeatureServiceV1Impl Spring service that handles the feature flag configuration. We have defined the bean's scope as request scope. This ensures a new service instance is created for each HTTP request, so that up-to-date configuration data is available to every new request. The initializeConfiguration method is annotated with @PostConstruct, meaning it is called after the bean's properties have been set; it fetches the configuration data from the database when the service is first instantiated for each request. With request scope, we fetch the feature flag configuration from the database only once per request: even if there are feature checks in multiple places while executing that request, there is only one database call to fetch the configuration.

This service's main functionality is to check whether a specific feature is available, which it does by checking the feature flag configuration values from the database. In the example below, the isCashFlowUWAvailable method checks whether the "Cash Flow Underwriting" feature is available for a given origin (iOS, Android, or mobile web app).

Java
@RequestScope
@Service
@Qualifier("featureServiceV1")
public class FeatureServiceV1Impl implements FeatureServiceV1 {

    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    private List<Config> configs;

    @Autowired
    ConfigurationRepository configurationRepository;

    @PostConstruct
    private void initializeConfiguration() {
        logger.info("FeatureService::initializeConfiguration - Initializing configuration");
        if (configs == null) {
            logger.info("FeatureService::initializeConfiguration - Fetching configuration");
            GlobalConfigListRequest globalConfigListRequest = new GlobalConfigListRequest("ICW_API");
            this.configs = this.configurationRepository.getConfigListNoError(globalConfigListRequest);
        }
    }

    @Override
    public boolean isCashFlowUWAvailable(String origin) {
        boolean result = false;
        try {
            if (configs != null && configs.size() > 0) {
                if (origin.toLowerCase().contains("ios")) {
                    result = this.isFeatureAvailableBasedOnConfig("feature_cf_uw_ios");
                } else if (origin.toLowerCase().contains("android")) {
                    result = this.isFeatureAvailableBasedOnConfig("feature_cf_uw_android");
                } else if (origin.toLowerCase().contains("mobilewebapp")) {
                    result = this.isFeatureAvailableBasedOnConfig("feature_cf_uw_mobilewebapp");
                }
            }
        } catch (Exception ex) {
            logger.error("FeatureService::isCashFlowUWAvailable - An error occurred, detail error:", ex);
        }
        return result;
    }

    private boolean isFeatureAvailableBasedOnConfig(String configName) {
        boolean result = false;
        if (configs != null && configs.size() > 0) {
            Optional<Config> config = Optional
                    .of(configs.stream().filter(o -> o.getConfigName().equals(configName)).findFirst()).orElse(null);
            if (config.isPresent()) {
                String configValue = config.get().getConfigValue();
                if (configValue.equalsIgnoreCase("1")) {
                    result = true;
                }
            }
        }
        return result;
    }
}

Consuming the Feature Service

We then reference and auto-wire FeatureServiceV1 in the controller or another service in the Spring Boot application, as shown below, annotating it with @Lazy.
The @Lazy annotation ensures that FeatureServiceV1 is instantiated only when one of its methods is invoked from particular methods of the controller or service. This prevents unnecessary loading of the feature-specific database configuration when any other method of the controller or service that does not reference the feature service is invoked, which helps improve application start-up time.

Java
@Autowired
@Lazy
private FeatureServiceV1 featureServiceV1;

We then leverage FeatureServiceV1 to check the availability of the feature and branch our code accordingly. Branching allows us to execute feature-specific code when the feature is available or default to the normal path. Below is an example of how to use the feature availability check to branch the code:

Java
if (this.featureServiceV1.isCashFlowUWAvailable(context.origin)) {
    logger.info("Cashflow Underwriting Path");
    // Implement the logic for the Cash Flow Underwriting path
} else {
    logger.info("Earlier Normal Path");
    // Implement the logic for the normal path
}

Here's how we can implement this conditional logic in a controller or service method:

Java
@RestController
@RequestMapping("/api/v1/uw")
public class UnderwritingController {

    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    @Autowired
    @Lazy
    private FeatureServiceV1 featureServiceV1;

    @RequestMapping("/loan")
    public void processLoanUnderwriting(RequestContext context) {
        if (this.featureServiceV1.isCashFlowUWAvailable(context.origin)) {
            logger.info("Cashflow Underwriting Path");
            // Implement the logic for the Cash Flow Underwriting path
        } else {
            logger.info("Earlier Normal Path");
            // Implement the logic for the normal path
        }
    }
}

Conclusion

Feature flags play an important role, particularly when coordinating releases across multiple platforms. In our case, we have four channels: two native mobile applications (iOS and Android), a browser-based mobile web application, and an iPad application. Feature flags help achieve smooth, controlled rollouts, minimizing disruptions to the user experience. They ensure that new features are only activated when the corresponding platform-specific latest version of the application is available in the app stores.
In today's data-driven environment, mastering the profiling of large datasets with Apache Spark and Deequ is crucial for any professional dealing with data analysis, SEO optimization, or similar fields requiring a deep dive into digital content. Apache Spark offers the computational power necessary for handling vast amounts of data, while Deequ provides a layer for quality assurance, setting benchmarks for what could be termed "unit tests for data." This combination ensures that business users gain confidence in their data's integrity for analysis and reporting purposes.

Have you ever encountered challenges in maintaining the quality of large datasets or found it difficult to ensure the reliability of data attributes used in your analyses? If so, integrating Deequ with Spark could be the solution you're looking for. This article is designed to guide you through the process, from installation to practical application, with a focus on enhancing your workflow and outcomes. By exploring the functionalities and benefits of Deequ and Spark, you will learn how to apply these tools effectively in your data projects, ensuring that your datasets not only meet but exceed quality standards. Let's delve into how these technologies can transform your approach to data profiling and quality control.

Introduction to Data Profiling With Apache Spark and Deequ

Understanding your datasets deeply is crucial in data analytics, and this is where Apache Spark and Deequ shine. Apache Spark is renowned for its fast processing of large datasets, which makes it indispensable for data analytics. Its architecture is adept at handling vast amounts of data efficiently, which is critical for data profiling. Deequ complements Spark by focusing on data quality. This synergy provides a robust solution for data profiling, allowing for the identification and correction of issues like missing values or inconsistencies, which are vital for accurate analysis.

What exactly makes Deequ an invaluable asset for ensuring data quality? At its core, Deequ is built to implement "unit tests for data," a concept that might sound familiar if you have a background in software development. These tests are not for code, however; they're for your data. They allow you to set specific quality benchmarks that your datasets must meet before being deemed reliable for analysis or reporting.

Imagine you're handling customer data. With Deequ, you can easily set up checks to ensure that every customer record is complete, that email addresses follow a valid format, or that no duplicate entries exist. This level of scrutiny is what sets Deequ apart: it transforms data quality from a concept into a measurable, achievable goal.

The integration of Deequ with Apache Spark leverages Spark's scalable data processing framework to apply these quality checks across vast datasets efficiently. This combination does not merely flag issues; it provides actionable insights that guide the correction process. For instance, if Deequ detects a high number of incomplete records in a dataset, you can then investigate the cause, be it a flaw in data collection or an error in data entry, and rectify it, thus enhancing the overall quality of your data. Below is a high-level diagram (Source: AWS) that illustrates the Deequ library's usage within the Apache Spark ecosystem:

Setting up Apache Spark and Deequ for Data Profiling

To begin data profiling with Apache Spark and Deequ, setting up your environment is essential.
Ensure Java and Scala are installed, as they are prerequisites for running Spark; you can verify the requirements in Spark's official documentation. For Deequ, which works atop Spark, add the library through your build manager. If you're using Maven, it's as simple as adding the Deequ dependency to your pom.xml file; for SBT, include it in your build.sbt file, and make sure it matches your Spark version.

Python users, you're not left out. PyDeequ is your go-to for integrating Deequ's capabilities into your Python environment. Install it with pip:

Shell
pip install pydeequ

After installation, conduct a quick test to ensure everything is running smoothly:

Python
import pydeequ

# Simple test to verify installation
print(pydeequ.__version__)

This quick test prints the installed version of PyDeequ, confirming that your setup is ready for action. With these steps, your system is now equipped to perform robust data quality checks with Spark and Deequ, paving the way for in-depth data profiling in your upcoming projects.

Practical Guide To Profiling Data With Deequ

Once your environment is prepared with Apache Spark and Deequ, you're ready to engage in the practical side of data profiling. Let's focus on some of the key metrics that Deequ provides for data profiling: Completeness, Uniqueness, and Correlation. First is Completeness: this metric ensures data integrity by verifying the absence of null values in your data. Uniqueness identifies and eliminates duplicate records, ensuring data distinctiveness. Finally, Correlation quantifies the relationship between two variables, providing insights into data dependencies.

Let's say you have a dataset from IMDb with the following structure:

root
 |-- tconst: string (nullable = true)
 |-- titleType: string (nullable = true)
 |-- primaryTitle: string (nullable = true)
 |-- originalTitle: string (nullable = true)
 |-- isAdult: integer (nullable = true)
 |-- startYear: string (nullable = true)
 |-- endYear: string (nullable = true)
 |-- runtimeMinutes: string (nullable = true)
 |-- genres: string (nullable = true)
 |-- averageRating: double (nullable = true)
 |-- numVotes: integer (nullable = true)

We'll use the following Scala script to profile the dataset. It applies various Deequ analyzers to compute metrics such as the size of the dataset, the completeness of the averageRating column, and the uniqueness of the tconst identifier.

Scala
import com.amazon.deequ.analyzers._
import com.amazon.deequ.analyzers.runners.{AnalysisRunner, AnalyzerContext}
import com.amazon.deequ.analyzers.runners.AnalyzerContext.successMetricsAsDataFrame
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("Deequ Profiling Example")
  .getOrCreate()

val data = spark.read.format("csv").option("header", "true").load("path_to_imdb_dataset.csv")

val runAnalyzer: AnalyzerContext = {
  AnalysisRunner
    .onData(data)
    .addAnalyzer(Size())
    .addAnalyzer(Completeness("averageRating"))
    .addAnalyzer(Uniqueness("tconst"))
    .addAnalyzer(Mean("averageRating"))
    .addAnalyzer(StandardDeviation("averageRating"))
    .addAnalyzer(Compliance("top rating", "averageRating >= 7.0"))
    .addAnalyzer(Correlation("numVotes", "averageRating"))
    .addAnalyzer(Distinctness("tconst"))
    .addAnalyzer(Maximum("averageRating"))
    .addAnalyzer(Minimum("averageRating"))
    .run()
}

val metricsResult = successMetricsAsDataFrame(spark, runAnalyzer)
metricsResult.show(false)

Executing this script produces a DataFrame of metrics that reveals several insights about our data. From the output, we observe:

- The dataset has 7,339,583 rows.
- The tconst column's distinctness and uniqueness of 1.0 indicate that every value in the column is unique.
- The averageRating spans from a minimum of 1 to a maximum of 10, averaging 6.88 with a standard deviation of 1.39, highlighting the variation in ratings.
- A completeness score of 0.148 for the averageRating column reveals that only about 15% of the dataset's records have a specified average rating.
- The Pearson correlation coefficient between numVotes and averageRating stands at 0.01, indicating an absence of correlation between these two variables, in line with expectations.

These metrics equip us with the insights needed to navigate a dataset's intricacies, supporting informed decisions and strategic planning in data management.

Advanced Applications and Strategies for Data Quality Assurance

Data quality assurance is an ongoing process, vital for any data-driven operation. With tools like Deequ, you can implement strategies that not only detect issues but also prevent them. By employing data profiling on incremental data loads, we can detect anomalies and maintain consistency over time. For instance, utilizing Deequ's AnalysisRunner, we can observe historical trends and set up checks that capture deviations from expected patterns. If the usual output of your ETL jobs is around 7 million records, a sudden increase or decrease in this count could be a telltale sign of underlying issues. It's crucial to investigate such deviations, as they may indicate problems with the data extraction or loading processes. Deequ's Check function allows you to verify compliance with predefined conditions, such as expected record counts, and flag these issues automatically.

Attribute uniqueness, crucial to data integrity, also requires constant vigilance. Imagine discovering a change in the uniqueness score of a customer ID attribute, which should be unwaveringly unique. This anomaly could indicate duplicate records or data breaches. Timely detection through profiling with Deequ's Uniqueness metric helps you maintain the trustworthiness of your data.

Historical consistency is another pillar of quality assurance. Should the averageRating column, which historically fluctuates between 1 and 10, suddenly exhibit values outside this range, questions arise: is this a data input error or an actual shift in user behavior? Profiling with Deequ helps you discern the difference and take appropriate measures. The AnalysisRunner can be configured to track the historical distribution of averageRating and alert you to any anomalies.

Business Use Case for Anomaly Detection Using Aggregated Metrics From Deequ

Consider a business use case where a process crawls the pages of websites and requires a mechanism to identify whether the crawling process is working as expected. To place anomaly detection in this process, we can use the Deequ library to capture record counts at particular intervals and use them for advanced anomaly detection techniques. For example, suppose a crawl has been identifying 9,500 to 10,500 pages daily on a website over a period of two months. If the count goes above or below this range, we may want to raise an alert to the team. The diagram below displays the daily calculated record count of pages seen on the website. Using basic statistical techniques like the rate of change (how the record count changes day to day), one can see that the changes always oscillate around zero, as shown in the image below.
The diagram below displays the normal distribution of the rate of change; based on the shape of the bell curve, the anticipated change for this data point is around 0% with a standard deviation of 2.63%. This indicates that page additions/deletions for this website fall within a range of roughly -5.26% to +5.26% with 90% confidence. Based on this indicator, one can set up a rule on the page record count to raise an alert if the change falls outside this range. This is a basic example of using a statistical method over data to identify anomalies in aggregated numbers. Depending on historical data availability and factors such as seasonality, methodologies such as Holt-Winters forecasting can be used for more efficient anomaly detection.

The fusion of Apache Spark and Deequ emerges as a powerful combination that will help you elevate the integrity and reliability of your datasets. Through the practical applications and strategies demonstrated above, we've seen how Deequ not only identifies anomalies but helps prevent them, ensuring the consistency and accuracy of your precious data. So, if you want to unlock the full potential of your data, I advise you to leverage the power of Spark and Deequ. With this toolset, you will safeguard your data's quality and dramatically enhance your decision-making processes, and your data-driven insights will be both robust and reliable.
A microservices-based architecture splits applications into multiple independently deployable services, where each service provides a unique piece of functionality. Every architectural style has pros and cons, and one of the challenges of microservices architecture is complex debugging/troubleshooting.

Distributed Tracing

In a microservices world, distributed tracing is the key to faster troubleshooting. Distributed tracing enables engineers to track a request through the mesh of services and therefore helps them troubleshoot a problem. To achieve this, a unique identifier, say trace-id, is injected right at the initiation point of a request, which is usually an HTTP load balancer. As the request hops through different components (third-party apps, service mesh, etc.), the same trace-id should be recorded at every component. This essentially requires propagation of the trace-id from one hop to another. Over time, different vendors adopted different mechanisms to define the unique identifier (trace-id), for example:

- Zipkin B3 headers
- Datadog tracing headers
- Google proprietary trace context
- AWS proprietary trace context
- Envoy request ID
- W3C trace context

An application can adopt whichever of these available solutions fits its needs. Accordingly, the relevant header (e.g., x-cloud-trace-context if the Google proprietary trace context is adopted) should be injected at request initiation, and thereafter the same value should be propagated to each of the components involved in the request lifecycle to achieve distributed tracing.

W3C Trace Context Standard

As the microservices world evolves, there is a need for a standard mechanism for trace propagation. Consider the case when two different applications that adopted two different trace propagation approaches are used together. Since they use two different headers for trace propagation, distributed tracing breaks when they communicate. To address such problems, it is recommended to use the W3C trace context across all components. W3C trace context is the standard being adopted by all major vendors for supporting cross-vendor distributed traces.

Problem: Broken Traces

OpenTelemetry supports propagation of the W3C trace context header "traceparent" using auto-instrumentation. This means that, as an application developer, I need not write any code in my application for trace context propagation when I instrument it with OpenTelemetry. For example, if I have a Java application, I can instrument it as shown below:

Shell
java -javaagent:opentelemetry-javaagent.jar -Dotel.service.name=app-name -jar app.jar

The traceparent header will now be automatically generated/propagated by the instrumented Java application. However, when my application, instrumented using OpenTelemetry, gets deployed behind a GCP or AWS HTTP load balancer, my expectation to visualize the complete trace starting from the load balancer fails. This is because:

- GCP HTTP Load Balancer supports the proprietary trace context header "X-Cloud-Trace-Context". See the GCP documentation for more details.
- AWS Elastic Load Balancer supports the proprietary trace context header "X-Amzn-Trace-Id". See the AWS documentation for more details.

My application generates and logs the W3C traceparent header, so the unique identifier generated by the GCP/AWS load balancer is not propagated further by my application. This is the typical problem of broken traces described above. So how can a developer leverage the out-of-the-box OpenTelemetry trace context propagation functionality?
Solution: GCP Trace Context Transformer

We have solved this problem by transforming the GCP/AWS proprietary trace context header (X-Cloud-Trace-Context / X-Amzn-Trace-Id) into the W3C trace context header (traceparent). A service mesh is a key component in a distributed system for enforcing organization policies consistently across all applications, and one popular service mesh, Istio, can help solve our problem. The diagram below elaborates on the solution:

A common trace-id value across all the logs generated from the load balancer, istio-ingress gateway, istio-sidecar, and the application helps in stitching together all the logs for a single request.

Istio allows you to extend the data-plane behavior by writing custom logic using either Lua or WASM. We have extended the istio-ingress gateway by injecting a Lua filter. This filter extracts the trace-id and span-id from X-Cloud-Trace-Context and creates the traceparent request header from these values.

Note: For the sake of simplicity, the filter code below is built only for the GCP "X-Cloud-Trace-Context"; one can write a similar filter for the AWS "X-Amzn-Trace-Id". While adopting the filter in your infrastructure, don't forget to choose the right namespace and workloadSelector label. This filter has been tested on Istio version 1.20.1.

YAML
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: gcp-trace-context-transformer-gateway
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
  configPatches:
  - applyTo: HTTP_FILTER # http connection manager is a filter in Envoy
    match:
      context: GATEWAY
    patch:
      operation: INSERT_BEFORE
      value:
        name: envoy.filters.http.lua
        typed_config:
          "@type": "type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua"
          inlineCode: |
            function envoy_on_request(request_handle)
              local z = request_handle:headers():get("traceparent")
              if z ~= nil then
                return
              end
              local x = request_handle:headers():get("X-Cloud-Trace-Context")
              if x == nil then
                return
              end
              local y = string.gmatch(x, "%x+")
              local traceid = y()
              if (traceid == nil) then
                return
              end
              local spanid = y()
              if (spanid == nil) then
                return
              end
              local traceparent = string.format("00-%s-%s-01", traceid, spanid)
              request_handle:headers():add("traceparent", traceparent)
            end
            function envoy_on_response(response_handle)
              return
            end

Alternate Solution: Custom GCP Trace Context Propagator

Another possible solution could be extending OpenTelemetry to support the propagation of the GCP proprietary trace context. One implementation exists on GitHub but, alas, it is still in an alpha state (at the time of publishing this article). Further, this solution will only work for GCP environments; similar propagators will be needed for other cloud providers (AWS, etc.).
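To give a flavor of what such a custom propagator involves, below is a minimal, illustrative sketch against the OpenTelemetry Java API. This is not the GitHub implementation mentioned above: the class is hypothetical, the o= sampling option is hard-coded, and error handling is reduced to the basics. It relies on the GCP header format TRACE_ID/SPAN_ID;o=OPTIONS, where the span id is decimal rather than hex.

Java
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.SpanContext;
import io.opentelemetry.api.trace.TraceFlags;
import io.opentelemetry.api.trace.TraceState;
import io.opentelemetry.context.Context;
import io.opentelemetry.context.propagation.TextMapGetter;
import io.opentelemetry.context.propagation.TextMapPropagator;
import io.opentelemetry.context.propagation.TextMapSetter;

import java.util.Collection;
import java.util.Collections;

// Hypothetical sketch of a propagator for "X-Cloud-Trace-Context".
public final class XCloudTraceContextPropagator implements TextMapPropagator {

    private static final String HEADER = "x-cloud-trace-context";

    @Override
    public Collection<String> fields() {
        return Collections.singletonList(HEADER);
    }

    @Override
    public <C> void inject(Context context, C carrier, TextMapSetter<C> setter) {
        SpanContext spanContext = Span.fromContext(context).getSpanContext();
        if (!spanContext.isValid()) {
            return;
        }
        // Convert the 16-hex-char span id back to the decimal form GCP expects.
        long spanId = Long.parseUnsignedLong(spanContext.getSpanId(), 16);
        setter.set(carrier, HEADER,
                spanContext.getTraceId() + "/" + Long.toUnsignedString(spanId) + ";o=1");
    }

    @Override
    public <C> Context extract(Context context, C carrier, TextMapGetter<C> getter) {
        String value = getter.get(carrier, HEADER);
        if (value == null) {
            return context;
        }
        try {
            // Format: TRACE_ID/SPAN_ID;o=OPTIONS
            String[] idAndRest = value.split("/", 2);
            String traceId = idAndRest[0];                       // 32 hex chars
            String spanDecimal = idAndRest[1].split(";", 2)[0];  // decimal span id
            String spanId = String.format("%016x", Long.parseUnsignedLong(spanDecimal));
            SpanContext remote = SpanContext.createFromRemoteParent(
                    traceId, spanId, TraceFlags.getSampled(), TraceState.getDefault());
            return context.with(Span.wrap(remote));
        } catch (RuntimeException e) {
            return context; // malformed header: ignore and keep the current context
        }
    }
}

Such a propagator would then be registered with the SDK's ContextPropagators, typically composed with the standard W3C propagator via TextMapPropagator.composite(...).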
In today's rapidly evolving technological landscape, it is crucial for any business or application to efficiently manage and utilize data. NoSQL databases have emerged as an alternative to traditional relational databases, offering flexibility, scalability, and performance advantages. These benefits become even more pronounced when combined with Java, a robust and widely used programming language. This article explores three key benefits of understanding and learning NoSQL databases with Java, highlighting the polyglot philosophy and its efficiency in software architecture.

Enhanced Flexibility and Scalability

One significant benefit of NoSQL databases is their capability to handle various data models, such as key-value pairs, documents, wide-column stores, and graph databases. This flexibility enables developers to select the most suitable data model for their use case. When combined with Java, a language renowned for its portability and platform independence, the adaptability of NoSQL databases can be fully utilized.

Improved Performance and Efficiency

Performance is a crucial aspect of database management, and NoSQL databases excel in this area because of their distributed nature and optimized storage mechanisms. When developers combine these performance-enhancing features with Java, they can create applications that are not only efficient but also high-performing.

Embracing the Polyglot Philosophy

The polyglot philosophy in software development encourages using multiple languages, frameworks, and databases within a single application to take advantage of each one's strengths. Understanding and learning NoSQL databases with Java perfectly embodies this approach, offering several benefits for modern software architecture.

Leveraging Eclipse JNoSQL for Success With NoSQL Databases and Java

To fully utilize NoSQL databases with Java, developers can use Eclipse JNoSQL, a framework created to streamline the integration and management of NoSQL databases in Java applications. Eclipse JNoSQL supports over 30 databases and is aligned with the Jakarta NoSQL and Jakarta Data specifications, providing a comprehensive solution for modern data handling needs.

Eclipse JNoSQL: Bridging Java and NoSQL Databases

Eclipse JNoSQL is a framework that simplifies the interaction between Java applications and NoSQL databases. With support for over 30 different NoSQL databases, Eclipse JNoSQL enables developers to work efficiently across various data stores without compromising flexibility or performance. Key features of Eclipse JNoSQL include:

- Support for the Jakarta Data Query Language: Enhances the power and flexibility of querying across databases
- Cursor pagination: Processes large datasets efficiently by utilizing cursor-based pagination rather than traditional offset-based pagination
- NoSQLRepository: Simplifies the creation and management of repository interfaces
- New column and document templates: Simplify data management with predefined templates

Jakarta NoSQL and Jakarta Data Specifications

Eclipse JNoSQL is designed to support the Jakarta NoSQL and Jakarta Data specifications, standardizing and simplifying database interactions in Java applications.

- Jakarta NoSQL: This comprehensive framework offers a unified API and a set of powerful annotations, making it easier to work with various NoSQL data stores while maintaining flexibility and productivity.
- Jakarta Data: This specification provides an API for easier data access across different database types, enabling developers to create custom query methods on repository interfaces.

Introducing Eclipse JNoSQL 1.1.1

The latest release, Eclipse JNoSQL 1.1.1, includes significant enhancements and new features, making it a valuable tool for Java developers working with NoSQL databases. Key updates include:

- Support for cursor pagination
- Support for Jakarta Data Query
- Several bug fixes and performance enhancements

For more details, visit the Eclipse JNoSQL Release 1.1.1 notes.

Practical Example: Java SE Application With Oracle NoSQL

To illustrate the practical use of Eclipse JNoSQL, let's consider a Java SE application using Oracle NoSQL. This example showcases the effectiveness of cursor pagination and JDQL for querying. Cursor pagination offers a more efficient way to handle large datasets than traditional offset-based pagination. Below is a code snippet demonstrating cursor pagination with Oracle NoSQL.

Java
@Repository
public interface BeerRepository extends OracleNoSQLRepository<Beer, String> {

    @Find
    @OrderBy("hop")
    CursoredPage<Beer> style(@By("style") String style, PageRequest pageRequest);

    @Query("From Beer where style = ?1")
    List<Beer> jpql(String style);
}

public class App4 {

    public static void main(String[] args) {
        var faker = new Faker();
        try (SeContainer container = SeContainerInitializer.newInstance().initialize()) {
            BeerRepository repository = container.select(BeerRepository.class).get();
            for (int index = 0; index < 100; index++) {
                Beer beer = Beer.of(faker);
                // repository.save(beer);
            }
            PageRequest pageRequest = PageRequest.ofSize(3);
            var page1 = repository.style("Stout", pageRequest);
            System.out.println("Page 1");
            page1.forEach(System.out::println);

            PageRequest pageRequest2 = page1.nextPageRequest();
            var page2 = repository.style("Stout", pageRequest2);
            System.out.println("Page 2");
            page2.forEach(System.out::println);

            System.out.println("JDQL query: ");
            repository.jpql("Stout").forEach(System.out::println);
        }
        System.exit(0);
    }
}

In this example, BeerRepository efficiently retrieves and paginates data using cursor pagination. The style method employs cursor pagination, while the jpql method demonstrates a JDQL query.

API Changes and Compatibility Breaks in Eclipse JNoSQL 1.1.1

The release of Eclipse JNoSQL 1.1.1 includes significant updates and enhancements aimed at improving functionality and aligning with the latest specifications. However, it's important to note that these changes may cause compatibility issues, which developers need to understand and address in their projects.

1. Annotations Moved to the Jakarta NoSQL Specification

Annotations like Embeddable and Inheritance were previously included in the Eclipse JNoSQL framework. In the latest version, they have been relocated to the Jakarta NoSQL specification to establish a more consistent approach across various NoSQL databases. As a result, developers will need to update their imports and references to these annotations.

Java
// Old import
import org.jnosql.mapping.Embeddable;

// New import
import jakarta.nosql.Embeddable;

The updated annotations can be accessed in the Jakarta NoSQL GitHub repository.

2. Unified Query Packages

To simplify and unify the query APIs, SelectQuery and DeleteQuery have been consolidated into a single package.
Consequently, specific query classes like DocumentQuery, DocumentDeleteQuery, ColumnQuery, and ColumnDeleteQuery have been removed.

Impact: Any code using these removed classes will no longer compile and must be refactored to use the new unified classes.

Solution: Refactor your code to use the new query classes in the org.eclipse.jnosql.communication.semistructured package. For example:

Java
// Old usage
DocumentQuery query = DocumentQuery.select().from("collection").where("field").eq("value").build();

// New usage
SelectQuery query = SelectQuery.select().from("collection").where("field").eq("value").build();

Similar adjustments will be needed for delete queries.

3. Migration of Templates

Templates such as ColumnTemplate, KeyValueTemplate, and DocumentTemplate have been moved from the Jakarta specification to Eclipse JNoSQL.

Java
// Old import
import jakarta.nosql.document.DocumentTemplate;

// New import
import org.eclipse.jnosql.mapping.document.DocumentTemplate;

4. Default Query Language: Jakarta Data Query Language (JDQL)

Another significant update in Eclipse JNoSQL 1.1.1 is the adoption of the Jakarta Data Query Language (JDQL) as the default query language. JDQL provides a standardized way to define queries using annotations, making it simpler and more intuitive for developers.

Conclusion

A NoSQL database is a powerful asset in modern applications. It allows software architects to employ polyglot persistence, utilizing the best persistence capability in each scenario. Eclipse JNoSQL assists Java developers in bringing these NoSQL capabilities into their applications.
Context

Do you crave hands-on experience with Redis clusters? Perhaps you're eager to learn their intricacies or conduct targeted testing and troubleshooting. A local Redis cluster empowers you with that very control. By setting it up on your own machine, you gain the freedom to experiment, validate concepts, and delve deeper into its functionality. This guide will equip you with the knowledge to quickly create and manage a Redis cluster on your local machine, paving the way for a productive and insightful learning journey.

Install Redis

The first step is to install a Redis server locally. The cluster creation commands used later take Redis instances as building blocks and combine them into a cluster.

Mac

The easiest way is to install with Homebrew. Use the following command to install Redis on your MacBook:

Shell
brew install redis

Linux

Use the following commands to install:

Shell
sudo apt update
sudo apt install redis-server

From the Source

If you need a specific version, you can install from source using the following steps:

1. Download the latest Redis source code from the official website.
2. Unpack the downloaded archive.
3. Navigate to the extracted directory in your terminal.
4. Run the following commands:

Shell
make
sudo make install

Create Cluster

One-Time Steps

1. Clone the Redis git repository.
2. Go to the directory where you cloned the repository, then change to the following directory:

Shell
cd <path to local redis repository>/redis/utils/create-cluster

3. Modify create-cluster with the path to your redis-server:

Shell
vi create-cluster

Replace BIN_PATH="$SCRIPT_DIR/../../src/" with BIN_PATH="/usr/local/bin/"

Steps to Create/Start/Stop/Clean the Cluster

These steps are used whenever you need to use a Redis cluster.

Start the Redis Instances

Shell
./create-cluster start

Create the Cluster

Shell
echo "yes" | ./create-cluster create

Tip: You can create an alias and add it to your shell configuration file (~/.bashrc or ~/.zshrc). Example:

Shell
open ~/.zshrc

Add the following to this file:

Shell
alias cluster_start="./create-cluster start && echo "yes" | ./create-cluster create"

Open a new terminal and run the following:

Shell
source ~/.zshrc

Now you can use "cluster_start" on the command line, and it will start and create the cluster for you.

Stop the Cluster

Shell
./create-cluster stop

Clean Up

Clears previous cluster data for a fresh start:

Shell
./create-cluster clean

Tip: Similarly, you can create an alias as below to stop the cluster and clean up the cluster data files:

Shell
alias cluster_stop="./create-cluster stop && ./create-cluster clean"

How To Change the Default Number of Nodes in the Cluster

By default, the create-cluster script creates 6 nodes, with 3 primaries and 3 replicas. For special testing or troubleshooting, if you need to change the number of nodes, you can modify the script instead of manually adding nodes:

Shell
vi create-cluster

Edit the following to the desired number of nodes for the cluster:

NODES=6

Also, by default, the script creates 1 replica per primary. You can change that as well by setting the value in the same script (create-cluster) to the desired number:

REPLICAS=1

Create Cluster With Custom Configuration

Redis provides various options for customizing the configuration of Redis servers; all of them are present in the redis.conf file.
To customize those options, follow these steps:

Edit redis.conf With the Desired Configurations

Shell
vi <path to local redis repository>/redis/redis.conf

Edit the create-cluster Script

Shell
vi create-cluster

Modify the command in the start and restart options of the script to add ../../redis.conf:

Before Modification

Shell
$BIN_PATH/redis-server --port $PORT --protected-mode $PROTECTED_MODE --cluster-enabled yes --cluster-config-file nodes-${PORT}.conf --cluster-node-timeout $TIMEOUT --appendonly yes --appendfilename appendonly-${PORT}.aof --appenddirname appendonlydir-${PORT} --dbfilename dump-${PORT}.rdb --logfile ${PORT}.log --daemonize yes ${ADDITIONAL_OPTIONS}

After Modification

Shell
$BIN_PATH/redis-server ../../redis.conf --port $PORT --protected-mode $PROTECTED_MODE --cluster-enabled yes --cluster-config-file nodes-${PORT}.conf --cluster-node-timeout $TIMEOUT --appendonly yes --appendfilename appendonly-${PORT}.aof --appenddirname appendonlydir-${PORT} --dbfilename dump-${PORT}.rdb --logfile ${PORT}.log --daemonize yes ${ADDITIONAL_OPTIONS}

References

- Redis GitHub repository: https://github.com/redis/redis
- Redis official site: https://redis.io
This article outlines the main challenges of state propagation across different (micro)services, together with architectural patterns for addressing those challenges, and provides an actual implementation using the Debezium Engine. But first of all, let's define some of the basic concepts that will be used throughout the article.

- (Micro)service: I personally do not like this term. I prefer talking about bounded contexts and their internal domain/sub-domains. Running the bounded context and/or its domains as different runtimes (aka (micro)services) is a deployment decision, meaning that domains and sub-domains within a bounded context must be decoupled independently of how they are deployed (microservices vs. modular monolith). For the sake of simplicity, in this article I will use the word "microservice" to represent isolated runtimes that need to react to changes in the state maintained by other microservices.
- State: Within a microservice, its bounded context, domain, and subdomains define an internal state in the form of aggregates (entities and value objects). In this article, changes in the microservice's state shall be propagated to another runtime.
- State propagation: In a scenario where the application runtimes (i.e., microservices) are decoupled, independently of how they are deployed (Pod in Kubernetes, Docker container, OS service/process), state propagation is the mechanism for ensuring that a state mutation that happened in one bounded context is communicated to the downstream bounded contexts.

Setting the Stage

As a playground for this article, we will use two bounded contexts defined as part of an e-learning platform:

- Course Management: Manages courses, including operations for course authoring
- Rating System: Enables platform users to rate courses

Between the two contexts there is an asynchronous, event-driven relationship: whenever a course is created in the Course Management bounded context, an event is published and eventually received by the Rating System, which adapts the inbound event to its internal domain model using an anti-corruption layer (ACL) and creates an empty rating for that course. The next image outlines this (simplified) context mapping:

Bounded Context Mapping of the two services

The Challenge

The scenario is apparently pretty simple: we could just publish a message (CourseCreated) to a message broker (e.g., RabbitMQ) whenever a course is created, and with the Rating System subscribed to that event, it would eventually be received and processed by the downstream service. However, it is not so simple, and we have several "what-if" situations, like:

- What if the course creation transaction is eventually rolled back (e.g., the database does not accept the operation) but the message is correctly published to the message broker?
- What if the course is created and persisted, but the message broker does not accept the message (or is not available)?

In a nutshell, the main challenge here is how to ensure, in a distributed solution, that both the domain state mutation and the corresponding event publication happen as one consistent and atomic operation, so that both or none happen. There are certainly solutions that can be implemented on the message producer or message consumer side to try to address these situations, like retries, publishing compensation events, and manually reverting the write operation in the database. However, most of them require software engineers to keep too many scenarios in mind, which is error-prone and reduces codebase maintainability.
Another alternative is implementing a 2-phase-commit solution at the infrastructure level, making the deployment and operation of the underlying infrastructure more complex and, most likely, forcing the adoption of expensive commercial solutions.

For the rest of the article, we will focus on a solution based on the combination of two important patterns in distributed systems, Transactional Outbox and Change Data Capture, providing a reference implementation that allows software engineers to focus on what really matters: providing domain value.

Applicable Architecture Patterns

As described above, we need to ensure that the state mutation and the publication of a domain event are atomic operations. This can be achieved by combining two patterns which are nicely explained by Chris Richardson in his must-read Microservices Patterns book; therefore, I will not explain them in detail here.

Transactional Outbox Pattern

The Transactional Outbox pattern focuses on persisting both the state mutation and the corresponding event(s) in one atomic database operation. In our case, we will leverage the ACID capabilities of a relational database, with one transaction that includes two write operations: one in the domain-specific table and another that persists the events in an outbox (supporting) table. This ensures a consistent state of both the domain and the corresponding events. This is shown in the next figure:

Transactional Outbox Pattern

Change Data Capture With Transaction Log Tailing

Once the events are available in the outbox table, we need a mechanism to detect new events stored in the outbox (Change Data Capture, or CDC) and publish them to external consumers (a Transaction Log Tailing Message Relay). The Message Relay is responsible for detecting (i.e., CDC) new events available in the outbox table and publishing them to external consumers via a message broker (e.g., RabbitMQ, SNS + SQS) or an event stream (e.g., Kafka, Kinesis).

There are different Change Data Capture techniques for the Message Relay to detect new events available in the outbox. In this article, we will use the log scanner approach, named Transaction Log Tailing by Chris Richardson, where the Message Relay tails the database transaction log to detect the new messages appended to the outbox table. I personally prefer this approach since it reduces the amount of manual work, but it might not be available for all databases. The next image illustrates how the Message Relay integrates with the Transactional Outbox:

Transactional Outbox in combination with Transaction Log Tailing CDC

One of the main goals of this solution is to ensure that the software engineers of the two bounded contexts only need to focus on the elements in orange in the diagram above; the grey components are just infrastructure elements that shall be transparent to developers. So, how do we implement the Transaction Log Tailing Message Relay?

Debezium

Debezium is a log-scanner-type change data capture solution that provides connectors for several databases, creating a stream of messages out of the changes detected in the database's transaction log. Debezium comes in two flavors:

- Debezium Server: The full-featured version of Debezium, which leverages Apache Kafka and Kafka Connectors to stream data from the source database to the target system.
- Debezium Embedded/Engine: The simplified version that can be embedded as a library in your product; it does not require an Apache Kafka service but still makes use of Kafka Connectors to detect changes in the data sources.

In this example, we will use Debezium Embedded due to its simplicity (i.e., no Kafka instance is needed), while it is still robust enough to provide a suitable solution. The first time a Debezium instance starts to track a database, it takes a snapshot of the current data to be used as a baseline; once completed, only the delta of changes from the latest stored offset is processed.

Debezium is highly configurable, making it possible to shape its behavior to meet different expectations, allowing you, for instance, to define:

- The database operations to be tracked (updates, inserts, deletions, schema modifications)
- The database tables to be tracked
- The offset backing store solution (in-memory, file-based, Kafka-based)
- The offset flush interval

Some of these properties will be analyzed later in the article.

All Pieces Together

The next image shows the overall solution from the deployment perspective:

- Services: In this example, we will use Spring Boot for building the Course Management and Rating System bounded contexts. Each bounded context will be deployed as a separate runtime.
- Persistence: The persistence solution for both services is PostgreSQL, with each service having a dedicated schema. PostgreSQL is deployed as a Docker container.
- Message Broker: For the message broker, we will use RabbitMQ, also running as a Docker container.
- Message Relay: Leveraging Spring Boot and Debezium Embedded, it provides the Change Data Capture (CDC) solution for detecting new records added to the outbox table of the Course Management service.

Show Me the Code

All the code described in this article can be found in my personal GitHub repository.

Overall Project Structure

The code provided for this example is structured as a multi-module Java Maven project, leveraging Spring Boot and following a hexagonal-architecture-like structure. There are three main package groups:

1. Toolkit Supporting Context

Modules providing shared components and infrastructure-related elements used by the functional bounded contexts (in this example, Course Management and Rating System). For instance, the transactional outbox and the Debezium-based change data capture are shared concerns, and therefore their code belongs to these modules:

- toolkit-core: Core classes and interfaces, to be used by the functional contexts
- toolkit-outbox-jpa-postgres: Implementation of the transactional outbox using JPA for Postgres
- toolkit-cdc-debezium-postgres: Implementation of the message relay as CDC with Debezium Embedded for Postgres
- toolkit-message-publisher-rabbitmq: Message publisher implementations for RabbitMQ
- toolkit-tx-spring: Provides programmatic transaction management with Spring
- toolkit-state-propagation-debezium-runtime: Runtime (service) responsible for the CDC and message publication to the RabbitMQ instance

2. Course Management Bounded Context

These modules make up the Course Management bounded context. The context adheres to hexagonal architecture principles, similar to the structure already used in my previous article about repository testing:

- course-management-domain: Module with the Course Management domain definition, including entities, value objects, domain events, ports, etc.; this module has no dependencies on frameworks, being as pure Java as possible.
- course-management-application: Following hexagonal architecture, this module orchestrates invocations to the domain model using commands and command handlers.
- course-management-repository-test: Contains the repository test definitions, with no dependencies on frameworks; it only verifies the expectations of the repository ports defined in the course-management-domain module.
- course-management-repository-jpa: JPA implementation of the repository interface CourseDefinitionRepository defined in the course-management-domain module; leverages Spring Boot JPA.
- course-management-repository-jpa-postgres: Specialization of the repository JPA implementation of the previous module, adding Postgres-specific concerns (e.g., Postgres database migration scripts).
- course-management-rest: Inbound web-based adapter exposing HTTP endpoints for course creation.
- course-management-runtime: Runtime (service) of the Course Management context.

3. Rating System Bounded Context

For the sake of simplicity, this bounded context is partially implemented, with only an inbound AMQP-based adapter for receiving the messages created by the Course Management service when a new course is created and published by the CDC service (toolkit-state-propagation-debezium-runtime):

- rating-system-amqp-listener: AMQP listener leveraging Spring Boot AMQP; subscribes to messages of the Course Management context.
- rating-system-domain: No domain has been defined for the Rating System context.
- rating-system-application: No application layer has been defined for the Rating System context.
- rating-system-runtime: Runtime (service) of the Rating System context, starting the AMQP listener defined in rating-system-amqp-listener.

Request Flow

This section outlines the flow of a request for creating a course, starting when the user requests the creation of a course via the backend API and finishing with the (consistent) publication of the corresponding event to the message broker. The flow is split into three phases.

Phase 1: State Mutation and Domain Events Creation

This phase starts with the request to create a new course definition. The HTTP POST request is mapped to a domain command and processed by its corresponding command handler defined in course-management-application.
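The inbound adapter performing this mapping lives in course-management-rest; the article does not reproduce it, so the sketch below is purely illustrative. The endpoint path, the request DTO, its toCommand() mapping, and the CommandBus dispatch method name are all assumptions, not necessarily those of the actual repository.

Java
// Hypothetical inbound adapter: maps the HTTP payload to the domain command.
// Endpoint path, CreateCourseDefinitionRequest, toCommand(), and
// CommandBus#dispatch are assumed names.
@RestController
@RequestMapping("/courses")
public class CourseDefinitionController {

    private final CommandBus commandBus;

    public CourseDefinitionController(CommandBus commandBus) {
        this.commandBus = commandBus;
    }

    @PostMapping
    public ResponseEntity<Void> createCourseDefinition(@RequestBody CreateCourseDefinitionRequest request) {
        // Build the CreateCourseDefinitionCommand from the request body and hand it
        // to the bus; the decorated command handler (shown below) picks it up.
        commandBus.dispatch(request.toCommand());
        return ResponseEntity.accepted().build();
    }
}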
Command handlers are automatically injected into the provided CommandBus implementation; in this example, the CommandBusInProcess defined in the toolkit-core module:

Java
import io.twba.tk.command.CommandHandler;
import jakarta.inject.Inject;
import jakarta.inject.Named;

@Named
public class CreateCourseDefinitionCommandHandler implements CommandHandler<CreateCourseDefinitionCommand> {

    private final CourseDefinitionRepository courseDefinitionRepository;

    @Inject
    public CreateCourseDefinitionCommandHandler(CourseDefinitionRepository courseDefinitionRepository) {
        this.courseDefinitionRepository = courseDefinitionRepository;
    }

    @Override
    public void handle(CreateCourseDefinitionCommand command) {
        if (!courseDefinitionRepository.existsCourseDefinitionWith(command.getTenantId(),
                command.getCourseDescription().title())) {
            courseDefinitionRepository.save(CourseDefinition.builder(command.getTenantId())
                    .withCourseDates(command.getCourseDates())
                    .withCourseDescription(command.getCourseDescription())
                    .withCourseObjective(command.getCourseObjective())
                    .withDuration(command.getCourseDuration())
                    .withTeacherId(command.getTeacherId())
                    .withPreRequirements(command.getPreRequirements())
                    .withCourseId(command.getCourseId())
                    .createNew());
        } else {
            throw new IllegalStateException("Course definition with value "
                    + command.getCourseDescription().title() + " already exists");
        }
    }
}

The command handler creates an instance of the CourseDefinition entity. The business logic and invariants (if any) of creating a course definition are encapsulated within the domain entity. The creation of a new instance of the domain entity also produces the corresponding CourseDefinitionCreated domain event:

Java
@Getter
public class CourseDefinition extends MultiTenantEntity {

    /* ... */

    public static class CourseDefinitionBuilder {

        /* ... */

        public CourseDefinition createNew() {
            // new instance, generate domain event
            CourseDefinition courseDefinition = new CourseDefinition(tenantId, 0L, id, courseDescription,
                    courseObjective, preRequirements, duration, teacherId, courseDates,
                    CourseStatus.PENDING_TO_REVIEW);
            var courseDefinitionCreatedEvent = CourseDefinitionCreatedEvent.triggeredFrom(courseDefinition);
            courseDefinition.record(courseDefinitionCreatedEvent); // record event in memory
            return courseDefinition;
        }
    }
}

The event is "recorded" into the created course definition instance. The record method is defined in the abstract class Entity of the toolkit-core module:

Java
public abstract class Entity extends ModelValidator implements ConcurrencyAware {

    @NotNull
    @Valid
    protected final List<@NotNull Event<? extends DomainEventPayload>> events;

    private Long version;

    public Entity(Long version) {
        this.events = new ArrayList<>();
        this.version = version;
    }

    /* ... */

    protected void record(Event<? extends DomainEventPayload> event) {
        event.setAggregateType(aggregateType());
        event.setAggregateId(aggregateId());
        event.setEventStreamVersion(Objects.isNull(version) ? 0 : version + events.size());
        this.events.add(event);
    }

    public List<Event<? extends DomainEventPayload>> getDomainEvents() {
        return Collections.unmodifiableList(events);
    }
}

Once the course definition instance is in place, the command handler persists the instance in the course definition repository, starting the second phase of the processing flow.
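To recap Phase 1, the snippet below is illustrative only: the variable values are hypothetical placeholders and the value-object construction is elided. It uses just the builder and getDomainEvents() APIs from the code above to show that, at this point, the event exists only in memory.

Java
// Illustrative only: tenantId, courseDescription, etc. are placeholders.
CourseDefinition courseDefinition = CourseDefinition.builder(tenantId)
        .withCourseDates(courseDates)
        .withCourseDescription(courseDescription)
        .withCourseObjective(courseObjective)
        .withDuration(courseDuration)
        .withTeacherId(teacherId)
        .withPreRequirements(preRequirements)
        .withCourseId(courseId)
        .createNew();

// createNew() has recorded the CourseDefinitionCreatedEvent in memory;
// nothing has been persisted or published yet.
courseDefinition.getDomainEvents()
        .forEach(event -> System.out.println(event.getClass().getSimpleName()));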
Phase 2: Persisting the State (Course Definition and Events in the Outbox)
Whenever a domain entity is saved in the repository, the domain events associated with the domain state mutation (in this example, the creation of a CourseDefinition entity) are temporarily appended to an in-memory, ThreadLocal buffer. This buffer resides in the DomainEventAppender of toolkit-core.
Java
public class DomainEventAppender {

    private final ThreadLocal<List<Event<? extends DomainEventPayload>>> eventsToPublish = new ThreadLocal<>();

    /*...*/

    public void append(List<Event<? extends DomainEventPayload>> events) {
        //add the events to the buffer; later they will be published to other bounded contexts
        if(isNull(eventsToPublish.get())) {
            resetBuffer();
        }
        //ensure event is not already in buffer
        events.stream().filter(this::notInBuffer).map(this::addEventSourceMetadata).forEach(event -> eventsToPublish.get().add(event));
    }

    /*...*/
}
Events are placed in this buffer by an aspect executed around methods annotated with AppendEvents. The pointcut and aspect (both in toolkit-core) look like:
Java
public class CrossPointcuts {
    @Pointcut("@annotation(io.twba.tk.core.AppendEvents)")
    public void shouldAppendEvents() {}
}

@Aspect
@Named
public class DomainEventAppenderConcern {

    private final DomainEventAppender domainEventAppender;

    @Inject
    public DomainEventAppenderConcern(DomainEventAppender domainEventAppender) {
        this.domainEventAppender = domainEventAppender;
    }

    @After(value = "io.twba.tk.aspects.CrossPointcuts.shouldAppendEvents()")
    public void appendEventsToBuffer(JoinPoint jp) {
        if(Entity.class.isAssignableFrom(jp.getArgs()[0].getClass())) {
            Entity entity = (Entity) jp.getArgs()[0];
            domainEventAppender.append(entity.getDomainEvents());
        }
    }
}
The command handlers are automatically decorated before being "injected" into the command bus. One of the decorators ensures the command handlers are transactional, and another ensures that the events in the in-memory, thread-local buffer are published to the outbox table consistently with the ongoing transaction. The next sequence diagram shows the decorators applied to the domain-specific command handler.
High-Level Processing Flow
The outbox is an append-only buffer (a Postgres table in this example) to which a new entry for each event is added. The outbox entry has the following structure:
Java
public record OutboxMessage(String uuid,
                            String header,
                            String payload,
                            String type,
                            long epoch,
                            String partitionKey,
                            String tenantId,
                            String correlationId,
                            String source,
                            String aggregateId) {
}
Here, the actual event payload is serialized as a JSON string in the payload property.
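The decorators themselves are not reproduced in this article; purely as an illustration of the idea, an outbox-publishing decorator could be sketched as follows. The class name, the eventsInBuffer() accessor on DomainEventAppender, and the OutboxMessageMapper are assumptions made for this sketch, and it assumes the decorator is itself a Spring-managed bean so that @Transactional applies:
Java
// Illustration only: not the toolkit's actual decorator code.
import io.twba.tk.command.CommandHandler;
import org.springframework.transaction.annotation.Transactional;

public class OutboxPublishingCommandHandler<C> implements CommandHandler<C> {

    private final CommandHandler<C> delegate;         // the domain command handler
    private final DomainEventAppender eventAppender;  // thread-local event buffer
    private final OutboxMessageMapper mapper;         // hypothetical Event -> OutboxMessage mapper
    private final Outbox outbox;

    public OutboxPublishingCommandHandler(CommandHandler<C> delegate,
                                          DomainEventAppender eventAppender,
                                          OutboxMessageMapper mapper,
                                          Outbox outbox) {
        this.delegate = delegate;
        this.eventAppender = eventAppender;
        this.mapper = mapper;
        this.outbox = outbox;
    }

    @Override
    @Transactional // state mutation and outbox writes share one transaction
    public void handle(C command) {
        delegate.handle(command); // mutates state and fills the event buffer
        eventAppender.eventsInBuffer() // assumed accessor for the buffered events
                .stream()
                .map(mapper::toOutboxMessage)
                .forEach(outbox::appendMessage);
    }
}
Whatever the exact shape of the real decorators, the important property is that the outbox writes happen inside the same transaction as the state mutation; this is what makes the relay step in Phase 3 safe.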
The outbox interface is straightforward:
Java
public interface Outbox {
    void appendMessage(OutboxMessage outboxMessage);
    int partitionFor(String partitionKey);
}
The Postgres implementation of the Outbox interface is placed in the toolkit-outbox-jpa-postgres module:
Java
public class OutboxJpa implements Outbox {

    private final OutboxProperties outboxProperties;
    private final OutboxMessageRepositoryJpaHelper helper;

    @Autowired
    public OutboxJpa(OutboxProperties outboxProperties, OutboxMessageRepositoryJpaHelper helper) {
        this.outboxProperties = outboxProperties;
        this.helper = helper;
    }

    @Override
    public void appendMessage(OutboxMessage outboxMessage) {
        helper.save(toJpa(outboxMessage));
    }

    @Override
    public int partitionFor(String partitionKey) {
        //floorMod keeps the partition non-negative even for negative hash values
        return Math.floorMod(MurmurHash3.hash32x86(partitionKey.getBytes()), outboxProperties.getNumPartitions());
    }

    /*...*/
}
Phase 2 is now complete: both the domain state and the corresponding event are consistently persisted under the same transaction in our Postgres database.
Phase 3: Events Publication
To make the domain events generated in the previous phase available to external consumers, the message relay implementation, based on Debezium Embedded, monitors the outbox table so that whenever a new record is added to the outbox, the message relay creates a Cloud Event and publishes it to the RabbitMQ instance following the Cloud Event AMQP binding specification. The following code snippet shows the Debezium-based implementation of the MessageRelay interface defined in the toolkit-core module:
Java
public class DebeziumMessageRelay implements MessageRelay {

    private static final Logger log = LoggerFactory.getLogger(DebeziumMessageRelay.class);

    private final Executor executor = Executors.newSingleThreadExecutor(r -> new Thread(r, "debezium-message-relay"));
    private final CdcRecordChangeConsumer recordChangeConsumer;
    private final DebeziumEngine<RecordChangeEvent<SourceRecord>> debeziumEngine;

    public DebeziumMessageRelay(DebeziumProperties debeziumProperties, CdcRecordChangeConsumer recordChangeConsumer) {
        this.debeziumEngine = DebeziumEngine.create(ChangeEventFormat.of(Connect.class))
                .using(DebeziumConfigurationProvider.outboxConnectorConfig(debeziumProperties).asProperties())
                .notifying(this::handleChangeEvent)
                .build();
        this.recordChangeConsumer = recordChangeConsumer;
    }

    private void handleChangeEvent(RecordChangeEvent<SourceRecord> sourceRecordRecordChangeEvent) {
        SourceRecord sourceRecord = sourceRecordRecordChangeEvent.record();
        Struct sourceRecordChangeValue = (Struct) sourceRecord.value();
        log.info("Received record - Key = '{}' value = '{}'", sourceRecord.key(), sourceRecord.value());
        Struct struct = (Struct) sourceRecordChangeValue.get(AFTER);
        recordChangeConsumer.accept(DebeziumCdcRecord.of(struct));
    }

    @Override
    public void start() {
        this.executor.execute(debeziumEngine);
        log.info("Debezium CDC started");
    }

    @Override
    public void stop() throws IOException {
        if (this.debeziumEngine != null) {
            this.debeziumEngine.close();
        }
    }

    @Override
    public void close() throws Exception {
        stop();
    }
}
As can be seen in the code snippet above, the DebeziumEngine is configured to notify the private method handleChangeEvent when a change in the database is detected. In this method, a Consumer of CdcRecord is used as a wrapper of the internal Debezium model represented by the Struct class.
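Such a wrapper can be very thin. The following is a minimal sketch; the repository's actual DebeziumCdcRecord and CdcRecord definitions are not shown in this article, so the interface shape here is inferred from the generic valueOf(...) calls used later on:
Java
import org.apache.kafka.connect.data.Struct;

// Interface shape inferred from usage (e.g., cdcRecord.valueOf("payload"))
interface CdcRecord {
    <T> T valueOf(String fieldName);
}

// Minimal sketch of a Struct-backed implementation; illustrative only
public class DebeziumCdcRecord implements CdcRecord {

    private final Struct struct;

    private DebeziumCdcRecord(Struct struct) {
        this.struct = struct;
    }

    public static DebeziumCdcRecord of(Struct struct) {
        return new DebeziumCdcRecord(struct);
    }

    @Override
    @SuppressWarnings("unchecked")
    public <T> T valueOf(String fieldName) {
        // Delegates to the Kafka Connect Struct; the caller knows the expected type
        return (T) struct.get(fieldName);
    }
}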
Initial configuration must be provided to the Debezium Engine; in the example, this is done with the DebeziumConfigurationProvider class:
Java
public class DebeziumConfigurationProvider {

    public static io.debezium.config.Configuration outboxConnectorConfig(DebeziumProperties properties) {
        return withCustomProps(withStorageProps(io.debezium.config.Configuration.create()
                        .with("name", "outbox-connector")
                        .with("connector.class", properties.getConnectorClass())
                        .with("offset.storage", properties.getOffsetStorage().getType()), properties.getOffsetStorage())
                        .with("offset.flush.interval.ms", properties.getOffsetStorage().getFlushInterval())
                        .with("database.hostname", properties.getSourceDatabaseProperties().getHostname())
                        .with("database.port", properties.getSourceDatabaseProperties().getPort())
                        .with("database.user", properties.getSourceDatabaseProperties().getUser())
                        .with("database.password", properties.getSourceDatabaseProperties().getPassword())
                        .with("database.dbname", properties.getSourceDatabaseProperties().getDbName()), properties)
                .with("database.server.id", properties.getSourceDatabaseProperties().getServerId())
                .with("database.server.name", properties.getSourceDatabaseProperties().getServerName())
                .with("skipped.operations", "u,d,t")
                .with("include.schema.changes", "false")
                .with("table.include.list", properties.getSourceDatabaseProperties().getOutboxTable())
                .with("snapshot.include.collection.list", properties.getSourceDatabaseProperties().getOutboxTable())
                .build();
    }

    private static Configuration.Builder withStorageProps(Configuration.Builder debeziumConfigBuilder, DebeziumProperties.OffsetProperties offsetProperties) {
        offsetProperties.getOffsetProps().forEach(debeziumConfigBuilder::with);
        return debeziumConfigBuilder;
    }

    private static Configuration.Builder withCustomProps(Configuration.Builder debeziumConfigBuilder, DebeziumProperties debeziumProperties) {
        debeziumProperties.getCustomProps().forEach(debeziumConfigBuilder::with);
        return debeziumConfigBuilder;
    }
}
The most relevant properties are outlined below:
connector.class: The name of the connector to use; usually this is related to the database technology being tracked for changes. In this example, we are using io.debezium.connector.postgresql.PostgresConnector.
offset.storage: Type of storage for maintaining the offset; in this example, we are using org.apache.kafka.connect.storage.MemoryOffsetBackingStore, so offsets are lost after restarting the service (see below).
offset.flush.interval.ms: Number of milliseconds for the offsets to be persisted in the offset store.
database.*: These properties refer to the database being tracked (CDC) for changes by Debezium.
skipped.operations: If not specified, all the operations will be tracked. For our example, since we only want to detect newly created events, all the operations but inserts are skipped.
table.include.list: List of the tables to include for the CDC; in this example, only the table where events are stored during Phase 2 is relevant (i.e., outbox_schema.outbox).
The first thing Debezium will do after starting is take a snapshot of the current data and generate the corresponding events. After that, the offset is updated to the latest record, and the deltas (newly added, updated, or deleted records) are processed, updating the offset accordingly. Since the provided example uses an in-memory offset store, the snapshot is always performed after starting the service.
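For reference, Debezium Embedded also ships durable offset stores; switching to a file-based one is a matter of configuration. The following is a minimal sketch using standard Debezium Embedded property keys; the helper class and the file path are illustrative placeholders, not part of the repository:
Java
import io.debezium.config.Configuration;

// Sketch: swapping the in-memory offset store for Debezium's file-based one,
// so offsets survive restarts and the initial snapshot is not repeated.
public class DurableOffsetStoreConfig {

    static Configuration.Builder withFileBackedOffsets(Configuration.Builder builder) {
        return builder
                .with("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore")
                .with("offset.storage.file.filename", "/var/lib/message-relay/offsets.dat")
                .with("offset.flush.interval.ms", "3000");
    }
}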
Therefore, since this is not yet a production-ready implementation, there are two options:
Use a durable offset store (file-based, as sketched above, or Kafka-based; both are supported by Debezium Embedded)
Delete the processed entries from the outbox table once the events have been relayed, ensuring the delete operations are skipped by Debezium in its configuration
The Message Relay is configured and initialized in the toolkit-state-propagation-debezium-runtime module. The values of the configuration properties needed by Debezium Embedded are defined in the Spring Boot properties.yaml file:
YAML
server:
  port: 9091
management:
  endpoints:
    web:
      exposure:
        include: prometheus, health, flyway, info
debezium:
  connector-class: "io.debezium.connector.postgresql.PostgresConnector"
  custom-props:
    "[topic.prefix]": "embedded-debezium"
    "[debezium.source.plugin.name]": "pgoutput"
    "[plugin.name]": "pgoutput"
  source-database-properties:
    db-name: "${CDC_DB_NAME}"
    hostname: "${CDC_HOST}"
    user: "${CDC_DB_USER}"
    password: "${CDC_DB_PASSWORD}"
    port: 5432
    server-name: "debezium-message-relay"
    server-id: "debezium-message-relay-1"
    outbox-table: "${CDC_OUTBOX_TABLE}:outbox_schema.outbox"
    outbox-schema: ""
  offset-storage:
    type: "org.apache.kafka.connect.storage.MemoryOffsetBackingStore"
    flush-interval: 3000
    offset-props:
      "[offset.flush.timeout.ms]": 1000
      "[max.poll.records]": 1000
The engine is started using the Spring Boot lifecycle events:
Java
@Component
public class MessageRelayInitializer implements ApplicationListener<ApplicationReadyEvent> {

    private final MessageRelay messageRelay;

    @Autowired
    public MessageRelayInitializer(MessageRelay messageRelay) {
        this.messageRelay = messageRelay;
    }

    @Override
    public void onApplicationEvent(ApplicationReadyEvent event) {
        messageRelay.start();
    }
}
The change data capture records (CdcRecord) are processed by the CloudEventRecordChangeConsumer, which creates the Cloud Event representation of the CDC record and publishes it through the MessagePublisher.
Java
public class CloudEventRecordChangeConsumer implements CdcRecordChangeConsumer {

    /*...*/

    @Override
    public void accept(CdcRecord cdcRecord) {
        final CloudEvent event;
        try {
            String payload = cdcRecord.valueOf("payload");
            String uuid = cdcRecord.valueOf("uuid");
            String type = cdcRecord.valueOf("type");
            String tenantId = cdcRecord.valueOf("tenant_id");
            String aggregateId = cdcRecord.valueOf("aggregate_id");
            long epoch = cdcRecord.valueOf("epoch");
            String partitionKey = cdcRecord.valueOf("partition_key");
            String source = cdcRecord.valueOf("source");
            String correlationId = cdcRecord.valueOf("correlation_id");
            event = new CloudEventBuilder()
                    .withId(uuid)
                    .withType(type)
                    .withSubject(aggregateId)
                    .withExtension(TwbaCloudEvent.CLOUD_EVENT_TENANT_ID, tenantId)
                    .withExtension(TwbaCloudEvent.CLOUD_EVENT_TIMESTAMP, epoch)
                    .withExtension(TwbaCloudEvent.CLOUD_EVENT_PARTITION_KEY, partitionKey)
                    .withExtension(TwbaCloudEvent.CLOUD_EVENT_CORRELATION_ID, correlationId)
                    .withExtension(TwbaCloudEvent.CLOUD_EVENT_GENERATING_APP_NAME, source)
                    .withSource(URI.create("https://thewhiteboardarchitect.com/" + source))
                    .withData("application/json", payload.getBytes("UTF-8"))
                    .build();
            messagePublisher.publish(event);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
The provided MessagePublisher is a simple RabbitMQ outbound adapter converting the Cloud Event to the corresponding AMQP message as per the Cloud Event AMQP protocol binding.
Java
public class MessagePublisherRabbitMq implements MessagePublisher {

    /*...*/

    @Override
    public boolean publish(CloudEvent dispatchedMessage) {
        MessageProperties messageProperties = new MessageProperties();
        messageProperties.setContentType(MessageProperties.CONTENT_TYPE_JSON);
        rabbitTemplate.send("__MR__" + dispatchedMessage.getExtension(CLOUD_EVENT_GENERATING_APP_NAME), //custom extension for message routing
                dispatchedMessage.getType(),
                toAmqpMessage(dispatchedMessage));
        return true;
    }

    private static Message toAmqpMessage(CloudEvent dispatchedMessage) {
        return MessageBuilder.withBody(Objects.nonNull(dispatchedMessage.getData()) ? dispatchedMessage.getData().toBytes() : new byte[0])
                .setContentType(MessageProperties.CONTENT_TYPE_JSON)
                .setMessageId(dispatchedMessage.getId())
                .setHeader(CLOUD_EVENT_AMQP_BINDING_PREFIX + CLOUD_EVENT_TENANT_ID, dispatchedMessage.getExtension(CLOUD_EVENT_TENANT_ID))
                .setHeader(CLOUD_EVENT_AMQP_BINDING_PREFIX + CLOUD_EVENT_TIMESTAMP, dispatchedMessage.getExtension(CLOUD_EVENT_TIMESTAMP))
                .setHeader(CLOUD_EVENT_AMQP_BINDING_PREFIX + CLOUD_EVENT_PARTITION_KEY, dispatchedMessage.getExtension(CLOUD_EVENT_PARTITION_KEY))
                .setHeader(CLOUD_EVENT_AMQP_BINDING_PREFIX + CLOUD_EVENT_SUBJECT, dispatchedMessage.getSubject())
                .setHeader(CLOUD_EVENT_AMQP_BINDING_PREFIX + CLOUD_EVENT_SOURCE, dispatchedMessage.getSource().toString())
                .build();
    }
}
After Phase 3, the events are published to the producer's (Course Management service) message relay RabbitMQ exchange, defined under the convention __MR__<APP_NAME>; in our example, __MR__course-management. Messages are routed to the right exchange based on a custom Cloud Event extension, as shown in the previous code snippet. Visit my GitHub repository and check the readme file to see how to spin up the example.
Alternative Solutions
This example makes use of Debezium Embedded to provide a change data capture and message relay solution. This works fine for technologies supported by Debezium through its connectors. For non-supported providers, alternative approaches can be applied:
DynamoDB Streams: Suitable for DynamoDB databases; in combination with Kinesis, it can be used to subscribe to changes in a DynamoDB (outbox) table
Custom database polling: This could be implemented to support databases that have no Debezium connectors
Adding any of these alternatives to this example would simply mean providing specific implementations of the MessageRelay interface, without additional changes to any of the other services.
Conclusion
Ensuring consistency in the state propagation and data exchange between services is key to providing a reliable distributed solution. Usually, this is not carefully considered when designing distributed, event-driven software, leading to undesired states and situations, especially once those systems are in production. By combining the transactional outbox pattern with a message relay, we have seen how this consistency can be enforced; and by using the hexagonal architecture style, the additional complexity of implementing those patterns can be hidden and reused across bounded-context implementations. The code of this article is not yet production-ready, as concerns like observability, retries, scalability (e.g., with partitioning), and proper container orchestration are still pending. Subsequent articles will go through those concerns using the provided code as a basis.
This article is a high-level overview introducing some often overlooked concepts and the need for productive functional and/or integration tests. Modern development should be founded on the following:
Iterating with fast feedback
Eliminating waste
Amplifying value
These principles are based on lean software development. Allen Holub tweeted:
"Agile:
Work small
Talk to each other
Make people’s lives better
That’s it. All that other junk is a distraction."
Here’s a link to the original tweet. Here’s more information.
What’s the Problem?
How can we effectively fulfill the above if we don’t test productively? We should be focusing on delivering value; however, most companies and projects don’t. In the ideal world, your whole company would be aligned with Agile to facilitate fast feedback and quick iterations. This very concept is hard for some to imagine; maybe you think that you are already doing this? To paraphrase Allen Holub: we all have world views. One person’s world view could be an agile one because they have worked in a startup; another person, having worked in a non-tech-savvy organization, couldn’t imagine what agile looks like. When it comes to testing, many people seem to have the world view that hard-to-maintain tests are the norm and acceptable. In my experience, the major culprits are BDD frameworks that are based on text feature files. This is amplifying waste. In theory, the extra feature file layer:
Allows the user to swap out the language at a later date
Allows a business person to write user stories and/or acceptance criteria
Allows a business person to read the user stories and/or acceptance criteria
Enables collaboration
Etc.
You have actually added more complexity than you think, for little benefit. I am explicitly critiquing the approach of writing the extra feature file layer first, not the benefits of BDD as a concept. You test more efficiently, and with better results, by not writing the feature file layer, as with Smart BDD, where it’s generated by the code. Here I compare the complexities and differences between Cucumber and Smart BDD. In summary, and in reality, the feature file layer:
Doesn’t help with swapping out the language. Both language implementations and maintenance would have diminishing returns.
Doesn’t help with a business person writing user stories and/or acceptance criteria because, as the article above shows in more detail, the language you use is actually determined by the framework and its limitations. Therefore, it would need to be re-worded to some degree, hence a feature file isn’t helping.
Doesn’t help with a business person reading the user stories and/or acceptance criteria. Usually, feature files are badly written due to the complexity of the BDD framework; therefore, a business person in reality is more likely to ask a developer how the system behaves. Smart BDD generates more concise and consistent documentation.
Does help with collaboration, which is a major benefit.
What’s the Best Tool for the Job?
If you want to be productive, you might seek the best tool for the job, but what is “the best tool for the job”? It’s retrospective; you don’t know upfront. Using an extra framework alongside your existing unit test framework could be wasteful. It’s better to learn one testing framework well. For example, Cucumber is evidently hard to master, leading to poor-quality tests that take longer and longer to maintain. You can easily underestimate the future complexities, especially with frameworks that make you write feature files first.
You don’t plan to write poor-quality tests; it’s something that could happen over time. You don’t want to spend extra time battling a framework. You could have the worst of all worlds and use a feature-file-driven framework with a language you’re not proficient in. If you’re a Java developer using Ruby or TypeScript, etc., for tests, this could also lead to poor quality and less productivity. I’m suggesting that if you’re a Java developer, Smart BDD would be the closest to your main skill set (the least friction), and it tries its hardest to promote good practices. You do less and get more out!
Test Based on Your Project's Needs
If you’ve heard of the testing pyramid, you can use it as a reference, but do what works for you. You don’t have to do more unit tests and less functional testing just because that’s the shape of a triangle. You need to align your culture by asking: what value does something provide? Aim for what works for your team, not something designed by a committee. The number of unit tests or the coverage is an output, not an outcome. Aiming for some percentage of test coverage is amplifying waste. With unit tests, TDD is about code quality: it drives the architecture, and coverage is a side effect. Higher-quality code is better for agility. Where do Smart BDD and/or other productive testing frameworks fit in? If you are going to test, it’s best to test first, as it’s more work to test later, and you’ll miss many of the benefits of testing first. With any new feature and/or requirement, you should generally start outside-in. If you start with the functional tests, you start to better understand the requirements and the features that you are delivering. A lot of software development is about learning and communication. Once you have validated that you are working on the right thing, and you’ve increased your understanding of the problem you’re solving, you can work on the code, ideally using TDD. Next is to get the feature into the hands of your client and/or the next best available person for feedback. Obviously, feedback is feedback and not future requirements. Feedback from a client is not a silver bullet. It could be used with metrics; for example, was the new feature used as expected? Think of this process as learning, or even better, validated learning if you can prove something. You should strive to solve the problem: how can you get meaningful feedback as soon as possible? It’s a red flag when you spend too long writing and maintaining functional tests. For example, if you use well-designed defaults and builders, you can really reduce the amount of work required to create and maintain functional tests. You also want the test framework to be smart, for example:
Create interaction diagrams
Test boundary cases
Produce metrics like timers
Give insights
And many more
At the heart of this is specifying intent and not implementing it many times over. I think the industry in general is moving in the direction of declaring intent over implementing details. By using best practices, you’ll get better at testing/documenting behavior at an appropriate level and not making the tests/documentation obfuscated with irrelevant data.
Conclusion
Culture is hugely important. I’m sure we, our bosses, and our senior leaders would all ultimately agree with the following:
For more value, you need more feedback and less waste
For more feedback, you need more value and less waste
For less waste, you need more value and more feedback
However, most work culture is not aligned with this.
Agreeing with something and having the culture are very different; it’s the difference between agreeing that eating healthily and exercising are good for you and actually doing something about it. The next level in the healthy analogy is having friends or a partner who are similarly minded. Is your company metaphorically encouraging a healthy lifestyle, or just agreeing that being healthy makes sense? Culture drives your mindset, behavior, and processes. This has been a very brief introduction; hopefully, you’ll now think about amplifying value. Thanks for reading, and please have the courage to do more or less of something and introduce change where needed.
Back in 2018, I decided to use my free time to help modernize a family member’s business. Along the way, I wanted to gain some experience and understanding of AWS. Ultimately, I discovered that nearly all my free time was spent learning AWS cloud infrastructure concepts. I had only a small fraction of my time left to focus on building the modern, cloud-based solution that I originally had in mind. As I planned more feature requests for the app, I realized that I needed a better approach. In early 2020, I discovered Heroku. Since I didn’t need to worry about underlying cloud configurations, I could focus my time on adding new features. The Heroku ecosystem worked out great for my simple use case, but I began to wonder about more complex use cases. What about the scenario where a collection of secure and private services needs to interact with one another for a payment processing solution? Would this use case force me to live in the ecosystem of one of the big three cloud service providers? I’m going to find out. The Current State of Secure Microservices For several years, I was fortunate to work in an environment that valued the DevOps lifecycle. The DevOps team handled all things cloud for me, so I could focus on architecting and building microservices to meet my customers’ needs. During that time of my life, this environment was the exception, not the norm. I just did a search for “companies lacking cloud infrastructure knowledge” in my browser, and the results yielded some pretty surprising conclusions: There is a serious shortage of cloud expertise. The lack of cloud skills is leading to significant performance impacts with cloud-native services. Cloud security is a challenge for 1 in 4 companies. The top search results talked about a lack of understanding of the core cloud concepts and the need for crucial training for teams to be effective. The training most teams need usually falls by the wayside as customer demands and deliverables take higher priority. With this current approach, most cloud implementations are forced to move at a slower pace, and they’re often exposed to unknown vulnerabilities. The current state of securing your microservices in the cloud is not a happy one. The Ideal State for Secure Microservices The ideal state for cloud-native solutions would adhere to a personal mission statement I established several years ago: “Focus your time on delivering features/functionality that extends the value of your intellectual property. Leverage frameworks, products, and services for everything else.” – J. Vester In this context, those with a directive to drive toward cloud-native solutions should be able to move at a pace that aligns with corporate objectives. They shouldn’t be slowed by the learning curve associated with underlying cloud infrastructure. So, what does this look like when we’re facing a cloud solution encompassing multiple microservices, all of which need to be isolated from the public and adhere to compliance regulations (like SOC, ISO, PCI, or HIPAA)? About Private Spaces My 2020 Heroku experience was positive. So I wanted to see how it would work with this complex use case. That’s when I discovered Private Spaces. Private Spaces are available as a part of Heroku Enterprise. They’re dedicated environments for running microservices within an isolated network. This approach allows teams to deploy their services into a network that’s not exposed to the public internet. Under the hood, these services function exactly the same as in my basic use case. 
I can set them up through the Heroku CLI, and simple Git-based commands can trigger deployments. For the regulatory compliance needs, I can lean on Heroku Shield to help me comply with PCI DSS, HIPAA, ISO (27001, 27017, and 27018), and SOC (1, 2, and 3). At a high level, Heroku lets me implement a secure cloud-native design that can be illustrated like this: Here, we have an implementation that leverages Heroku Shield, all within a Private Space. This allows a collection of microservices — utilizing several different programming languages — to interact with all the major primary and secondary card networks, all while adhering to various regulatory compliance requirements. Additionally, I get secure communications with the Salesforce platform and GitLab.
Heroku in Action
Using the Heroku CLI, I can get my Private Space and Heroku Shield up and running. In Heroku, this is called a Shield Private Space. Here are some high-level examples to work through the process. To create a new Shield Private Space, we use spaces:create and add the --shield option.
Shell
$ heroku spaces:create payment-network --shield --team payments-team --region oregon
Creating space payment-network in team payments-team... done
=== payment-network
Team:   payments-team
Region: oregon
State:  allocating
If the use case requires Classless Inter-Domain Routing (CIDR) ranges, I can use the --cidr and --data-cidr flags. You’ll notice that I created my Private Space in the oregon region. You can create a Private Space in one of 10 available regions (in the U.S., Europe, Asia, and Australia). For a list of available regions, do the following:
Shell
$ heroku regions
ID         Location                 Runtime
─────────  ───────────────────────  ──────────────
eu         Europe                   Common Runtime
us         United States            Common Runtime
dublin     Dublin, Ireland          Private Spaces
frankfurt  Frankfurt, Germany       Private Spaces
london     London, United Kingdom   Private Spaces
montreal   Montreal, Canada         Private Spaces
mumbai     Mumbai, India            Private Spaces
oregon     Oregon, United States    Private Spaces
singapore  Singapore                Private Spaces
sydney     Sydney, Australia        Private Spaces
tokyo      Tokyo, Japan             Private Spaces
virginia   Virginia, United States  Private Spaces
For each microservice that needs to run in the payment-network Private Space, I simply add the --space option when running the apps:create command:
Shell
$ heroku apps:create clearing-service --space payment-network
Creating app... done, clearing-service
To grant consumers access to the payment-network space, I can maintain the allowed list of trusted IPs:
Shell
$ heroku trusted-ips:add 192.0.2.128/26 --space payment-network
Added 192.0.2.128/26 to trusted IP ranges on payment-network
- WARNING: It may take a few moments for the changes to take effect.
Conclusion
Teams are often given a directive from above to adopt a cloud-native approach. However, many teams have a serious gap in understanding when it comes to deploying secure cloud architectures. If you’re using one of the big three cloud providers, then bridging this gap will come at a price—likely missed timelines expected by the product owner. Is there a better option for secure cloud deployment? I think Private Spaces combined with Heroku Shield represents a better option. For me personally, it also matters that Heroku is part of the solutions platform from Salesforce, which has a history of dedication to providing a cloud adoption alternative focused on the success of its customers. So I felt like this was a long-term strategy for consideration. Have a really great day!
An encrypted home directory is typically used to protect a user's personal data by encrypting the contents of their home directory. This directory is only decrypted and mounted when the user logs in, providing an extra layer of security. To create a new user with an encrypted home directory, you can use the following command:
Shell
adduser --encrypt-home username
After logging in to the host system, the user must mount the encrypted home directory through an explicit action; until then, the home directory contains only the following placeholder files:
Shell
Access-Your-Private-Data.desktop  README.txt
However, this encryption can pose challenges for cron jobs that need to access files within the home directory, especially if these jobs are supposed to run when the user is not logged in.
What Is the Issue With Cron Jobs Now?
Cron jobs allow tasks to be executed at scheduled times. These tasks can be defined on a system-wide basis or per user. To edit, create, or delete cron jobs, you can use the following command:
Shell
crontab -e
User-specific cron jobs typically operate on files in the user's home directory, which, if encrypted, might not be accessible when the cron job is supposed to run.
Solutions for Running Cron Jobs With Encrypted Home Directories
System-Wide Cron Jobs
One effective solution is to use system-wide cron jobs. These are defined in files like /etc/crontab or /etc/cron.d/ and can run as any specified user. Since these cron jobs are defined and stored outside any individual user's home directory, the job definitions themselves are not affected by encryption.
Example
Create a script: Place your script in a non-encrypted directory, such as /usr/local/bin/. For example, create a script named backup.sh to back up your home directory:
Shell
#!/bin/bash
tar -czf /backup/home_backup.tar.gz /home/username/
Ensure the script is executable:
Shell
sudo chmod +x /usr/local/bin/backup.sh
Define the cron job: Edit the system-wide crontab file /etc/crontab to schedule your job (unlike a per-user crontab, it takes a user field before the command):
Shell
sudo nano /etc/crontab
Add the following line to run the script daily at 2 AM:
Shell
0 2 * * * username /usr/local/bin/backup.sh
User-Specific Cron Jobs
Another effective way is to use user-specific cron jobs. If you need to run cron jobs as a specific user and access files within the encrypted home directory, there are several strategies you can employ:
Ensure the home directory is mounted: Make sure the encrypted home directory is mounted and accessible before the cron job runs. This typically means the user needs to be logged in.
Handle decryption securely: If handling decryption within a script, use tools like ecryptfs-unwrap-passphrase carefully. Ensure that passphrases and sensitive data are handled securely.
Delayed job scheduling: Schedule cron jobs to run at times when the user is likely to be logged in, ensuring the home directory is decrypted.
Using @reboot: The @reboot cron directive runs a script at system startup. This can set up necessary environment variables or mount points before the user logs in.
Example
Using @reboot, create a script that performs the necessary tasks:
Shell
#!/bin/bash
# Script to run at system startup
# Ensure environment is set up
/usr/local/bin/your_startup_script.sh
Add the cron job to run at reboot:
Shell
crontab -e
Add the following line:
Shell
@reboot /usr/local/bin/your_startup_script.sh
Cron Jobs and Malware Protection
Now, let us consider how to use a cron job that executes a malware scanner on an encrypted home directory. ClamAV (Clam AntiVirus) is a popular open-source antivirus engine used to detect malware. clamscan is the command-line scanner component of ClamAV.
To set up a cron job to run clamscan regularly on an encrypted home directory, you can follow these steps. First, ensure that ClamAV is installed on your system. On most Linux distributions, you can install it using the package manager.
Shell
sudo apt-get update
sudo apt-get install clamav clamav-daemon
Before running a scan, update the virus definitions. This can be done using the freshclam command:
Shell
sudo freshclam
Create a script that runs clamscan and place it in a non-encrypted directory. Create a script named scan_home.sh in /usr/local/bin/:
Shell
sudo nano /usr/local/bin/scan_home.sh
Add the following content to the script:
Shell
#!/bin/bash

# Directory to scan
SCAN_DIR="/home/username"

# Log file
LOG_FILE="/var/log/clamav/scan_log.txt"

# Run clamscan recursively and log the results
clamscan -r "$SCAN_DIR" --log="$LOG_FILE"
Make the script executable:
Shell
sudo chmod +x /usr/local/bin/scan_home.sh
Edit the system-wide crontab to schedule the scan. Open the crontab file with:
Shell
sudo crontab -e
Add the following line to schedule the script to run, for example, daily at 3 AM:
Shell
0 3 * * * /usr/local/bin/scan_home.sh
Additional Considerations
Handling encrypted home directories: If your home directory is encrypted and you want to ensure the scan runs when the directory is accessible, schedule the cron job at a time when the user is typically logged in, or use a system-wide cron job as shown above.
Log rotation: Ensure that the log file does not grow indefinitely. You can manage this using log rotation tools like logrotate.
Email alerts: Optionally, configure the script to send email alerts if malware is found. This requires an MTA (Mail Transfer Agent) like sendmail or postfix.
Example
As a last example, let us take a look at a cron job with a script that sends email notifications. Here's an enhanced version of the script that sends an email if malware is detected. Edit scan_home.sh:
Shell
sudo nano /usr/local/bin/scan_home.sh
Add the following content:
Shell
#!/bin/bash

# Directory to scan
SCAN_DIR="/home/username"

# Log file
LOG_FILE="/var/log/clamav/scan_log.txt"

# Email address for alerts
EMAIL="user@example.com"

# Run clamscan recursively and log the results
clamscan -r "$SCAN_DIR" --log="$LOG_FILE"

# Send an alert if the scan summary reports at least one infected file
if grep -q "Infected files: [1-9]" "$LOG_FILE"; then
    mail -s "ClamAV Malware Alert" "$EMAIL" < "$LOG_FILE"
fi
Ensure that the script is executable:
Shell
sudo chmod +x /usr/local/bin/scan_home.sh
Add the cron job:
Shell
sudo crontab -e
Schedule the job, for example, daily at 3 AM:
Shell
0 3 * * * /usr/local/bin/scan_home.sh
Conclusion
Permissions: Ensure that the cron job and scripts have the correct permissions and that the user running the job has the necessary access rights.
Security: Be cautious when handling passphrases and sensitive data in scripts to avoid compromising security.
Testing: Thoroughly test your cron jobs to ensure they function as expected, particularly in the context of encrypted home directories.
By following these guidelines, you can effectively manage cron jobs on Linux systems with encrypted home directories, ensuring your automated tasks run smoothly and securely. You can also set up a cron job to run clamscan regularly, ensuring your system is scanned for malware even if your home directory is encrypted. Adjust the scan time and log handling as needed to fit your environment and usage patterns. If you do not like clamscan, there are several alternatives for scanning for malware on a Linux system. One popular alternative is Lynis, a security auditing tool for Unix-based systems.
It can be used to scan for security issues, including malware. Another alternative to clamscan for malware scanning on a Linux system is Chkrootkit. In both cases, the setup of the cron job is the same.
TL;DR: Product Owner and Scrum Master? Combining the roles of Product Owner and Scrum Master in one individual is a contentious topic in the Agile community. A recent LinkedIn poll (see below) revealed that 54% of respondents consider this unification useless, while 30% might accept it in rare moments. This blog post explores the implications of merging these roles, emphasizing the importance of distinct responsibilities and the potential pitfalls of combining them. We also consider exceptions where this approach might be temporarily justified and analyze the insightful comments from industry professionals. The LinkedIn Poll: Could the Product Owner and Scrum Master Be the Same Individual? On May 23, 2024, I asked a simple question: Could the Product Owner and Scrum Master be the same individual? Or is mixing roles disadvantageous? Agile puts a lot of emphasis on focus. How come then that so often practitioners are asked — or expected — to cover for two roles simultaneously? Penny-pinching or smart move from a holistic perspective? Referring to the comments, the majority strongly opposes combining the Product Owner and Scrum Master roles, citing significant differences in responsibilities and the need for checks and balances. Conditional acceptance is noted mainly in startup contexts with resource constraints. Some are open to exceptions but remain cautious about long-term viability. Personal experiences highlight the challenges and potential conflicts, while flexible approaches are suggested for specific contexts. We can identify five categories among the comments: 1. Strict Opposition: Fundamental Differences in Roles The Product Owner and Scrum Master roles have distinct responsibilities, requiring full-time attention and unique skill sets. Combining them can lead to neglect and conflict of interest and undermine the healthy tension that balances product goals with team capacities. The roles act as checks and balances, ensuring ambitious goals and realistic execution. 2. Conditional Acceptance: Resource Constraints in Startups In resource-limited situations, such as startups, combining roles may be necessary due to budget constraints. However, this should be a temporary solution until the organization can afford to separate the roles. 3. Skeptical But Open to Exceptions: Specific Contexts and Temporary Solutions While generally inadvisable, combining roles might be feasible in exceptional circumstances, such as during temporary absences or in small teams, provided there is clear role differentiation and support. 4. Experiential Insights: Personal Experience Individuals with personal experience managing both roles or observing this practice often find it problematic due to inherent conflicts of interest and the heavy workload. 5. Pragmatic and Flexible Approaches: Practical Solutions Some suggest rotating the Scrum Master role among team members or having a Developer take on the role to balance responsibilities. Understanding Agile principles and maintaining flexibility in role management can help mitigate potential issues. Ten Reasons Why Combining Product Owner and Scrum Master Roles Is Not a Good Idea What other reasons might there be to question the idea of unifying Product Owner and Scrum Master roles? Let’s have a look: 1. Conflict of Interest Combining the roles of Product Owner (PO) and Scrum Master (SM) creates a conflict of interest. The PO maximizes the product’s value, often requiring prioritization and tough trade-offs. 
The SM ensures Scrum practices are followed, fostering a healthy team environment. Combining these roles compromises both priorities, reducing objectivity and effectiveness. 2. Loss of Focus Each role demands full attention to be effective. The PO must stay engaged with stakeholders, market trends, and the Product Backlog while creating alignment with their teammates. Simultaneously, the SM needs to focus on coaching the team, removing impediments, and supporting changes at the organizational level to improve the team’s environment. Combining roles can dilute focus, leading to suboptimal performance in both areas. 3. Compromised Accountability Scrum thrives on clear accountabilities. The PO is accountable for the Product Backlog and value delivery, while the SM is accountable for the Scrum process and team health. Merging these roles blurs the accountability a Scrum team’s success is based on. 4. Reduced Checks and Balances Scrum’s design includes built-in checks and balances where the PO focuses on improving value creation, while the SM ensures sustainable pace and quality. Combining the roles removes this tension, potentially leading to burnout or technical debt due to a lack of restraint on delivery pressures. 5. Increased Risk of Micromanagement Combining roles can lead to micromanagement, as the individual may struggle to switch between facilitation and decision-making. This can undermine the team’s self-management, reducing creativity and innovation. 6. Decreased Team Support The SM role involves supporting the team by removing impediments and ensuring a healthy work environment. A combined role may prioritize product issues over team issues, reducing the support the team receives and impacting morale and productivity. 7. Impaired Decision Making The PO must make decisions quickly to adapt to market changes, while the SM needs to foster team accord and gradual improvement. Combining these roles can slow decision-making processes and create confusion within the team regarding priorities. 8. Diluted Expertise Both roles require specific skills and expertise. A PO needs strong business acumen, while an SM needs a deep understanding of agile practices and team dynamics. Combining the roles often means one skill set will dominate, leaving gaps in the other area. 9. Impeded Transparency The Scrum framework relies on transparency to inspect and adapt effectively. A single person handling both roles may unintentionally hide issues or conflicts to maintain the appearance of progress, thus impairing the team’s ability to improve continuously. 10. Undermined Scrum Values Combining roles can undermine the Scrum values of focus, openness, respect, commitment, and courage, as the individual may struggle to balance conflicting responsibilities and provide the necessary support for the team to embody these values effectively. Consequently, by separating the roles of Product Owner and Scrum Master, organizations ensure clear accountability, maintain checks and balances, and foster a healthier, more productive Scrum environment. Additional Considerations What else do we need to consider? Five issues come to mind: 1. Role Synergy vs. Role Conflict While it’s tempting to think that combining roles might streamline processes and communication, each role has distinct and sometimes conflicting responsibilities. Consider whether the short-term gains of combining roles might be outweighed by long-term inefficiencies and conflicts. 2. 
Impact on Team Dynamics Consider how the combination of roles might affect team dynamics. A single person wielding both roles could inadvertently create a hierarchical dynamic, undermining the flat structure that Scrum promotes and potentially leading to reduced team morale and engagement. 3. Sustainability and Burnout The workload for both roles can be intense. Combining them can lead to burnout for the individual trying to manage both responsibilities. Think about how this might affect their ability to perform effectively over time and the potential impacts on team stability and productivity. 4. Training and Development Reflect on the development paths for team members. Combining roles might hinder individuals’ ability to specialize and grow in their respective areas. It might be more beneficial to invest in strong, separate training programs for Product Owners and Scrum Masters to ensure they can excel in their distinct roles. 5. Adaptability To Change Agile practices, including Scrum, thrive on adaptability. Combining roles might reduce the team’s ability to quickly adapt to changes, as the dual-role individual could be overloaded and less responsive to necessary pivots in product development or team facilitation. Three Exceptions Where Combining the Product Owner and Scrum Master Roles Might Be Justified By now, we have a solid understanding that under usual circumstances, it is not a good idea to combine the Product Owner and the Scrum Master roles. However, under which circumstance might it be acceptable? Let’s delve into the following: 1. Small Startups or Early-Stage Companies Context In the early stages of a startup, resources are often limited. The team might be small, focusing on rapid development and iteration to find product-market fit. Justification Combining the roles can help streamline decision-making processes and reduce overhead. The person in the dual role can quickly pivot and make changes without waiting for coordination between separate roles. Considerations This should be temporary until the startup grows and can afford to hire separate individuals for each role. As the company scales, the complexity and workload will likely necessitate separating the roles to maintain effectiveness and prevent burnout. 2. Temporary Absence or Transition Period Context If the organization is undergoing a transition, such as the departure of a Scrum Master or Product Owner, it might be necessary to combine roles temporarily to ensure continuity. Justification Having a single individual temporarily fill both roles can provide stability and maintain the momentum of ongoing projects. It ensures that the Scrum events continue to be facilitated and that Product Backlog management does not lapse. Considerations During this period, the organization should actively search for a replacement to fill the vacant role. Additionally, the individual in the dual role should receive support to manage their workload, such as delegating non-critical tasks to team members. 3. Highly Experienced Agile Practitioner Context In situations where an organization has an individual with extensive experience and a deep understanding of both Scrum and the product domain, they might be capable of effectively handling both roles. Justification An experienced agile practitioner might have the skills and knowledge to temporarily balance the demands of both roles, especially in crisis situations where their expertise is crucial to navigating complex challenges. 
Considerations This should be a short-term solution even with a highly skilled individual. The organization should closely monitor the impact on the team and the individual’s workload. Continuous feedback from the Scrum team and stakeholders is essential to ensure that combining roles does not negatively affect productivity and morale. Additional Guidance Clear communication: In any of these scenarios, it is crucial to maintain clear communication with the team about the temporary nature of the combined role and the reasons behind it. This transparency helps manage expectations, and fosters trust within the team. Monitoring and support: Regular check-ins are necessary to assess the individual’s well-being and effectiveness in managing both roles. Providing additional support, such as temporary assistance or redistributing some responsibilities, can help mitigate the risk of burnout. Plan for transition: Have a clear plan for transitioning back to separate roles as soon as feasible. This includes setting criteria for when the transition will occur, such as reaching a specific team size in a startup or hiring a new team member during a transition period. By considering these exceptions and managing them thoughtfully, organizations can navigate periods where combining the Product Owner and Scrum Master roles might be justified while minimizing potential drawbacks. Food for Thought By thoroughly considering the following aspects, you can make a more informed decision about whether combining the Product Owner and Scrum Master roles is the right move for your organization: Experimentation and Feedback If the idea of combining roles persists, consider running it as a time-boxed experiment. Gather feedback from the team and stakeholders before making a permanent change. This can provide insights into the practical implications and help you make a more informed decision. Cultural Fit Assess whether this change aligns with your organization’s culture and values. Scrum and Agile practices often challenge traditional hierarchies and thrive in a culture of collaboration and continuous improvement. Ensure that any role changes support rather than hinder these cultural elements. Long-Term Vision Keep the long-term vision in mind. Decisions made today should support the organization’s goals and values in the future. Consider how role clarity and adherence to Scrum principles will impact your team’s ability to deliver value continuously. Conclusion While combining the Product Owner and Scrum Master roles might seem efficient in specific contexts, it generally poses significant risks to the effectiveness of Scrum teams. These roles’ distinct responsibilities, necessary skills, and built-in checks and balances are crucial for fostering a productive and balanced environment where Scrum teams can thrive. Although there are rare situations, such as in resource-constrained startups or temporary transitions, where merging these roles might be justified, these should only be temporary solutions with straightforward plans for separation. The insights from the LinkedIn poll and comments highlight the importance of maintaining role clarity to ensure sustainable team performance and alignment with Agile principles.