Author Archives: Shanoj

About Shanoj

Author: Shanoj is a data engineer and solutions architect passionate about delivering business value and actionable insights through well-architected data products. He holds several certifications across AWS, Oracle, Apache, Google Cloud, Docker, and Linux, and focuses on data engineering and analysis using SQL, Python, Big Data, RDBMS, and Apache Spark, among other technologies. He has 17+ years of experience working with various technologies in the Retail and BFS domains.

An Open Letter to All Managers: The Secrets of High-Performing Teams

Dear Managers,

Have you ever wondered why some teams consistently outperform others? Whether it’s innovation, productivity, or the sheer ability to win, certain teams just seem to click better. The question is: What makes these teams so effective?

Historically, we tend to jump to two answers when exploring why some teams are more successful than others — Talent and Team Building. Let’s break down these two widely held beliefs and see how they stack up.

The Myth of Talent

The most common answer to why some teams excel is simple: Talent. We think, “If I hire the best, the results will naturally follow.” It’s like a sports team with a budget that acquires star players — on paper, they should win every time. But in reality, the performance often fails to meet expectations. Why? Because talent isn’t portable.

A wealth of research across various industries shows that picking up a top performer from one environment and placing them in another often leads to a drop in performance — not just for them but for the entire team. This happens because performance is inherently team-based. The star performer shines not just because of individual ability but because of the context of the team they’re in. Talent, as crucial as it is, cannot thrive without a strong team to support it.

As Steve Jobs once said, “Great things in business are never done by one person; they’re done by a team of people.” This quote beautifully highlights the importance of the collective over individual brilliance.

The Team Building Trap

The next strategy people turn to is team building. You know the drill — trust falls, ropes courses, personality tests. While these activities can be fun, the real question is, do they work?

Team building only truly works when it changes the habits and norms of behaviour in the day-to-day work. In other words, team building isn’t an event; it’s a habit. It’s something woven into the fabric of how a team operates, how they communicate, and how they collaborate. And you know what we call those ingrained habits and behaviours? Culture.

Peter Drucker famously said, “Culture eats strategy for breakfast.” No matter how talented your team or how clever your strategies, without the right culture, sustained success is impossible.

Culture: The Heart of High-Performing Teams

Research shows that high-performing teams across various industries, from business to sports to non-profit, all share similar cultural traits. These traits boil down to three key elements: common understanding, psychological safety, and pro-social purpose.

1. Common Understanding

At its core, common understanding is about clarity and empathy. It’s not just about knowing your role and responsibilities; it’s also about understanding the roles of others on the team. How do you fit into the bigger picture? This clarity fosters smooth collaboration.

But beyond clarity, great teams exhibit empathy. They understand each other’s preferences, strengths, and challenges. It’s about knowing the person behind the job title — what motivates them, what stresses them out, and how they best communicate.

As Harper Lee wrote in To Kill a Mockingbird, “You never really understand a person until you consider things from his point of view.” Empathy allows teams to see things from multiple perspectives, improving collaboration and cohesion.

2. Psychological Safety

Psychological safety is the extent to which team members feel safe to take risks, speak up, and even fail without fear of negative consequences. It’s about creating a space where people can express ideas, no matter how unconventional, and where dissent is encouraged as a form of collaboration.

Leaders play a critical role in cultivating psychological safety. Instead of punishing mistakes, they treat them as learning opportunities. They foster an environment where failures are celebrated for the lessons they provide, not for the mistakes themselves.

As Winston Churchill wisely said, “Success is not final, failure is not fatal: it is the courage to continue that counts.” This mindset creates an environment where innovation thrives and individuals are empowered to grow.

3. Pro-Social Purpose

While most organizations talk about having a strong purpose or mission, what really matters to people is knowing who they serve. Employees want to feel like their work is making a tangible impact on someone’s life. It’s not just about having a mission statement pinned on the wall; it’s about connecting employees to the beneficiaries of their work.

Teams that understand the “who” behind their work are more motivated and engaged. Whether it’s customers, stakeholders, or even internal colleagues, knowing that you’re contributing to someone else’s success gives a sense of fulfillment that drives effort and commitment.

In the words of Albert Einstein, “Only a life lived for others is a life worthwhile.” When team members feel they’re contributing to a greater cause, they bring more energy and passion to their work.


When you build a culture that emphasizes common understanding, psychological safety, and pro-social purpose, something incredible happens. The sum becomes greater than the parts. You attract better people, get better ideas, and most importantly, get the best effort from your team.

As Henry Ford famously stated, “Coming together is a beginning, staying together is progress, and working together is success.” With the right culture in place, you don’t just have a good team — you have the best team you’ve ever led.

OLAP: The Continuum from Transactions (OLTP)

This article is my answer to many colleagues who often ask me, “If you were designing a data solution, what would you incorporate?” In real life, we rarely get the privilege of designing a solution from scratch, end to end. More often than not, we work with systems already designed and implemented by someone else. That, in itself, is a trade-off we all have to accept at some point. But nothing stops us from learning, reflecting, and studying from those systems — especially learning from the mistakes that might have been made.

This article is about exactly that: learning from what’s been done and thinking about what could have been improved. One key thing I’ve noticed in many data environments is the absence of a master data layer. Data analysts or data scientists often query directly from raw, unstructured snapshot data instead of working from properly curated master data. This leads to inefficient analysis and unreliable insights.

Let’s explore how a properly designed data flow can address these challenges using the OLTP-to-OLAP continuum.

1. Production Database Snapshots

In any data-driven system, the starting point is usually an OLTP system. These systems handle real-time transactions — whether it’s a customer placing an order, transferring money, or updating account details. Every operational activity generates data, and these systems are optimized to record that data at high speed.

However, while OLTP systems are excellent at handling transactions, they are not designed for complex data analysis. This is where production database snapshots come into play. These snapshots capture the operational data at regular intervals, preserving the state of the system at a given moment. The key challenge is what happens next: if you query directly from this raw snapshot, you’re likely to run into performance and consistency issues.

In an ideal scenario, we should move this snapshot data into a more structured format, setting the stage for accurate and meaningful analysis.
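
As a minimal sketch of that idea, the daily snapshot can be loaded into a curated staging table rather than queried in place. The table and column names below (orders_snapshot_raw, stg_orders) are assumptions for illustration, not part of any specific system:

-- Illustrative names: orders_snapshot_raw is the raw dump, stg_orders the curated staging table
CREATE TABLE stg_orders AS
SELECT
    order_id,
    customer_id,
    CAST(order_date AS DATE)            AS order_date,
    CAST(order_amount AS DECIMAL(12,2)) AS order_amount,
    CURRENT_DATE                        AS snapshot_date
FROM orders_snapshot_raw;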

2. Master Data

This is where many data environments struggle. In the absence of a master data layer, analysts are forced to work with raw, inconsistent data. A master data layer provides a single source of truth by cleaning, organizing, and harmonizing disparate data sources.

For instance, imagine trying to analyze customer data across multiple products without a master data layer. Without a unified view of the customer, you end up with fragmented and sometimes contradictory data. This makes it harder to draw meaningful insights. The master data layer addresses this by creating consistent, well-organized records of key entities like customers, products, transactions, and more.

If I were designing a data solution, ensuring a solid master data layer would be one of my top priorities. This foundational layer improves the quality of data and ensures that all subsequent analyses are based on accurate, reliable information.
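
To make this concrete, here is a minimal sketch of how a master customer record could be derived from raw snapshots by keeping only the latest version of each customer. The customer_snapshot and customer_master names, and their columns, are assumptions for illustration:

-- Keep the most recent record per customer as the single source of truth (illustrative schema)
CREATE TABLE customer_master AS
SELECT customer_id, customer_name, email, country
FROM (
    SELECT customer_id, customer_name, email, country,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY snapshot_date DESC) AS rn
    FROM customer_snapshot
) latest
WHERE rn = 1;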

3. OLAP Cubes

Once the master data is set, the next step is processing it through OLAP cubes. OLAP systems are designed to handle complex, multidimensional queries that allow for deep analysis. For example, an OLAP cube might allow a company to analyze sales data by region, product category, and time period simultaneously.

The power of OLAP lies in its ability to aggregate data and provide quick access to insights across various dimensions. This is especially important in industries like finance, retail, or logistics, where understanding patterns and trends across different variables is critical for decision-making.

In many environments I’ve observed, OLAP systems are either underutilized or not implemented at all. This results in slow, inefficient analysis that can’t keep up with the speed of modern business. In contrast, using OLAP cubes to handle the heavy lifting of data aggregation ensures that insights can be generated faster and more efficiently.
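
For readers who think in SQL rather than cube browsers, the same multidimensional idea can be sketched with GROUPING SETS, which pre-aggregates sales along several dimensions in a single pass. The sales table and its columns here are assumed for illustration:

-- Aggregate sales by (region, category, month), by region alone, by category alone, and overall
SELECT region,
       product_category,
       DATE_TRUNC('month', sale_date) AS sale_month,
       SUM(sale_amount)               AS total_sales
FROM sales
GROUP BY GROUPING SETS (
    (region, product_category, DATE_TRUNC('month', sale_date)),
    (region),
    (product_category),
    ()
);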

4. Metrics

At the end of the continuum, we reach metrics — the ultimate output of the entire data pipeline. Whether it’s tracking sales performance, customer behavior, or operational efficiency, these metrics provide the actionable insights that drive business decisions.

However, the quality of these metrics depends entirely on the previous steps. Without proper data snapshots, master data, or OLAP cubes, the metrics generated will be unreliable. If each stage of the continuum is carefully managed, the metrics produced will be accurate and insightful, providing the information decision-makers need to act with confidence.
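
As one illustrative example (the monthly_sales aggregate is an assumed output of the previous step), a metric such as month-over-month sales growth can then be computed from the OLAP-level aggregates rather than from raw snapshots:

-- Month-over-month sales growth from a pre-aggregated monthly sales table
SELECT sale_month,
       total_sales,
       total_sales - LAG(total_sales) OVER (ORDER BY sale_month) AS sales_growth
FROM monthly_sales
ORDER BY sale_month;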


The key takeaway here is that in any data solution, from the most basic to the most complex, structure matters. A well-designed pipeline ensures that data flows smoothly from OLTP systems to OLAP analysis, ultimately providing the metrics needed to inform business strategy.

SQL for Data Engineering: Window Functions and Common Table Expressions (CTEs)

This article is inspired by a true story involving one of my close friends. He’s a backend developer, not a database expert, but during a recent interview, he was grilled heavily on SQL. After hearing about his experience, I realized something that might resonate with many of you: the days when SQL knowledge was limited to basic GROUP BY and JOIN operations are long gone. Today, the depth of SQL skills required—especially in data engineering roles—demands much more. If you’re preparing for interviews, you’ll need to master more advanced concepts, like window functions and Common Table Expressions (CTEs), to truly stand out.

“In theory, there is no difference between theory and practice. But in practice, there is.” — Yogi Berra

Why Focus on Window Functions in SQL?

As my friend’s story reminded me, SQL interviews aren’t just about basic querying anymore. Window functions, in particular, have become a critical part of data engineering interviews. Whether it’s ranking transactions, calculating rolling metrics, or handling complex partitioning, window functions allow you to perform operations that basic SQL can’t easily handle.

Let’s start by breaking down window functions and why they’re essential in real-world scenarios, especially when working with large-scale data.

What is a Window Function?

A window function is a SQL tool that allows you to perform calculations across a set of rows that are somehow related to the current row. Think of it as looking at a “window” of surrounding rows to compute things like cumulative sums, ranks, or moving averages.

The most common window functions are:

  • LAG: Fetches a value from the previous row in the window.
  • LEAD: Fetches a value from the next row in the window.
  • RANK: Assigns ranks to rows; tied rows share the same rank, and the following numbers are skipped.
  • DENSE_RANK: Similar to RANK but does not skip numbers after ties.
  • ROW_NUMBER: Assigns a unique sequential number to every row, with no ties.

These functions come in handy when dealing with tasks like analyzing customer transactions, calculating running totals, or ranking entries in financial datasets. Now, let’s move into a practical banking example that you might encounter in an interview.

Example: Identifying Top Three Transactions by Amount for Each Customer

Imagine you’re asked in an interview: “Find the top three largest transactions for each customer in the past year.” Right away, you should recognize that a simple GROUP BY or JOIN won’t work here—you’ll need a window function to rank transactions by amount for each customer.

Here’s how to approach it using the ROW_NUMBER function:

WITH customer_transactions AS (
    SELECT customer_id, transaction_id, transaction_date, amount,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS transaction_rank
    FROM transactions
    WHERE transaction_date >= DATEADD(year, -1, GETDATE())
)
SELECT customer_id, transaction_id, transaction_date, amount
FROM customer_transactions
WHERE transaction_rank <= 3;

In this query:

  • The PARTITION BY clause divides the data into groups by customer.
  • The ORDER BY clause ranks the transactions based on the amount, from highest to lowest.
  • The ROW_NUMBER() function assigns a unique rank to each transaction within each customer’s partition, allowing you to keep only the top three per customer.

This example goes beyond basic SQL skills, showcasing how window functions enable you to solve more complex real-world problems — something you’ll encounter frequently in interviews and on the job.

Keywords That Hint at Using Window Functions

In a SQL interview, look out for keywords that signal the need for window functions:

  • Rolling (e.g., rolling sum or average of balances)
  • Rank (e.g., top transactions, highest loan amounts)
  • Consecutive (e.g., consecutive late payments)
  • De-duplicate (e.g., identifying unique customer transactions)

For example, a question like “Give me the top five deposits per account over the past six months” is a clear indication that a window function, such as RANK or ROW_NUMBER, is required.
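
For the de-duplication case in particular, a common pattern is ROW_NUMBER over the columns that define a duplicate; here is a rough sketch against the same transactions table used above:

-- Keep one row per (customer_id, transaction_date, amount) combination
WITH ranked AS (
    SELECT customer_id, transaction_id, transaction_date, amount,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id, transaction_date, amount
               ORDER BY transaction_id
           ) AS rn
    FROM transactions
)
SELECT customer_id, transaction_id, transaction_date, amount
FROM ranked
WHERE rn = 1;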

Understanding the Components of a Window Function

Each window function consists of three main components:

  1. Function: This could be something like SUM(), RANK(), or LAG().
  2. OVER() Clause: Defines the window, i.e., the rows across which the function is applied. Without this, it’s just a regular aggregate function. This is where PARTITION BY and ORDER BY come into play.
  3. Optional ROWS Clause: Rarely used but powerful when you need to calculate things like rolling averages or sums over a defined number of rows.

Let’s look at a practical example for a bank that wants to calculate the rolling 30-day balance for each customer’s account:

SELECT customer_id, transaction_date,
       SUM(amount) OVER (PARTITION BY customer_id ORDER BY transaction_date
                         ROWS BETWEEN 29 PRECEDING AND CURRENT ROW) AS rolling_balance
FROM transactions;

  • ROWS BETWEEN 29 PRECEDING AND CURRENT ROW defines a window of the current row plus the 29 preceding rows (a 30-day window when the data holds one row per customer per day).
  • The result is a rolling sum of transaction amounts over that window for each customer, a common requirement in banking data analysis.

Common Table Expressions (CTEs): Your Best Friend for Complex Queries

CTEs are another key tool in advanced SQL, especially for interviews. A CTE allows you to define a temporary result set that can be referenced within the main query, making your code more readable and maintainable.

Syntax of a CTE:

WITH cte_name AS (
    SELECT column1, column2
    FROM table
    WHERE condition
)
SELECT *
FROM cte_name
WHERE another_condition;

Let’s extend our banking example. Suppose you’re asked to identify customers who have made consecutive late payments. Instead of cluttering your query with subqueries, you can simplify it using a CTE:

WITH customer_payments AS (
    SELECT customer_id, payment_date, payment_status,
           LAG(payment_status) OVER (PARTITION BY customer_id ORDER BY payment_date) AS previous_payment_status
    FROM payments
)
SELECT customer_id, COUNT(*) AS consecutive_late_payments
FROM customer_payments
WHERE payment_status = 'Late'
  AND previous_payment_status = 'Late'
GROUP BY customer_id;

Note that the CTE scans all payments rather than only the late ones, so that LAG compares each payment with the one immediately before it.

In this case, the LAG() function helps identify whether the previous payment was also marked as “Late.” This query identifies customers with consecutive late payments, a typical use case in risk management for banks.

When to Use CTEs vs. Subqueries vs. Temporary Tables

A common question that arises is when to use CTEs over subqueries or temporary tables. Here’s a quick rule of thumb:

  • CTEs: Ideal for improving readability and maintainability, especially in big data environments like Spark, Snowflake, or Trino.
  • Subqueries: Useful when you need a single scalar value, such as the total sum of loan amounts or average transaction size.
  • Temporary Tables: Best when you need to reuse intermediate results multiple times across queries, often improving performance in complex pipelines.

For example, if you’re working with millions of financial transactions and need to run multiple calculations, a temporary table could save significant processing time by storing intermediate results that are reused in other queries.
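
As a rough sketch of that last point, an intermediate result from the same transactions table can be materialized once and then reused by several downstream queries (CREATE TEMP TABLE syntax varies slightly between engines):

-- Materialize monthly totals once, then reuse them in multiple queries
CREATE TEMP TABLE monthly_customer_totals AS
SELECT customer_id,
       DATE_TRUNC('month', transaction_date) AS txn_month,
       SUM(amount)                           AS monthly_total
FROM transactions
GROUP BY customer_id, DATE_TRUNC('month', transaction_date);

-- Reuse 1: rank customers by spend within each month
SELECT txn_month, customer_id, monthly_total
FROM monthly_customer_totals
ORDER BY txn_month, monthly_total DESC;

-- Reuse 2: overall average monthly spend
SELECT AVG(monthly_total) AS avg_monthly_spend
FROM monthly_customer_totals;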


Mastering window functions and CTEs is your secret weapon in SQL interviews. These tools allow you to handle complex tasks like ranking transactions, calculating rolling balances, and identifying consecutive events — skills that will set you apart from other candidates.

By focusing on these advanced SQL techniques and understanding when to apply them, you’ll not only excel in interviews but also be prepared for the challenges you’ll face in real-world data analysis.

AWS Step Functions Distributed Map: Scaling Interactive Bank Reconciliation

Problem Statement: Scaling Interactive Bank Reconciliation

Bank reconciliation is a crucial but often complex and resource-intensive process. Suppose we have 500,000 reconciliation files stored in S3 for a two-way reconciliation process. These files are split into two categories:

  • 250,000 bank statement files
  • 250,000 transaction files from the company’s internal records

The objective is to reconcile these files by matching the transactions from both sources and loading the results into a database, followed by triggering reporting jobs.

Legacy System

Challenges with Current Approaches:

Sequential Processing Limitations:

  • A simple approach might involve iterating through the 500,000 files in a loop, but this would take an impractical amount of time. Processing even smaller datasets of 5,000 files would already show the inefficiency of sequential processing for this scale.

Data Scalability:

  • As the number of files increases (say, to 1 million files), this approach becomes infeasible, with severe performance degradation. Traditional sequential methods are simply not designed to scale to this volume.

Fault Tolerance:

  • In a large-scale operation like this, system failures can happen. If one node fails during reconciliation, the entire process could stop, requiring complex error-handling logic to ensure continuity.

Cost & Resource Management:

  • Balancing the cost of infrastructure with performance is another challenge. Over-provisioning resources to handle peak loads is expensive, while under-provisioning can lead to delays and failed jobs.

Complexity in Distributed Processing:

  • Setting up distributed processing frameworks, such as Hadoop or Spark, introduces a significant learning curve for developers who aren’t experienced with big data frameworks. Additionally, provisioning and maintaining clusters of machines adds further complexity.

Solution: Leveraging AWS Step Functions Distributed Map

AWS Step Functions, a serverless workflow orchestration service, solves these challenges efficiently by enabling scalable, distributed processing with minimal infrastructure management. With the Step Functions Distributed Map feature, large datasets like the 500,000 reconciliation files can be processed in parallel, simplifying the workflow while ensuring scalability, fault tolerance, and cost-effectiveness.

Key Benefits of the Solution:

Parallel Processing for Faster Reconciliation:

  • Distributed Map breaks down the 500,000 reconciliation tasks across multiple compute nodes, allowing files to be processed concurrently. This greatly reduces the time needed to reconcile large volumes of data.

Scalability:

  • The workflow scales effortlessly as the number of reconciliation files increases. Step Functions Distributed Map handles the coordination, ensuring that you can move from 500,000 to 1 million files without requiring a major redesign.

Fault Tolerance & Recovery:

  • If a node fails during the reconciliation process, the coordinator will rerun the failed tasks on another compute node, preventing the entire process from stalling. This ensures greater resilience in high-scale operations.

Cost Optimization:

  • As a serverless service, Step Functions automatically scales based on usage, meaning you’re only charged for what you use. There’s no need to over-provision resources, and scaling happens without manual intervention.

Developer-Friendly:

  • Developers don’t need to learn complex big data frameworks like Spark or Hadoop. Step Functions allows for orchestration of workflows using simple tasks and services like AWS Lambda, making it accessible to a broader range of teams.

Workflow Implementation:

The proposed Step Functions Distributed Map workflow for bank reconciliation can be broken down into the following steps:

Stage the Data:

  • AWS Athena is used to stage the reconciliation data, preparing it for further processing.

Gather Third-Party Data:

  • A Lambda function fetches any necessary third-party data, such as exchange rates or fraud detection information, to enrich the reconciliation process.

Run Distributed Map:

  • The Distributed Map state initiates the reconciliation between each pair of files (one from the bank statements and one from the internal records). Each pair is processed in parallel, maximizing throughput and minimizing reconciliation time.

Aggregation:

  • Once all pairs are reconciled, the results are aggregated into a summary report. This report is stored in a database, making the data ready for reporting and further analysis.

{
  "Comment": "Reconciliation Workflow using Distributed Map in Step Functions",
  "StartAt": "StageReconciliationData",
  "States": {
    "StageReconciliationData": {
      "Type": "Task",
      "Resource": "arn:aws:athena:us-west-2:123456789012:workgroup/reconciliation-query",
      "Next": "FetchBankFiles"
    },
    "FetchBankFiles": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-west-2:123456789012:function:FetchBankFilesLambda",
      "Next": "FetchInternalFiles"
    },
    "FetchInternalFiles": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-west-2:123456789012:function:FetchInternalFilesLambda",
      "Next": "ReconciliationDistributedMap"
    },
    "ReconciliationDistributedMap": {
      "Type": "Map",
      "ItemReader": {
        "Resource": "arn:aws:states:::s3:listObjectsV2",
        "Parameters": {
          "Bucket": "your-bank-statements-bucket",
          "Prefix": "bank_files/"
        }
      },
      "MaxConcurrency": 1000,
      "ItemProcessor": {
        "ProcessorConfig": {
          "Mode": "DISTRIBUTED",
          "ExecutionType": "STANDARD"
        },
        "StartAt": "ReconcileFiles",
        "States": {
          "ReconcileFiles": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-west-2:123456789012:function:ReconcileFilesLambda",
            "Parameters": {
              "bankFile.$": "$.Key",
              "internalFile": "s3://your-internal-files-bucket/internal_files/{file-matching-key}"
            },
            "Next": "CheckReconciliationStatus"
          },
          "CheckReconciliationStatus": {
            "Type": "Choice",
            "Choices": [
              {
                "Variable": "$.status",
                "StringEquals": "FAILED",
                "Next": "HandleFailedReconciliation"
              }
            ],
            "Default": "ReconciliationSuccessful"
          },
          "HandleFailedReconciliation": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-west-2:123456789012:function:HandleFailedReconciliationLambda",
            "Next": "ReconciliationSuccessful"
          },
          "ReconciliationSuccessful": {
            "Type": "Pass",
            "End": true
          }
        }
      },
      "Next": "AggregateResults"
    },
    "AggregateResults": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-west-2:123456789012:function:AggregateResultsLambda",
      "Next": "GenerateReports"
    },
    "GenerateReports": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-west-2:123456789012:function:GenerateReportsLambda",
      "End": true
    }
  }
}

AWS Step Functions Distributed Map offers a scalable, fault-tolerant, and cost-effective solution to processing large datasets for bank reconciliation. Its serverless nature removes the complexity of managing infrastructure and enables developers to focus on the core business logic. By integrating services like AWS Lambda and Athena, businesses can achieve better performance and efficiency in high-scale reconciliation processes and many other use cases.

Management is a Role, Not a Promotion!

Transitioning from an individual contributor to a managerial role is often seen as a promotion, but in reality, it’s much more — a complete shift in responsibilities and mindset. However, the journey doesn’t stop at management; the true evolution is transitioning from being a manager to becoming a leader.

Here are some key takeaways:

1. Management is a Role, Not a Promotion

Becoming a manager is not just about climbing the career ladder; it’s about taking on a new role with distinct responsibilities. Whether you’re an individual contributor or a manager, each role has its unique challenges and rewards. As Peter Drucker wisely put it, “Management is doing things right; leadership is doing the right things.”

Reference: The Essential Drucker by Peter Drucker

2. Finding Your Niche

Understanding what drives you is essential to finding your niche in leadership. It’s crucial to introspect and determine whether you enjoy working with and developing people. If you naturally gravitate towards helping others succeed, leadership could be a fulfilling path for you. Steve Jobs once said, “The only way to do great work is to love what you do.” This applies equally to leadership.

3. Types of Managers

There are different types of managers, such as people managers and engineering managers. People managers focus on personal development, conducting one-on-ones, and supporting their team members’ growth. Engineering managers, on the other hand, combine technical responsibilities with people management, such as conducting code reviews and unblocking team members.

Question: How does understanding the type of manager you are help in your role?
Answer: Knowing your management style allows you to tailor your approach to your team’s needs, whether it’s focusing on technical guidance or personal development. This alignment is crucial for your team’s success and your effectiveness as a leader.

4. Delegation and Decision-Making

Effective leaders excel in delegation and make decisions based on data, not just gut feelings. Delegation involves more than just handing off tasks; it requires clear communication, guidance, and understanding each team member’s strengths.

Example: If you’re faced with a decision about which feature to prioritize, data-driven insights can guide you. For instance, choosing a feature that provides 80% of the impact with 20% of the effort is a strategic move, ensuring maximum value for the time invested.

Question: How does a leader excel in delegation and decision-making?
Answer: To excel, a leader must set clear expectations and provide guidance while empowering their team to take ownership of their work. As John Maxwell famously stated, “A leader is one who knows the way, goes the way, and shows the way.”

Reference: The 21 Irrefutable Laws of Leadership by John C. Maxwell

5. Facing Failures

Failures are inevitable, but how you handle them defines your growth as a leader. Recognizing issues early, communicating with stakeholders, and creating a plan to address and learn from failures are essential steps. Winston Churchill’s words resonate here: “Success is not final, failure is not fatal: It is the courage to continue that counts.”

Question: How does a leader face failures?
Answer: Leaders must own their failures, communicate openly with their teams, and focus on recovery and future prevention. For example, when a project is going off-track, a leader should proactively inform stakeholders, adjust the scope if necessary, and set up guardrails to avoid future issues.

6. Experience Matters

There’s no substitute for experience in leadership. The breadth of your experiences will guide your decisions and shape your leadership style. Julius Caesar aptly noted, “Experience is the teacher of all things.” Leadership is enriched by diverse experiences, which prepare you for the complexities of guiding others.

Example: If you’ve been put in charge of a project that’s going south, the stress and difficulty of salvaging it will teach you invaluable lessons that contribute to your growth as a leader.

Reference: Outliers by Malcolm Gladwell

7. Leadership as a Coach

Leadership goes beyond managing tasks; it’s about coaching and developing your team. As a coach, the focus is on nurturing and supporting each team member, leading to more meaningful and impactful leadership. Phil Jackson, a legendary coach, once said, “The strength of the team is each individual member. The strength of each member is the team.”

Question: How does a leader transition from managing to coaching?
Answer: Transitioning to a coaching mindset involves shifting from task management to personal development. Leaders should focus on helping team members grow, providing guidance and support tailored to individual needs.

Reference: Eleven Rings: The Soul of Success by Phil Jackson


For those aspiring to move from management to true leadership, the advice is clear: understand the broader role of a leader, seek opportunities to develop your team, and always be open to learning from experiences — both yours and others.

Leadership isn’t just about overseeing a team; it’s about continuous self-reflection, growth, and elevating those around you. As I often reflect, “Being a leader helps you deliver 10x, 100x more than what you do as an individual.”

System Design: Automating Banking Reconciliation with AWS

This article outlines the system design for automating the banking reconciliation process by migrating existing manual tasks to AWS. The solution leverages various AWS services to create a scalable, secure, and efficient system. The goal is to reduce manual effort, minimize errors, and enhance operational efficiency within the financial reconciliation workflow.

Key Objectives:

  • Develop a user-friendly custom interface for managing reconciliation tasks.
  • Utilize AWS services like Lambda, Glue, S3, and EMR for data processing automation.
  • Implement robust security and monitoring mechanisms to ensure system reliability.
  • Provide post-deployment support and monitoring for continuous improvement.

Architecture Overview

The architecture comprises several AWS services, each fulfilling specific roles within the system, and integrates with corporate on-premises resources via Direct Connect.

  • Direct Connect: Securely connects the corporate data center to the AWS VPC, enabling fast and secure data transfer between on-premises systems and AWS services.

Data Ingestion

  • Amazon S3 (Incoming Files Bucket): Acts as the primary data repository where incoming files are stored. The bucket triggers the Lambda function when new data is uploaded.
  • Bucket Policy: Ensures that only authorized services and users can access and interact with the data stored in S3.

Event-Driven Processing

  • AWS Lambda: Placed in a private subnet, this function is triggered by S3 events (e.g., file uploads) and initiates data processing tasks.
  • IAM Permissions: Lambda has permissions to access the S3 bucket and trigger the Glue ETL job.

Data Transformation

  • AWS Glue ETL Job: Handles the extraction, transformation, and loading (ETL) of data from the S3 bucket, preparing it for further processing.
  • NAT Gateway: Located in a public subnet, the NAT Gateway allows the Lambda function and Glue ETL job to access the internet for downloading dependencies without exposing them to inbound internet traffic.

Data Processing and Storage

  • Amazon EMR: Performs complex transformations and applies business rules necessary for reconciliation processes, processing data securely within the private subnet.
  • Amazon Redshift: Serves as the central data warehouse where processed data is stored, facilitating further analysis and reporting.
  • RDS Proxy: Manages pooled, secure connections to the RDS database that backs the reconciliation UI, sitting between the application layer and the database.

Business Intelligence

  • Amazon QuickSight: A visualization tool that provides dashboards and reports based on the data stored in Redshift, helping users to make informed decisions.

User Interface

  • Reconciliation UI: Hosted on AWS and integrated with RDS, this custom UI allows finance teams to manage reconciliation tasks efficiently.
  • Okta SSO: Manages secure user authentication via Azure AD, ensuring that only authorized users can access the reconciliation UI.

Orchestration and Workflow Management

  • AWS Step Functions: Orchestrates the entire workflow, ensuring that each step in the reconciliation process is executed in sequence and managed effectively.
  • Parameter Store: Holds configuration data, allowing dynamic and flexible workflow management.

Security and Monitoring

  • AWS Secrets Manager: Securely stores and manages credentials needed by various AWS services.
  • Monitoring and Logging:
      • Scalyr: Provides backend log collection and analysis, enabling visibility into system operations.
      • New Relic: Monitors application performance and tracks key metrics to alert on any issues or anomalies.

Notifications

  • AWS SNS: Sends notifications to users about the status of reconciliation tasks, including completions, failures, or other important events.

Security Considerations

Least Privilege Principle:
All IAM roles and policies are configured to ensure that each service has only the permissions necessary to perform its functions, reducing the risk of unauthorized access.

Encryption:
Data is encrypted at rest in S3, Redshift, and in transit, meeting compliance and security standards.

Network Security:
The use of private subnets, security groups, and network ACLs ensures that resources are securely isolated within the VPC, protecting them from unauthorized access.


Code Implementation

Below are the key pieces of code required to implement the Lambda function and the CloudFormation template for the AWS infrastructure.

Lambda Python Code to Trigger Glue

Here’s a Python code snippet that can be deployed as part of the Lambda function to trigger the Glue ETL job upon receiving a new file in the S3 bucket:

import json
import boto3
import logging

# Set up logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Initialize the Glue and S3 clients
glue_client = boto3.client('glue')
s3_client = boto3.client('s3')

def lambda_handler(event, context):
    """
    Lambda function to trigger an AWS Glue job when a new file is uploaded to S3.
    """
    try:
        # Extract the bucket name and object key from the event
        bucket_name = event['Records'][0]['s3']['bucket']['name']
        object_key = event['Records'][0]['s3']['object']['key']

        # Log the file details
        logger.info(f"File uploaded to S3 bucket {bucket_name}: {object_key}")

        # Define the Glue job name
        glue_job_name = "your_glue_job_name"

        # Start the Glue job with the required arguments
        response = glue_client.start_job_run(
            JobName=glue_job_name,
            Arguments={
                '--s3_input_file': f"s3://{bucket_name}/{object_key}",
                '--other_param': 'value'  # Add any other necessary Glue job parameters here
            }
        )

        # Log the response from Glue
        logger.info(f"Started Glue job: {response['JobRunId']}")

    except Exception as e:
        logger.error(f"Error triggering Glue job: {str(e)}")
        raise e

The Lambda function code is structured as follows:

  • Import Libraries: Imports necessary libraries like json, boto3, and logging to handle JSON data, interact with AWS services, and manage logging.
  • Set Up Logging: Configures logging to capture INFO level messages, which is crucial for monitoring and debugging the Lambda function.
  • Initialize AWS Clients: Initializes Glue and S3 clients using boto3 to interact with these AWS services.
  • Define Lambda Handler Function: The main function, lambda_handler(event, context), serves as the entry point and handles events triggered by S3.
  • Extract Event Data: Retrieves the S3 bucket name (bucket_name) and object key (object_key) from the event data passed to the function.
  • Log File Details: Logs the bucket name and object key of the uploaded file to help track what is being processed.
  • Trigger Glue Job: Initiates a Glue ETL job using start_job_run with the S3 object passed as input, kicking off the data transformation process.
  • Log Job Run ID: Logs the Glue job’s JobRunId for tracking purposes, helping to monitor the job’s progress.
  • Error Handling: Catches and logs any exceptions that occur during execution to ensure issues are identified and resolved quickly.
  • IAM Role Configuration: Ensures the Lambda execution role has the necessary permissions (glue:StartJobRun, s3:GetObject, etc.) to interact with AWS resources securely.

CloudFormation Template

Below is the CloudFormation template that defines the infrastructure required for this architecture:

AWSTemplateFormatVersion: '2010-09-09'
Description: CloudFormation template for automating banking reconciliation on AWS

Resources:

  S3Bucket:
    Type: AWS::S3::Bucket
    DependsOn: LambdaInvokePermission
    Properties:
      BucketName: !Sub 'banking-reconciliation-bucket-${AWS::AccountId}'
      AccessControl: Private
      VersioningConfiguration:
        Status: Enabled
      NotificationConfiguration:
        LambdaConfigurations:
          - Event: s3:ObjectCreated:*
            Function: !GetAtt LambdaFunction.Arn

  LambdaInvokePermission:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !Ref LambdaFunction
      Action: lambda:InvokeFunction
      Principal: s3.amazonaws.com
      SourceArn: !Sub 'arn:aws:s3:::banking-reconciliation-bucket-${AWS::AccountId}'

  LambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: lambda-glue-execution-role
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
                - glue.amazonaws.com  # allows the Glue job below to reuse this role
            Action: sts:AssumeRole
      Policies:
        - PolicyName: lambda-glue-policy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - glue:StartJobRun
                  - glue:GetJobRun
                  - s3:GetObject
                  - s3:PutObject
                Resource: "*"

  LambdaFunction:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: trigger-glue-job
      Handler: index.lambda_handler
      Role: !GetAtt LambdaExecutionRole.Arn
      Runtime: python3.8
      Timeout: 60
      Code:
        ZipFile: |
          import json
          import boto3
          import logging

          logger = logging.getLogger()
          logger.setLevel(logging.INFO)

          glue_client = boto3.client('glue')
          s3_client = boto3.client('s3')

          def lambda_handler(event, context):
              try:
                  bucket_name = event['Records'][0]['s3']['bucket']['name']
                  object_key = event['Records'][0]['s3']['object']['key']

                  logger.info(f"File uploaded to S3 bucket {bucket_name}: {object_key}")

                  glue_job_name = "your_glue_job_name"

                  response = glue_client.start_job_run(
                      JobName=glue_job_name,
                      Arguments={
                          '--s3_input_file': f"s3://{bucket_name}/{object_key}",
                          '--other_param': 'value'
                      }
                  )

                  logger.info(f"Started Glue job: {response['JobRunId']}")

              except Exception as e:
                  logger.error(f"Error triggering Glue job: {str(e)}")
                  raise e

  GlueJob:
    Type: AWS::Glue::Job
    Properties:
      Name: your_glue_job_name
      Role: !GetAtt LambdaExecutionRole.Arn
      Command:
        Name: glueetl
        ScriptLocation: !Sub 's3://${S3Bucket}/scripts/glue_etl_script.py'
        PythonVersion: '3'
      DefaultArguments:
        '--job-bookmark-option': job-bookmark-disable
      MaxRetries: 1
      ExecutionProperty:
        MaxConcurrentRuns: 1
      GlueVersion: '2.0'
      Timeout: 2880

Outputs:
  S3BucketName:
    Description: Name of the S3 bucket created for incoming files
    Value: !Ref S3Bucket
    Export:
      Name: S3BucketName

  LambdaFunctionName:
    Description: Name of the Lambda function that triggers the Glue job
    Value: !Ref LambdaFunction
    Export:
      Name: LambdaFunctionName

  GlueJobName:
    Description: Name of the Glue job that processes the incoming files
    Value: !Ref GlueJob
    Export:
      Name: GlueJobName

This CloudFormation template sets up the following resources:

  • S3 Bucket: For storing incoming files that will trigger further processing.
  • Lambda Execution Role: An IAM role with the necessary permissions for the Lambda function to interact with S3 and Glue.
  • Lambda Function: The function that is triggered when a new object is created in the S3 bucket, which then triggers the Glue ETL job.
  • Bucket Notification and Lambda Permission: Configures the S3 bucket to invoke the Lambda function when a new file is uploaded, and grants Amazon S3 permission to invoke it.
  • Glue Job: Configures the Glue ETL job that processes the incoming data.

This system design article outlines a comprehensive approach to automating banking reconciliation processes using AWS services.

Ace Your Data Engineering Interviews: A 6-Month Plan for Engineers and Managers

This article addresses the question, “If I want to prepare today, what should I do?” It offers a 6-month roadmap for aspiring and seasoned Data Engineers or Data Engineering Managers, including course recommendations. Keep in mind that the courses are not mandatory, and you should choose based on your availability and interest.

1. Pick Your Cloud Platform (AWS, Azure, GCP)

  • Duration: 60 days
  • Start by choosing a cloud platform based on your experience and background. It’s important to cover all the data-related services offered by the platform and understand their use cases and best practices.
  • If you’re aiming for a managerial role, you should also touch on well-architected frameworks, particularly those related to staging, ingestion, orchestration, transformation, and visualization.
  • Key Advice: Always include a focus on security, especially when dealing with sensitive data.

Some Useful Resources:

Data Engineering on AWS — The complete training

Data Lake in AWS — Easiest Way to Learn [2024]

Migration to AWS

Optional: Consider taking a Pluralsight Skill IQ or Role IQ test to assess where you stand in your knowledge journey at this stage. It’s a great way to identify areas where you need to focus more attention.

“Give me six hours to chop down a tree and I will spend the first four sharpening the axe.” — Abraham Lincoln

2. Master SQL and Data Structures & Algorithms (DSA)

  • Duration: 30 days
  • SQL is the bread and butter of Data Engineering. Ensure you’ve practiced medium to complex SQL scenarios, focusing on real-world problems.
  • Alongside SQL, cover basic DSA concepts relevant to Data Engineering. You don’t need to delve as deep as a full-stack developer, but understanding a few key areas is crucial.

Key DSA Concepts to Cover:

  • Arrays and Strings: How to manipulate and optimize these data structures.
  • Hashmaps: Essential for efficiently handling large data sets.
  • Linked Lists and Trees: Useful for understanding hierarchical data.
  • Basic Sorting and Searching Algorithms: To optimize data processing tasks.

Some Useful Resources:

SQL for Data Scientists, Data Engineers and Developers

50Days of DSA JavaScript Data Structures Algorithms LEETCODE

3. Deep Dive into Data Lake and Data Warehousing

  • Duration: 30 days
  • A thorough understanding of Data Lakes and Data Warehousing is vital. Start with Apache Spark, which you can practice hands-on using Databricks. For Data Warehousing, choose a platform like Redshift, Snowflake, or BigQuery.
  • I recommend focusing on Databricks and Snowflake as they are cloud-agnostic and offer flexibility across platforms.
  • Useful Resources:

Practical Lakehouse Architecture: Designing and Implementing Modern Data Platforms at Scale

4. Build Strong Foundations in Data Modeling

“In God we trust, all others must bring data.” — W. Edwards Deming

  • Duration: 30 days
  • Data Modeling is critical for designing efficient and scalable data systems. Focus on learning and practicing dimensional data models.
  • Useful Resources:

Data Modeling with Snowflake: A practical guide to accelerating Snowflake development using universal data modeling techniques

5. System Design and Architecture

“The best way to predict the future is to create it.” — Peter Drucker

  • Duration: 30 days
  • System design is an advanced topic that often comes up in interviews, especially for managerial roles. Re-design a large-scale project you’ve worked on and improve it based on well-architected principles.
  • Key Advice: Refer to Amazon customer case studies and engineering blogs from leading companies to make necessary changes to your architecture.
  • Useful Resources:

System Design Primer on GitHub

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

Amazon Architecture Blog

6. Fine-Tune Your Resume and Prepare STAR Stories

“Opportunities don’t happen. You create them.” — Chris Grosser

  • Duration: 15 days
  • Now that you have built up your skills, it’s time to work on your resume. Highlight your accomplishments using the STAR method, focusing on customer-centric stories that showcase your experience.
  • Keep actively searching for jobs but avoid cold applications. Instead, try to connect with someone who can help you with a referral.

7. Utilize Referrals & LinkedIn Contacts

“Your network is your net worth.” — Porter Gale

Building connections and networking is crucial in landing a good job. Utilize LinkedIn and other platforms to connect with industry professionals. Remember to research the company thoroughly and understand their strengths, weaknesses, and key technologies before interviews.

  • Always tailor your job applications and resumes to the specific company and role.
  • Utilize your connections to gain insights and possibly a referral, which significantly increases your chances of getting hired.

8. Always Stay Prepared, Even If You’re Not Looking to Move

“Luck is what happens when preparation meets opportunity.” — Seneca

Even if you’re currently employed and not planning to change jobs, it’s wise to stay prepared. Workplace politics can overshadow skills, empathy is sometimes in short supply, and self-preservation often takes precedence over the team or its most skilled people. Staying prepared means you can seize new opportunities whenever they arise.

This roadmap offers a structured approach to mastering the necessary skills for Data Engineering and Data Engineering Manager roles within six months. It’s designed to be flexible — feel free to adjust the timeline based on your current experience and availability. Remember, the key to success lies in consistent practice, continuous learning, and proactive networking.

“The only limit to our realization of tomorrow is our doubts of today.” — Franklin D. Roosevelt

Good luck and best wishes in achieving your career goals!

Distribution Styles in Amazon Redshift: A Banking Reconciliation Use Case

When loading data into a table in Amazon Redshift, the rows are distributed across the node slices according to the table’s designated distribution style. Selecting the right distribution style (DISTSTYLE) is crucial for optimizing performance.

  • The primary goal is to distribute the data evenly across the cluster, ensuring efficient parallel processing.
  • The secondary goal is to minimize the cost of data movement during query processing. Ideally, the data should be positioned where it’s needed before the query is executed, reducing unnecessary data shuffling.

Let’s bring this concept to life with an example from the banking industry, specifically focused on reconciliation processes — a common yet critical operation in financial institutions.

In a banking reconciliation system, transactions from various accounts and systems (e.g., internal bank records and external clearing houses) must be matched and validated to ensure accuracy. This process often involves large datasets with numerous transactions that need to be compared across different tables.

Example Table Structures

To demonstrate how different distribution styles can be applied, consider the following sample tables:

Transactions Table (Internal Bank Records)

CREATE TABLE internal_transactions (
    transaction_id     BIGINT,
    account_number     VARCHAR(20),
    transaction_date   DATE,
    transaction_amount DECIMAL(10,2),
    transaction_type   VARCHAR(10)
)
DISTSTYLE KEY
DISTKEY (transaction_id);

The internal_transactions table is distributed using the KEY distribution style on the transaction_id column. This means that records with the same transaction_id will be stored together on the same node slice. This is particularly useful when these transactions are frequently joined with another table, such as external transactions, on the transaction_id.

Transactions Table (External Clearing House Records)

CREATE TABLE external_transactions (
    transaction_id     BIGINT,
    clearinghouse_id   VARCHAR(20),
    transaction_date   DATE,
    transaction_amount DECIMAL(10,2),
    status             VARCHAR(10)
)
DISTSTYLE KEY
DISTKEY (transaction_id);

Similar to the internal transactions table, the external_transactions table is also distributed using the KEY distribution style on the transaction_id column. This ensures that when a join operation is performed between the internal and external transactions on the transaction_id, the data is already co-located, minimizing the need for data movement and speeding up the reconciliation process.

Currency Exchange Rates Table (Reference Data)

CREATE TABLE currency_exchange_rates (
    currency_code  VARCHAR(3),
    exchange_rate  DECIMAL(10,4),
    effective_date DATE
)
DISTSTYLE ALL;

The currency_exchange_rates table uses the ALL distribution style. A full copy of this table is stored on the first slice of each node, which is ideal for small reference tables that are frequently joined with larger tables (such as transactions) but are not updated frequently. This eliminates the need for data movement during joins and improves query performance.

Audit Logs Table

CREATE TABLE audit_logs (
    log_id         BIGINT IDENTITY(1,1),
    transaction_id BIGINT,
    action         VARCHAR(100),
    action_date    TIMESTAMP,
    user_id        VARCHAR(50)
)
DISTSTYLE EVEN;

The audit_logs table uses the EVEN distribution style. Since this table may not participate in frequent joins and primarily serves as a log of actions performed during the reconciliation process, EVEN distribution ensures that the data is evenly spread across all node slices, balancing the load and allowing for efficient processing.

Applying the Distribution Styles in a Reconciliation Process

In this banking reconciliation scenario, let’s assume we need to reconcile internal and external transactions, convert amounts using the latest exchange rates, and log the reconciliation process.

  • The internal and external transactions will be joined on transaction_id. Since both tables use KEY distribution on transaction_id, the join operation will be efficient, as related data is already co-located (see the example query after this list).
  • Currency conversion will use the currency_exchange_rates table. With ALL distribution, a copy of this table is readily available on each node, ensuring fast lookups during the conversion process.
  • As actions are performed, logs are written to the audit_logs table, with EVEN distribution ensuring that logging operations are spread out evenly, preventing any single node from becoming a bottleneck.
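
A hedged sketch of the reconciliation query itself, using the tables defined above, might look like the following; the single reporting currency and the 0.01 matching tolerance are assumptions for illustration:

-- Match internal and external records, flag amount mismatches, and convert with the reference rates
SELECT i.transaction_id,
       i.account_number,
       i.transaction_amount                   AS internal_amount,
       e.transaction_amount                   AS external_amount,
       e.status,
       i.transaction_amount * r.exchange_rate AS converted_amount
FROM internal_transactions i
JOIN external_transactions e
    ON i.transaction_id = e.transaction_id
JOIN currency_exchange_rates r
    ON r.effective_date = i.transaction_date
   AND r.currency_code = 'USD'   -- assumed single reporting currency for this sketch
WHERE ABS(i.transaction_amount - e.transaction_amount) > 0.01;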

This approach demonstrates how thoughtful selection of distribution styles can significantly enhance the performance and scalability of your data processing in Amazon Redshift, particularly in a complex, data-intensive scenario like banking reconciliation.

My Leadership Journey: An Open Letter to My Team

Dear Team,

This letter is for all of you who have walked this leadership journey with me — through the highs and lows, the challenges and triumphs, the late nights and the early mornings. Many of the details I’ll mention here will be experiences you’ve had while working with me on various engagements. As we continue to grow and evolve together, I believe it’s important to take a moment to reflect on the leadership styles that guide our day-to-day interactions and the success of our projects.

The Core of Flexible Leadership

Leadership isn’t a one-size-fits-all approach. In my experience, a successful leader needs to be flexible, adapting their style based on the situation, the needs of the team, and the objectives of the organization. This flexibility isn’t just a theoretical concept; it’s something I practice daily. You’ve likely seen it in action during our one-on-ones, project kick-offs, and those challenging moments when we’re navigating through complex problems.

“Sometimes it is better to lose and do the right thing than to win and do the wrong thing.” — Tony Blair

Tony Blair’s words remind us that leadership involves making tough decisions and sometimes steering the ship through stormy waters. My approach to leadership is much like navigating a ship through diverse and sometimes unpredictable seas. Depending on the waves — whether they’re calm or tumultuous — I adjust the sails, not just for myself, but for all of you aboard.

Visionary Leadership

One of the foundational pillars of my leadership style is being visionary. When we kick off a project, I make it a point to set a clear vision (end to end) that aligns with our company goals. This isn’t just about outlining what needs to be done; it’s about communicating why we’re doing it, the impact it will have, and how it aligns with the bigger picture. You might recall our discussions where I’ve shared how customers appreciate our work, how leadership is eager to see the project go live, or how our efforts will make a significant difference. This is all part of the visionary approach.

I’ve always believed that a motivated team is a productive team. By continuously reinforcing the vision, I aim to keep the team focused and driven, even when we hit inevitable roadblocks. Visionary leadership is crucial, especially during challenging times like organizational changes or market shifts. It’s during these times that we need to look forward, not backward, and concentrate on the opportunities ahead.

“Leadership is the capacity to translate vision into reality.” — Warren Bennis

Warren Bennis’s quote perfectly encapsulates what I strive for in my leadership. Translating vision into reality is what we do every day, and it’s a journey I’m proud to take with all of you.

Coaching

Another key aspect of my leadership style is coaching. I see my role as not just a leader, but as a mentor — a GPS that helps you navigate your career paths. During our one-on-ones or when delegating tasks, I focus on understanding where you are now, where you want to go, and how we can get you there together. This isn’t about dictating your path but rather guiding you to find your way.

For instance, when I delegate tasks, I don’t just consider your skills and timelines; I also factor in your interests and career goals. It’s about finding the sweet spot where what you’re passionate about intersects with what needs to be done. I often use frameworks like WWW (What Went Well) and EBI (Even Better If) during feedback sessions to ensure that the coaching process is constructive and growth-oriented.

“A leader is one who knows the way, goes the way, and shows the way.” — John C. Maxwell

John C. Maxwell’s perspective on leadership aligns with my own. It’s about more than just knowing the way; it’s about walking the path alongside you and showing you how to navigate it successfully.

Democratic Leadership

When it comes to decision-making, particularly on matters like architecture decisions, prioritizing tech debt, or reviewing sprint retrospectives, I lean towards a democratic leadership style. You’ve seen this in our sprint planning meetings and retrospectives, where I encourage each of you to voice your opinions and share your insights. This isn’t just about reaching a consensus; it’s about fostering a collaborative environment where everyone feels they have a stake in the outcome.

I believe that when you’re involved in the decision-making process, it not only leads to better decisions but also empowers you to take ownership of the project’s success. This collective approach promotes team effort and collaboration, which are critical to our success.

“The strength of the team is each individual member. The strength of each member is the team.” — Phil Jackson

Phil Jackson’s quote highlights the essence of democratic leadership. It’s about recognizing that our collective strength comes from each of you contributing your unique perspectives and skills to the team.

Affiliative Leadership

Leadership isn’t just about pushing towards goals; it’s also about creating an environment where everyone feels supported and valued. That’s where affiliative leadership comes in. Whether it’s organizing team lunches, celebrating wins, or simply being there during tough times, I prioritize building a positive and harmonious work environment. You’ve likely experienced this during our team-building activities or in moments where we’ve had to support each other through challenges.

During stressful times, whether due to external pressures or internal changes, I focus on compassion and empathy. It’s about putting goals and standards aside temporarily to ensure that your immediate needs are met, creating a safe space where you feel valued and understood.

“People will forget what you said, people will forget what you did, but people will never forget how you made them feel.” — Maya Angelou

Maya Angelou’s words serve as a reminder that leadership is deeply personal. It’s about how we make each other feel as we work towards our goals.

Pacesetting Leadership

There are moments when urgency and high standards are necessary, and this is where pacesetting leadership comes into play. You’ve seen this when we’ve faced critical production issues or tight deadlines. In such situations, I step in to lead by example, guiding the team through the process and setting a high standard for performance. This isn’t about micromanaging but rather about showing you what excellence looks like in action and helping you rise to the occasion.

“The best way to lead people into the future is to connect with them deeply in the present.” — James M. Kouzes

Kouzes’s quote reflects my approach during these critical moments. It’s about being present, hands-on, and ensuring that we move forward together.

Avoiding Coercive Leadership

One leadership style I consciously avoid is coercive leadership — giving direct orders without seeking input or considering the team’s perspectives. This style can create an environment of fear and insecurity, which is the opposite of what I strive for. In high-performing, motivated teams like ours, coercive leadership stifles creativity and reduces job satisfaction. It’s a last resort, used only when no other options remain.

“Leadership is not about being in charge. It is about taking care of those in your charge.” — Simon Sinek

Simon Sinek’s philosophy resonates with me deeply. Leadership is about care, respect, and fostering an environment where everyone can thrive.

A Journey of Continuous Learning

My leadership style is not static; it evolves with each interaction, project, and challenge we face together. Over the years, I’ve learned from experience, formal education, books, and training programs, but the real learning comes from working with all of you.

I hope this deep dive into my leadership style resonates with your experiences and offers insights into how and why I lead the way I do. Together, we’ve achieved great things, and I’m excited about what we will accomplish in the future.

“The only way to do great work is to love what you do.” — Steve Jobs

Let’s continue to love what we do, support each other, and strive for excellence in everything we undertake.

With gratitude and respect,

Shanoj Kumar V

Daily Dose of Cloud Learning: AWS Resource Cleanup with Cloud-nuke

If you’re diving into AWS for testing, development, or experimentation, you know how crucial it is to clean up your environment afterwards. Manual cleanup is often tedious and error-prone, and it can leave resources running that cost you later. That’s where Cloud-nuke comes into play: a command-line utility designed to automate the deletion of resources within your AWS account.

Cloud-nuke is a powerful and potentially destructive tool, as it will delete all specified resources within an account. Users should exercise caution and ensure they have backups or have excluded critical resources before running the tool.
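
If you want to guard specific resources by name, cloud-nuke also supports an optional config file passed with the --config flag. The snippet below is a minimal sketch based on the filtering format in the project’s README; the file name protect.yaml is just a placeholder, and you should verify the exact keys against the version you install:

# protect.yaml - illustrative only; confirm the schema in the cloud-nuke README
s3:
  exclude:
    names_regex:
      - .*-prod-.*

You would then run cloud-nuke aws --config protect.yaml so that any S3 bucket matching the pattern is skipped.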

Step 1: Install Cloud-nuke

Download Cloud-nuke:

Visit the Cloud-nuke GitHub releases page.

Download the appropriate .exe binary for Windows and place it in a folder on your PATH (the releases are standalone binaries, so there is no separate installer to run).

Alternatively, you can install cloud-nuke on Windows using winget:

winget install cloud-nuke

Verify Installation:

Open the Command Prompt and type cloud-nuke --version to verify that Cloud-nuke is installed correctly.

Step 2: Configure AWS CLI with Your Profile

Install AWS CLI:

If you don’t have the AWS CLI installed, download and install it from the official AWS CLI documentation.

Configure AWS CLI:

Open Command Prompt. Run the following command:

aws configure --profile your-profile-name

Provide the required credentials (Access Key ID, Secret Access Key) for your AWS account.

Specify your preferred default region (e.g., us-west-2).

Specify the output format (e.g., json).
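
For reference, a typical aws configure session looks roughly like this; the profile name dev-sandbox and all values shown are placeholders rather than real credentials:

REM placeholder profile name and values; substitute your own
aws configure --profile dev-sandbox
AWS Access Key ID [None]: AKIAXXXXXXXXXXXXXXXX
AWS Secret Access Key [None]: ****************************************
Default region name [None]: us-west-2
Default output format [None]: json

The CLI stores these values in the credentials and config files under your user’s .aws folder, and you can reuse the profile later with --profile dev-sandbox or the AWS_PROFILE environment variable.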

Step 3: Use Cloud-nuke to Clean Up Resources

Run Cloud-nuke with IAM Exclusion:

To ensure no IAM users are deleted, include the --exclude-resource-type flag to exclude IAM resources:

cloud-nuke aws --exclude-resource-type iam

This command will target all resources except IAM.
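
Before a full cleanup, it can help to scope and preview the run. The sketch below assumes the --region and --dry-run flags described in the cloud-nuke README (confirm them with cloud-nuke aws --help for your installed version) and uses the AWS_PROFILE environment variable so cloud-nuke picks up the profile you configured in Step 2:

REM use the profile configured earlier (placeholder name)
set AWS_PROFILE=your-profile-name
REM preview what would be deleted in a single region, still excluding IAM
cloud-nuke aws --region us-west-2 --exclude-resource-type iam --dry-run

Once the preview looks right, drop --dry-run to perform the actual deletion.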

Bonus Commands:

  • To list all the profiles configured on your system, use the following command:
aws configure list-profiles
This will display all the profiles you have configured.
  • To see the configuration details for a specific profile, use the following command:
aws configure list --profile your-profile-name

This command will display the following details:

  • Access Key ID: The AWS access key ID (partially masked).
  • Secret Access Key: The AWS secret access key (masked).
  • Region: The default AWS region for the profile.

I hope this article helps those who plan to use Cloud-nuke. It’s a handy tool that can save you time and prevent unnecessary costs by automating the cleanup process after you’ve tried out resources in your AWS account.
