Linux: How to use nmcli to display current network configuration and reconfigure a network adapter

nmcli is a command-line tool used to interact with NetworkManager, a network management daemon used in many Linux distributions. It allows you to view and configure network settings. Here’s how you can use nmcli to display the current network configuration and reconfigure a network adapter:

  1. Display Current Network Configuration: To list all network connections along with their configuration details, run: nmcli connection show
  2. Display Detailed Information about a Specific Connection: To view detailed information about a specific connection, such as its IP address, gateway, and DNS servers, use: nmcli connection show <connection_name> (replace <connection_name> with the name of the connection you want to inspect).
  3. Reconfigure a Network Adapter: To reconfigure a network adapter, modify its settings directly with nmcli. Here is a basic example that sets a static IP address: nmcli connection modify <connection_name> ipv4.method manual ipv4.addresses <ip_address>/<prefix_length> ipv4.gateway <gateway_address> ipv4.dns <dns_server> Replace <connection_name>, <ip_address>, <prefix_length>, <gateway_address>, and <dns_server> with appropriate values. For example: nmcli connection modify "Wired Connection 1" ipv4.method manual ipv4.addresses 192.168.1.100/24 ipv4.gateway 192.168.1.1 ipv4.dns 8.8.8.8 This modifies the “Wired Connection 1” connection to use a static (manual) IPv4 method with the specified address, prefix length, gateway, and DNS server.
  4. Apply Changes: After modifying the connection settings, apply them by bringing the connection up again: nmcli connection up <connection_name> (replace <connection_name> with the name of the connection you modified).
  5. Verify Changes: Use nmcli connection show <connection_name> to verify that the changes have been applied successfully.

Remember to replace placeholders such as <connection_name>, <ip_address>, <prefix_length>, <gateway_address>, and <dns_server> with actual values relevant to your network configuration. Additionally, ensure that you have appropriate permissions (usually root or sudo) to modify network settings.
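
Putting it together, a minimal sketch of the whole workflow, assuming a wired profile named "Wired Connection 1" and the example addresses above (adjust the names and addresses to your environment):

# List connection profiles and pick the one to change
nmcli connection show

# Switch the profile to a static (manual) IPv4 configuration
sudo nmcli connection modify "Wired Connection 1" \
    ipv4.method manual \
    ipv4.addresses 192.168.1.100/24 \
    ipv4.gateway 192.168.1.1 \
    ipv4.dns 8.8.8.8

# Reactivate the profile and confirm the settings took effect
sudo nmcli connection up "Wired Connection 1"
nmcli connection show "Wired Connection 1" | grep -E 'ipv4\.(method|addresses|gateway|dns)'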

Linux: Using lsblk and smartctl to display a hard disk's overall-health self-assessment

root@debian01:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
nvme0n1 259:0 0 476.9G 0 disk
├─nvme0n1p1 259:1 0 512M 0 part /boot/efi
├─nvme0n1p2 259:2 0 488M 0 part /boot
└─nvme0n1p3 259:3 0 476G 0 part
  └─nvme0n1p3_crypt 254:0 0 475.9G 0 crypt
    ├─debian01--vg-root 254:1 0 23.3G 0 lvm /
    ├─debian01--vg-var 254:2 0 9.3G 0 lvm /var
    ├─debian01--vg-swap_1 254:3 0 976M 0 lvm
    ├─debian01--vg-tmp 254:4 0 1.9G 0 lvm /tmp
    └─debian01--vg-home 254:5 0 440.5G 0 lvm /home

root@debian01:~# smartctl -a --test=long /dev/nvme0n1
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-18-amd64] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, http://www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number: SAMSUNG MZ9LQ512HBLU-00B00
Serial Number: S7DANXMW102944
Firmware Version: FXM7601Q
PCI Vendor/Subsystem ID: 0x144d
IEEE OUI Identifier: 0x002538
Total NVM Capacity: 512,110,190,592 [512 GB]
Unallocated NVM Capacity: 0
Controller ID: 5
NVMe Version: 1.4
Number of Namespaces: 1
Namespace 1 Size/Capacity: 512,110,190,592 [512 GB]
Namespace 1 Utilization: 61,558,759,424 [61.5 GB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 002538 d130ba314d
Local Time is: Mon Mar 18 11:42:24 2024 CST
Firmware Updates (0x16): 3 Slots, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f): Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x1e): Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg Pers_Ev_Lg
Maximum Data Transfer Size: 512 Pages
Warning Comp. Temp. Threshold: 83 Celsius
Critical Comp. Temp. Threshold: 85 Celsius
Namespace 1 Features (0x10): NP_Fields

Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
0 + 5.12W - - 0 0 0 0 0 0
1 + 3.59W - - 1 1 1 1 0 0
2 + 2.92W - - 2 2 2 2 0 500
3 - 0.0500W - - 3 3 3 3 210 1200
4 - 0.0050W - - 4 4 4 4 1000 9000

Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
0 + 512 0 0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 51 Celsius
Available Spare: 100%
Available Spare Threshold: 50%
Percentage Used: 0%
Data Units Read: 181,599 [92.9 GB]
Data Units Written: 1,857,619 [951 GB]
Host Read Commands: 1,898,681
Host Write Commands: 48,222,637
Controller Busy Time: 238
Power Cycles: 75
Power On Hours: 52
Unsafe Shutdowns: 61
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Warning Comp. Temperature Time: 153
Critical Comp. Temperature Time: 3
Temperature Sensor 1: 51 Celsius
Thermal Temp. 1 Transition Count: 1236
Thermal Temp. 2 Transition Count: 1014
Thermal Temp. 1 Total Time: 2672
Thermal Temp. 2 Total Time: 12386

Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged

root@debian01:~#
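
For a quick scripted check, the overall-health verdict can also be queried on its own; a minimal sketch, assuming the same NVMe device name as above:

# List block devices, then print only the SMART health section (run as root or with sudo)
lsblk -d -o NAME,SIZE,TYPE,MODEL
smartctl -H /dev/nvme0n1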

Site Reliability Engineering (SRE) principles

Site Reliability Engineering (SRE) principles, as defined by Google, focus on creating scalable and reliable software systems through a combination of engineering and operations practices. SRE aims to balance the need for rapid innovation with the requirement for reliability, availability, and scalability. Here are some key principles of SRE:

  1. Service Level Objectives (SLOs):
    • SLOs define the level of reliability or performance that a service should achieve, typically expressed as a target such as a percentage of successful requests, a percentage of uptime, or a response-time threshold.
    • SLOs provide a clear target for reliability and help align engineering efforts with business goals.
    • SRE teams monitor and measure SLOs, using them to make informed decisions about service improvements and investments.
  2. Error Budgets:
    • Error budgets are a concept closely related to SLOs. They represent the permissible amount of downtime or errors that a service can experience within a given time period.
    • SRE teams manage error budgets to strike a balance between reliability and innovation. They allow for a certain level of risk-taking and experimentation, as long as it doesn’t exceed the error budget (a worked example of the arithmetic follows this list).
  3. Automation:
    • SRE emphasizes automation to reduce manual toil and improve efficiency. Automation helps standardize processes, eliminate human error, and scale operations.
    • Automation is applied to various areas, including deployment, monitoring, incident response, and capacity management.
  4. Monitoring and Alerting:
    • Effective monitoring and alerting are crucial for detecting and responding to issues proactively.
    • SRE teams use monitoring tools to collect and analyze metrics, track the health and performance of systems, and identify potential problems.
    • Alerting systems notify teams about incidents or deviations from expected behavior, allowing for timely responses.
  5. Incident Management:
    • SRE follows a structured approach to incident management, aiming to minimize the impact of incidents on service reliability and user experience.
    • Incident response processes include escalation paths, on-call rotations, incident retrospectives, and postmortems to learn from failures and prevent recurrence.
  6. Capacity Planning:
    • SRE teams perform capacity planning to ensure that systems have sufficient resources to handle current and future workloads.
    • Capacity planning involves forecasting demand, monitoring resource utilization, and scaling infrastructure as needed to maintain performance and reliability.
  7. Blameless Culture:
    • SRE promotes a blameless culture where individuals are encouraged to take risks, learn from failures, and collaborate to improve systems.
    • Postmortems focus on identifying root causes and systemic issues rather than assigning blame to individuals.
  8. Continuous Improvement:
    • SRE emphasizes continuous improvement through iterative processes, experimentation, and feedback loops.
    • Teams regularly review performance, reliability, and user feedback to identify opportunities for optimization and enhancement.
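
As a worked example of the error-budget arithmetic: a 99.9% availability SLO over a 30-day month permits 0.1% of 30 × 24 × 60 = 43,200 minutes, i.e. roughly 43 minutes of downtime; once incidents have consumed that budget, the usual practice is to slow down risky releases and prioritize reliability work until the budget recovers.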

By embracing these principles, SRE teams strive to build and operate resilient and scalable systems that meet user expectations for reliability and performance.

Linux: Steps involved in updating the Linux kernel

Updating the Linux kernel involves several steps to ensure a smooth and successful process. Here’s a general overview of the steps involved in updating the Linux kernel:

  1. Check Current Kernel Version:
    • Before updating the kernel, check the current kernel version using the uname command: uname -r
    • Note down the current kernel version to compare it with the new kernel version after the update.
  2. Backup Important Data:
    • Although updating the kernel typically doesn’t affect user data directly, it’s always a good practice to back up important data before making any system-level changes.
  3. Check for Available Updates:
    • Use your package manager to check for available kernel updates. The commands vary depending on your Linux distribution:
      • For Debian/Ubuntu-based systems: sudo apt update followed by sudo apt list --upgradable
      • For CentOS/RHEL-based systems: sudo yum check-update
      • For Fedora: sudo dnf check-update
  4. Install the New Kernel:
    • Once you’ve identified available kernel updates, install the new kernel using your package manager. Be sure to install both the kernel image and kernel headers (if required):
      • For Debian/Ubuntu-based systems: sudo apt install linux-image-<version> linux-headers-<version>
      • For CentOS/RHEL-based systems: sudo yum install kernel
      • For Fedora: sudo dnf install kernel
  5. Update Boot Loader Configuration:
    • After installing the new kernel, update the boot loader configuration to include the new kernel entry. This ensures that the system can boot into the updated kernel.
      • For GRUB (used in most Linux distributions): sudo update-grub on Debian/Ubuntu-based systems, or sudo grub2-mkconfig -o /boot/grub2/grub.cfg on CentOS/RHEL-based systems
      • For systemd-boot (used in some distributions): new kernel entries are normally created automatically via kernel-install when the package is installed; sudo bootctl update only refreshes the boot loader binary itself
  6. Reboot the System:
    • Once the new kernel is installed and the boot loader configuration is updated, reboot the system to load the updated kernel: sudo reboot
  7. Verify Kernel Update:
    • After rebooting, log in to the system and verify that the new kernel is running: uname -r
  8. Test System Functionality:
    • Test various system functionalities and applications to ensure that they work correctly with the new kernel.
    • Pay attention to any hardware drivers or kernel modules that may require reinstallation or configuration adjustments.
  9. Monitor System Stability:
    • Monitor system stability and performance over time to ensure that the new kernel update doesn’t introduce any issues or regressions.
  10. Rollback (If Necessary):
    • In case the new kernel causes issues or compatibility problems, you can roll back to the previous kernel version.
    • Most boot loaders allow you to select the kernel version to boot from during system startup. Choose the previous kernel version from the boot menu to boot into it.

By following these steps, you can safely update the Linux kernel on your system while minimizing the risk of downtime or compatibility issues.
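
On a Debian/Ubuntu-based system, the full sequence might look like the sketch below; the metapackage name linux-image-amd64 is illustrative and depends on your distribution, release, and architecture:

# Record the running kernel and refresh package metadata
uname -r
sudo apt update

# Install the current kernel image (and headers, if you build modules)
sudo apt install linux-image-amd64 linux-headers-amd64

# Regenerate the GRUB configuration and reboot into the new kernel
sudo update-grub
sudo reboot

# After the reboot, confirm the new kernel version is running
uname -r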

Linux: How to replace a bad disk on a Linux RAID configuration

Replacing a failed disk in a Linux RAID configuration involves several steps to ensure that the array remains operational and data integrity is maintained. Below is a step-by-step guide on how to replace a bad disk in a Linux RAID configuration using the mdadm utility:

  1. Identify the Failed Disk:
    • Use the mdadm --detail /dev/mdX command to display detailed information about the RAID array.
    • Look for the state of each device in the array to identify the failed disk.
    • Note the device name (e.g., /dev/sdX) of the failed disk.
  2. Remove the Failed Disk from the Array and Prepare the New Disk:
    • If the failed disk is still listed as a member, mark it as faulty and remove it from the array first: mdadm --manage /dev/mdX --fail /dev/sdX1 followed by mdadm --manage /dev/mdX --remove /dev/sdX1
    • Insert the new disk into the system and ensure it is recognized by the operating system.
    • Partition the new disk using a partitioning tool like fdisk or parted. Create a Linux RAID (type FD) partition on the new disk, matching the partition layout of the remaining array members.
  3. Add the New Disk to the RAID Array:
    • Use the mdadm --manage /dev/mdX --add /dev/sdX1 command to add the new disk to the RAID array.
    • Replace /dev/mdX with the name of your RAID array and /dev/sdX1 with the partition name of the new disk.
    • This command starts the process of rebuilding the RAID array onto the new disk.
  4. Monitor the Rebuild Process:
    • Monitor the rebuild process using the mdadm --detail /dev/mdX command.
    • Check the progress and status of the rebuild operation to ensure it completes successfully.
    • The rebuild process may take some time depending on the size of the RAID array and the performance of the disks.
  5. Verify RAID Array Status:
    • After the rebuild process completes, verify the status of the RAID array using the mdadm --detail /dev/mdX command.
    • Ensure that all devices in the array are in the “active sync” state and that there are no errors or warnings.
  6. Update Configuration Files:
    • Update configuration files such as /etc/mdadm/mdadm.conf to ensure that the new disk is recognized and configured correctly in the RAID array.
  7. Perform Testing and Verification:
    • Perform thorough testing to ensure that the RAID array is functioning correctly and that data integrity is maintained.
    • Test read and write operations on the array to verify its performance and reliability.
  8. Optional: Clean Up References to the Failed Disk:
    • If the failed disk was only marked as faulty in step 2 and is still listed in the array, remove it once the rebuild is complete using the mdadm --manage /dev/mdX --remove /dev/sdX1 command.
    • This step is optional but helps keep the configuration clean and removes any lingering references to the failed disk.

By following these steps, you can safely replace a bad disk in a Linux RAID configuration using the mdadm utility while maintaining data integrity and ensuring the continued operation of the RAID array.
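
Put together, replacing a failed member of /dev/md0 might look like this sketch, run as root or with sudo; the device names (/dev/sdb1 as the failed partition, /dev/sda as a healthy member, /dev/sdc as the replacement disk) are assumptions, so substitute your own:

# Confirm which member has failed
mdadm --detail /dev/md0
cat /proc/mdstat

# Mark the failed partition as faulty and remove it from the array
mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm --manage /dev/md0 --remove /dev/sdb1

# After physically swapping the disk, copy the partition layout
# from a healthy member onto the replacement (sfdisk dump/restore)
sfdisk -d /dev/sda | sfdisk /dev/sdc

# Add the new partition and watch the rebuild progress
mdadm --manage /dev/md0 --add /dev/sdc1
watch cat /proc/mdstat

# Once the rebuild finishes, refresh mdadm.conf so the array
# assembles correctly at boot
mdadm --detail --scan >> /etc/mdadm/mdadm.conf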

What is RAID and how do you configure it in Linux?

RAID (Redundant Array of Independent Disks) is a technology used to combine multiple physical disk drives into a single logical unit for data storage, with the goal of improving performance, reliability, or both. RAID arrays distribute data across multiple disks, providing redundancy and/or improved performance compared to a single disk.

There are several RAID levels, each with its own characteristics and benefits. Some common RAID levels include RAID 0, RAID 1, RAID 5, RAID 6, and RAID 10. Each RAID level uses a different method to distribute and protect data across the disks in the array.

Here’s a brief overview of some common RAID levels:

  1. RAID 0 (Striping):
    • RAID 0 offers improved performance by striping data across multiple disks without any redundancy.
    • It requires a minimum of two disks.
    • Data is distributed evenly across all disks in the array, which can improve read and write speeds.
    • However, there is no redundancy, so a single disk failure can result in data loss for the entire array.
  2. RAID 1 (Mirroring):
    • RAID 1 provides redundancy by mirroring data across multiple disks.
    • It requires a minimum of two disks.
    • Data written to one disk is simultaneously written to another disk, providing redundancy in case of disk failure.
    • RAID 1 offers excellent data protection, and reads can be served from either disk, but it does not provide the striping throughput benefits of RAID 0, and usable capacity is only half of the total disk space.
  3. RAID 5 (Striping with Parity):
    • RAID 5 combines striping with parity data to provide both improved performance and redundancy.
    • It requires a minimum of three disks.
    • Data is striped across multiple disks, and parity information is distributed across all disks.
    • If one disk fails, data can be reconstructed using parity information stored on the remaining disks.
  4. RAID 6 (Striping with Dual Parity):
    • RAID 6 is similar to RAID 5 but includes an additional level of redundancy.
    • It requires a minimum of four disks.
    • RAID 6 can tolerate the failure of up to two disks simultaneously without data loss.
    • It provides higher fault tolerance than RAID 5 but may have slightly lower performance due to the additional parity calculations.
  5. RAID 10 (Striping and Mirroring):
    • RAID 10 combines striping and mirroring to provide both improved performance and redundancy.
    • It requires a minimum of four disks.
    • Data is striped across mirrored sets of disks, offering both performance and redundancy benefits of RAID 0 and RAID 1.
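
As a worked example of the capacity trade-offs between these levels: with four 2 TB disks, RAID 0 provides 4 × 2 TB = 8 TB of usable space but no fault tolerance; RAID 5 provides (4 - 1) × 2 TB = 6 TB and survives one disk failure; RAID 6 provides (4 - 2) × 2 TB = 4 TB and survives two; RAID 10 also provides 4 TB and survives one failure per mirrored pair.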

To configure RAID in Linux, you typically use software-based RAID management tools provided by the operating system. The most commonly used tool for configuring RAID in Linux is mdadm (Multiple Device Administration), which is a command-line utility for managing software RAID devices.

Here’s a basic outline of the steps to configure RAID using mdadm in Linux:

  1. Install mdadm (if not already installed): sudo apt-get install mdadm on Debian/Ubuntu, or sudo yum install mdadm on CentOS/RHEL.
  2. Prepare the disks:
    • Ensure that the disks you plan to use for RAID are connected and recognized by the system.
    • Partition the disks using a partitioning tool like fdisk or parted. Create Linux RAID (type FD) partitions on each disk.
  3. Create RAID arrays:
    • Use the mdadm command to create RAID arrays based on the desired RAID level.
    • For example, to create a RAID 1 array with two disks (/dev/sda and /dev/sdb): sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
  4. Format and mount the RAID array:
    • Once the RAID array is created, format it with a filesystem of your choice (e.g., ext4) using the mkfs command.
    • Mount the RAID array to a mount point in the filesystem.
  5. Update configuration files:
    • Update configuration files such as /etc/mdadm/mdadm.conf to ensure that the RAID array configuration is persistent across reboots.
  6. Monitor and manage RAID arrays:
    • Use mdadm commands to monitor and manage RAID arrays, such as adding or removing disks, checking array status, and replacing failed disks.

These are general steps for configuring software RAID using mdadm in Linux. The exact commands and procedures may vary depending on the specific RAID level and configuration requirements. It’s essential to refer to the documentation and guides specific to your Linux distribution and RAID configuration.
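
A minimal end-to-end sketch for a two-disk RAID 1 array; the device names /dev/sdb1 and /dev/sdc1 and the mount point /mnt/raid1 are assumptions, so adjust them to your system:

# Create the mirrored array from two prepared RAID partitions
sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1

# Create a filesystem on the array and mount it
sudo mkfs.ext4 /dev/md0
sudo mkdir -p /mnt/raid1
sudo mount /dev/md0 /mnt/raid1

# Persist the array definition so it assembles at boot
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
sudo update-initramfs -u    # Debian/Ubuntu; other distributions may differ

# Check the array status
cat /proc/mdstat
sudo mdadm --detail /dev/md0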

Linux: Troubleshooting network connectivity issues

Troubleshooting network connectivity issues in Linux involves identifying and diagnosing the root cause of the problem by checking various network components and configurations. Here’s a systematic approach to troubleshoot network connectivity issues in Linux:

  1. Check Physical Connections:
    • Ensure that all network cables are securely connected, and network interfaces (Ethernet, Wi-Fi) are properly seated in their respective ports.
  2. Verify Network Interface Status:
    • Use the ip or ifconfig command to check the status of network interfaces: ip addr show or ifconfig -a
    • Ensure that the network interface is up (UP state) and has an IP address assigned.
  3. Check IP Configuration:
    • Use the ip or ifconfig command to verify the IP address, subnet mask, gateway, and DNS server settings of the network interface.
    • Ensure that the IP configuration is correct and matches the network configuration of your environment.
  4. Verify DNS Resolution:
    • Use the ping command to test DNS resolution by pinging a domain name: ping example.com
    • If DNS resolution fails, check the /etc/resolv.conf file for correct DNS server configurations and try using alternative DNS servers.
  5. Test Local Network Connectivity:
    • Use the ping command to test connectivity to other devices on the local network by pinging their IP addresses: ping <IP_address>
    • If local pings fail, check the network configuration of the local device, including IP address, subnet mask, and gateway settings.
  6. Check Firewall Settings:
    • Disable the firewall temporarily using the appropriate command for your firewall software (e.g., ufw disable for Uncomplicated Firewall).
    • If network connectivity improves after disabling the firewall, adjust firewall rules to allow necessary network traffic.
  7. Inspect Routing Table:
    • Use the ip route command to view the routing table and ensure that the default gateway is configured correctly: ip route show
    • If necessary, add or modify routing entries using the ip route add command.
  8. Check Network Services:
    • Verify that essential network services (such as the DHCP client, network manager, and DNS resolver) are running using the systemctl command: systemctl status NetworkManager and systemctl status systemd-resolved
    • Restart or troubleshoot network services as needed.
  9. Review System Logs:
    • Check system logs (e.g., /var/log/syslog, /var/log/messages) for any network-related errors or warnings that may provide clues about the issue: tail -n 50 /var/log/syslog
  10. Test Connectivity to External Resources:
    • Use the ping or traceroute command to test connectivity to external servers and websites: ping google.com or traceroute google.com
    • If external pings or traceroutes fail, check for network issues outside your local network, such as ISP problems or internet service disruptions.

By following these steps and systematically checking network components and configurations, you can effectively troubleshoot and resolve network connectivity issues in Linux.
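
As a quick first pass, the checks above can be run in sequence; this sketch assumes a systemd-based system running NetworkManager and systemd-resolved, with 192.168.1.1 standing in for your local gateway:

# Interface state and addresses
ip addr show

# Routing table and default gateway
ip route show

# Reachability: local gateway, an external IP, then a DNS name
ping -c 3 192.168.1.1
ping -c 3 8.8.8.8
ping -c 3 example.com

# DNS configuration and network service status
cat /etc/resolv.conf
systemctl status NetworkManager systemd-resolved --no-pager

# Recent network-related log entries
journalctl -u NetworkManager -n 50 --no-pager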

Linux: systemd target units examples

Here is a list of some systemd target units along with examples of how to use them:

  1. multi-user.target:
    • This target is used for a multi-user system without a graphical interface. It includes services required for a text-based or command-line environment.
    • Example: To switch to the multi-user target, you can use the following command: sudo systemctl isolate multi-user.target
  2. graphical.target:
    • Represents a multi-user system with a graphical interface (GUI). It includes services required for a graphical desktop environment.
    • Example: To switch to the graphical target, you can use the following command: sudo systemctl isolate graphical.target
  3. rescue.target:
    • Similar to runlevel 1 or single-user mode in traditional SysVinit systems. It provides a minimal environment with a root shell for system recovery and maintenance tasks.
    • Example: To switch to the rescue target, you can use the following command: sudo systemctl isolate rescue.target
  4. emergency.target:
    • Provides the most minimal environment possible, intended for emergencies where the system is in an unusable state. It drops the system into a single-user shell without starting any services.
    • Example: To switch to the emergency target, you can use the following command: sudo systemctl emergency
  5. shutdown.target:
    • Used to gracefully shut down the system. All services are stopped, and the system is powered off or rebooted, depending on the shutdown command used.
    • Example: To initiate a shutdown that pulls in this target, you can use the following command: sudo systemctl poweroff
  6. network.target:
    • Represents the availability of the network. Other services that depend on network connectivity may be started after this target is reached.
    • Example: To view the status of the network target, you can use the following command: systemctl status network.target
  7. sockets.target:
    • Represents the availability of system sockets. Services that provide network services via sockets may be started after this target is reached.
    • Example: To view the status of the sockets target, you can use the following command: systemctl status sockets.target

These are some of the systemd target units along with examples of how to use them. Depending on your specific distribution and configuration, there may be additional targets or custom targets defined. You can explore more targets and their usage by referring to the systemd documentation or using the systemctl list-units --type=target command.
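
Two related commands that often come up alongside isolate are get-default and set-default, which show and change the target the system boots into; for example:

# Show the target the system currently boots into
systemctl get-default

# Make the text-mode target the default at boot (takes effect on next boot)
sudo systemctl set-default multi-user.target

# List every target unit known to the system, including inactive ones
systemctl list-units --type=target --all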

Linux: Systemd target units

Systemd target units are used to group and manage services and other units in Linux distributions that use systemd as the init system. Targets are similar to runlevels in traditional SysVinit systems but offer more flexibility and granularity in defining system states and dependencies between units.

Here are some common systemd target units:

  1. default.target:
    • This is the default target unit that the system boots into. It typically represents the normal operational state of the system.
  2. multi-user.target:
    • Represents a multi-user system without a graphical interface. It includes services required for a text-based or command-line environment.
  3. graphical.target:
    • Represents a multi-user system with a graphical interface (GUI). It includes services required for a graphical desktop environment.
  4. rescue.target:
    • Similar to runlevel 1 or single-user mode in traditional SysVinit systems. It provides a minimal environment with a root shell for system recovery and maintenance tasks.
  5. emergency.target:
    • Provides the most minimal environment possible, intended for emergencies where the system is in an unusable state. It drops the system into a single-user shell without starting any services.
  6. shutdown.target:
    • Used to gracefully shut down the system. All services are stopped, and the system is powered off or rebooted, depending on the shutdown command used.
  7. poweroff.target:
    • Initiates a system poweroff, shutting down the system and powering off the hardware.
  8. reboot.target:
    • Initiates a system reboot, shutting down the system and restarting the hardware.
  9. network.target:
    • Represents the network being available. Other services that depend on network connectivity may be started after this target is reached.
  10. basic.target:
    • A minimal target that is reached early during the boot process. It includes basic system initialization and dependency handling.
  11. sockets.target:
    • Represents the availability of system sockets. Services that provide network services via sockets may be started after this target is reached.
  12. timers.target:
    • Represents the availability of system timers. Services that depend on timers for scheduling tasks may be started after this target is reached.

These are some of the common systemd target units used in Linux distributions. Depending on the specific distribution and configuration, there may be additional targets or custom targets defined. You can view the available targets on your system using the systemctl list-units --type=target command.
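
To see how these targets relate to one another and which services each one pulls in, systemctl can print a dependency tree; for example:

# List the units and other targets pulled in by multi-user.target
systemctl list-dependencies multi-user.target

# The same view for whatever default.target currently points to
systemctl list-dependencies default.target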