InfoSec Write-ups

A collection of write-ups from the best hackers in the world on topics ranging from bug bounties and CTFs to vulnhub machines, hardware challenges and real life encounters. Subscribe to our weekly newsletter for the coolest infosec updates: https://weekly.infosecwriteups.com/

Using Python for Malware Analysis: A Beginner's Guide

Overview

Malware is malicious software intended to harm computer systems and networks, for example by stealing or misusing confidential information without authorization or by saturating network bandwidth. The threat keeps growing and affects everyone from individuals to entire organizations. To detect such software and defend against it, analysts perform malware analysis: the process of examining malware to understand how it works and how to defend against it.

It is an important and growing field in the cybersecurity industry. Among the languages used for malware analysis, Python is one of the most popular choices because of its flexibility, ease of use, and wide selection of libraries and tools.

Malware

Malware is designed to harm or exploit computers and networks. Malware attacks can cause significant damage to organizations and individuals, ranging from theft of sensitive data to disruption of critical systems. Before discussing how Python can be used for malware analysis, it helps to know the forms malware can take and how it enters and harms computer systems and networks. Malware can be broadly classified into four categories: viruses, worms, Trojans, and ransomware.

  • Viruses are programs that can replicate themselves by infecting other files or systems. They can be spread through email attachments, infected websites, or file-sharing networks. Once a virus infects a system, it can cause damage by corrupting files or stealing sensitive information.
  • Worms are similar to viruses in that they can replicate themselves, but they do not require a host file. Instead, they can spread across networks or the internet on their own. Worms can cause damage by consuming network bandwidth, crashing systems, or stealing data.
  • Trojans are software applications that present themselves as trustworthy but actually contain harmful code. They can be downloaded from the internet or spread through email attachments. Once a Trojan infects a system, it can give attackers remote access to the system, steal sensitive information, or cause other types of damage.
  • Ransomware is a type of malware that encrypts files on a system and demands money in return for the decryption key. It can be spread through infected email attachments, infected websites, or file-sharing networks. Ransomware can cause significant damage by encrypting critical files and rendering them unusable.

Python for Malware Analysis

Python is a popular programming language among malware analysts due to its versatility and ease of use. Python’s extensive library of modules and tools can streamline the process of analyzing malware samples and identifying their behavior.

Python is also well suited to automating repetitive tasks within the malware analysis workflow, which makes the process more efficient. One of the significant advantages of using Python for malware analysis is the availability of libraries and tools for tasks like disassembly or reverse engineering. These tools enable analysts to extract and analyze the underlying code of malware samples, helping them to understand how the malware works and what it does.

Additionally, Python’s high-level syntax and dynamic typing allow analysts to write concise code that is easy to understand and maintain. Its cross-platform compatibility ensures that analysts can use Python on a wide range of operating systems and devices. Overall, Python’s flexibility, ease of use, and extensive library of modules make it an ideal choice for analyzing malware samples and understanding their behavior.

Tools and Libraries for Malware Analysis with Python

The Python ecosystem offers a wide variety of tools and libraries that can be used for malware analysis. Below is a selection of some of the most popular ones:

  • Pyew

Pyew is a Python-based command-line tool that allows users to perform forensic analysis on malware samples. It has a number of features, such as the ability to identify file types, convert files, and disassemble. Pyew can extract details about a file’s headers, sections, and imports as well as its overall structure and contents. Pyew can also analyze the code section of a file and identify any suspicious behavior, such as the presence of packers or obfuscation techniques. The Pyew Python library can be used to automate the analysis of Portable Executable files and extract information about the malware’s behavior.
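The packer and obfuscation check mentioned above is often approximated by measuring byte entropy, since compressed or encrypted data tends to look close to random. The following is a minimal sketch of that heuristic using only the standard library. It does not use Pyew, and the function names and the 7.0 threshold are assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    entropy = 0.0
    for count in Counter(data).values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


def looks_packed(data: bytes, threshold: float = 7.0) -> bool:
    """Heuristic only: high entropy hints at compressed or encrypted content.

    The threshold is an assumed value, not a standard; tune it on real samples.
    """
    return shannon_entropy(data) >= threshold
```

In practice the function would be applied to the raw bytes of each section rather than to the whole file, because a single packed section can be hidden by low-entropy padding elsewhere.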

  • Yara

Yara is a powerful open-source tool that allows the creation of rules for identifying malware based on specific characteristics, such as file names, hashes, and strings. Yara rules are written in a simple and flexible syntax that is easy to understand and modify. The yara-python library integrates Yara rules into Python scripts for automated malware analysis, so a script can scan a single file, or every file in a directory, against a set of rules.
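To make the idea of matching on hashes and strings concrete, here is a small standard-library sketch. It is not Yara and does not use Yara's rule syntax or engine; it only illustrates the underlying idea of checking a sample against known indicators. The indicator values and the names `KNOWN_BAD_SHA256`, `SUSPICIOUS_STRINGS`, and `match_indicators` are hypothetical.

```python
import hashlib
from typing import List

# Hypothetical indicators for illustration only; not real malware signatures.
KNOWN_BAD_SHA256 = {hashlib.sha256(b"example-bad-sample").hexdigest()}
SUSPICIOUS_STRINGS = {
    "downloader_url": b"http://malicious.example/payload",
    "registry_run_key": b"CurrentVersion\\Run",
}


def match_indicators(data: bytes) -> List[str]:
    """Return the names of all indicators that match the sample bytes."""
    hits = []
    # Exact-file match: compare the SHA-256 digest against the known-bad set.
    if hashlib.sha256(data).hexdigest() in KNOWN_BAD_SHA256:
        hits.append("known_bad_hash")
    # Content match: look for each byte pattern anywhere in the sample.
    for name, pattern in SUSPICIOUS_STRINGS.items():
        if pattern in data:
            hits.append(name)
    return hits
```

A real Yara rule adds conditions, wildcards, and regular expressions on top of this, which is why a dedicated rule engine is preferable once the indicator set grows.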

  • Scapy

Scapy is a powerful Python package for packet manipulation and network analysis. It allows the creation and manipulation of network packets and the analysis of network traffic. Scapy can be used to identify suspicious network traffic generated by malware, such as connections to command-and-control servers or data exfiltration. It can also be used to automate the analysis of network traffic and extract information about the malware’s behavior.

  • angr

angr is a powerful open-source binary analysis framework that allows the analysis of binary code in an automated and scalable manner. angr can perform various analysis tasks, such as symbolic execution, concolic execution, and taint analysis, to extract information about the behavior and vulnerabilities of a binary. It provides a Python API to interact with the binary, so analysis can be automated to identify malicious or vulnerable behavior.

  • r2pipe

r2pipe is a Python library that provides a Python interface to the radare2 framework, which is a popular open-source reverse engineering platform. r2pipe can be used to interact with a binary file and perform various analysis tasks, such as disassembly, debugging, and patching. r2pipe provides a simple and flexible API that allows the integration of radare2 into Python scripts for automated analysis. The r2pipe Python library can be used to automate the analysis of binary code and identify any suspicious or vulnerable behavior.

  • AnalyzePE

AnalyzePE is a Python library that allows the extraction of structured information from binary files. This library provides functionalities to access the headers, sections, imports, and other important metadata present in the binary files. It also provides functions to analyze the code section of a binary file and identify any suspicious behavior, such as the presence of packers or obfuscation techniques. This library can be used to automate the analysis of binary files and extract information about the behavior of the program.
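To show what accessing headers and sections involves, the sketch below reads the section table of a PE file with nothing but the `struct` module. It does not use AnalyzePE; the function name `parse_pe_sections` and the returned dictionary keys are assumptions made for this example, while the offsets follow the published PE/COFF layout.

```python
import struct
from typing import Dict, List


def parse_pe_sections(data: bytes) -> List[Dict[str, object]]:
    """Return name, virtual address, raw size, and raw offset per PE section."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The 4-byte field at offset 0x3C (e_lfanew) points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the signature.
    _machine, num_sections, _, _, _, opt_size, _ = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4
    )
    # The section table starts right after the optional header.
    table = pe_offset + 4 + 20 + opt_size
    sections = []
    for i in range(num_sections):
        # Each section header is 40 bytes; only the first 24 are read here.
        name, _vsize, vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", data, table + i * 40
        )
        sections.append({
            "name": name.rstrip(b"\x00").decode("ascii", errors="replace"),
            "virtual_address": vaddr,
            "raw_size": raw_size,
            "raw_offset": raw_ptr,
        })
    return sections
```

Typical usage would be `parse_pe_sections(open("sample.exe", "rb").read())`, where the file name is a placeholder. Unusual section names or a raw size of zero are common starting points for a closer look.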

Example

Here we describe an example of performing malware analysis using the ‘pyew’ library.

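# Note: Pyew is mainly a command-line tool; the class and method names used
# below come from this example and may not match your installed Pyew version.
import pyew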

# Load the executable file
pe = pyew.PE("malware.exe")

# Analyze the code section
code_section = pe.get_section_by_name(".text")
code_bytes = code_section.get_data()

# Disassemble the code
disasm = pyew.Dasm(code_bytes, code_section.get_addr())

# Find function calls
for instruction in disasm:
    if instruction.mnemonic == "call":
        # Extract the target address
        target_addr = instruction.op1.get_val()
        # Check if the target address is in the imports
        for imp in pe.get_imports():
            if imp["Address"] == target_addr:
                print("Function call to:", imp["Name"])

In this example, we first load the PE (Portable Executable) file using pyew.PE. We then extract the code section of the file using get_section_by_name and get_data. We then make use of the pyew.Dasm class to disassemble the code section and analyze its contents. The Dasm constructor takes the code bytes and the base address of the section as its arguments.

We can then iterate over the instructions in the disassembly using a for loop, and check for specific types of instructions, such as function calls. In this example, we check for instructions with the mnemonic “call”, and extract the target address of the call. We then check if the target address is present in the imported functions of the PE file, and print the name of the imported function if it is called.

This is just a simple example of how we could use pyew to analyze a PE file. Depending on the nature of the file and the specific analysis goals, we could perform many other types of analysis using pyew as well as other tools in the Python ecosystem.
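One such additional analysis is extracting printable strings from a sample, which often reveals URLs, file paths, or registry keys before any disassembly is done. The sketch below uses only the standard library rather than pyew; the function name `extract_strings` and the default minimum length of 4 are assumptions for illustration.

```python
import re
from typing import List


def extract_strings(data: bytes, min_length: int = 4) -> List[str]:
    """Return runs of printable ASCII characters at least min_length long."""
    # Bytes 0x20 to 0x7e cover the printable ASCII range, including space.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]
```

For example, `extract_strings(open("malware.exe", "rb").read())` would list the readable strings in the file used in the example above. This only finds single-byte ASCII text; UTF-16 strings, which are common in Windows binaries, would need a second pattern.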

Conclusion

In this article, we discussed what malware is, why it is harmful, and how malware analysis helps us understand and defend against it. We also discussed why and how Python tools and libraries can be used to perform malware analysis. To summarize:

  • Malware analysis is the process of analyzing malware to understand how it works and uncover its behavior and capabilities.
  • Python is a popular and powerful programming language that can be used for malware analysis due to its flexibility, ease of use, and availability of libraries and tools.
  • Hence, if you are a newbie and getting into web development, cyber security, or any other trending technology, it is ideal to master Python programming.
  • There are several key steps involved in using Python for malware analysis, including setting up a virtual environment, installing the required libraries, and analyzing the malware sample.
  • There are several types of malware, including Trojans, worms, viruses, and ransomware.
  • The Python ecosystem offers a wide range of tools for analyzing malware, including pyew, scapy, yara, angr, r2pipe, and AnalyzePE.
  • The typical process of malware analysis includes extracting information about the malware sample, analyzing its behavior, and understanding its network activity.
  • The information obtained from malware analysis can help in identifying the type of malware, understanding its capabilities, and developing effective strategies for detection and mitigation.

Published in InfoSec Write-ups

Written by Sarang S. Babu

A tech enthusiast with a great taste in technology, avid gamer and a marketer by profession. 😎
