Investigating The Files With Forensics | CTF Newbies

By HackTheBox SRMIST. Published in InfoSec Write-ups, Apr 11, 2024 (8 min read).

Forensics is the art of recovering the digital trail left on a computer. There are various methods to find data that is seemingly deleted, not stored, or worse, covertly recorded. In a CTF context, “Forensic” challenges can include file format analysis, steganography, memory dump analysis, or network packet capture analysis.

For solving forensics CTF challenges, the three most useful abilities are probably:

  • Knowing a scripting language (e.g., Python).
  • Knowing how to manipulate binary data (byte-level manipulations) in that language.
  • Recognizing formats, protocols, structures, and encodings.

CTF Challenges

CTF (Capture The Flag) forensics challenges are a type of cybersecurity competition where participants are tasked with analyzing digital evidence, conducting forensic investigations, and solving puzzles to uncover hidden information or solve specific objectives. Here are some common types of CTF forensics challenges:

  1. File Analysis: Participants may be given files such as disk images, memory dumps, network captures, or log files to analyze.
  2. Steganography: Steganography involves hiding messages or data within other files or media to conceal their existence.
  3. Network Forensics: Participants may be provided with network traffic captures or packet captures obtained from a compromised system or network.
  4. Memory Forensics: Memory forensics involves analyzing the volatile memory (RAM) of a system to extract information such as running processes, open network connections, and artifacts left behind by malware.
  5. Log Analysis: Log files from various sources such as web servers, firewalls, and intrusion detection systems (IDS) may be provided to participants.

Common forensics concepts

Below follows a high-level overview of some of the common concepts in forensics CTF challenges, and some recommended commands for performing common tasks.

~Initial Analysis

Forensic CTF challenges often require exploratory steps to determine what to do next. The following tools are a good starting point:

  • strings: search for plaintext strings in a file.
  • grep: search for a particular string in a file.
  • bgrep: search for non-text data patterns.
  • hexdump: displays the content of a file in hexadecimal.
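When none of these tools is at hand, the same first look can be approximated in Python. The sketch below is only an illustration: extract_strings and find_pattern are hypothetical helper names, not part of any of the tools listed above, and the 4-character minimum is an assumed default.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def find_pattern(data: bytes, needle: bytes) -> list[int]:
    """Return every offset at which the raw byte pattern occurs."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


if __name__ == "__main__":
    sample = b"\x00\x01flag{demo}\x00ab\x00hello\xff"
    print(extract_strings(sample))
    print(find_pattern(sample, b"\x00"))
```

Reading the whole file into memory is fine for typical challenge files; a very large image would need chunked reads instead.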

~Encodings

Binary data is a sequence of 1s and 0s, but it is often transmitted as text. Writing out literal sequences such as 101010101 would be wasteful, so the data is first encoded using one of a variety of methods.

The ability to recognize encodings is beneficial when solving forensic CTF challenges. Certain encodings, such as Base64, are easily identifiable by their alphanumeric character set (plus “+” and “/”) and their “=” padding suffix (when present). See the example below (the --decode long option is accepted by both GNU and macOS base64; the short form is -d on GNU and -D on older macOS):

$ echo aGVsbG8gd29ybGQh | base64 --decode
hello world!
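The same check can be scripted with Python's standard base64 module. This is a minimal sketch: try_base64 is a hypothetical helper, and restoring stripped “=” padding is an assumption about how challenge strings are often presented.

```python
import base64
import binascii
from typing import Optional


def try_base64(text: str) -> Optional[bytes]:
    """Decode text as Base64, restoring stripped '=' padding.

    Returns None when the text is not valid Base64.
    """
    cleaned = "".join(text.split())       # drop whitespace and line breaks
    cleaned += "=" * (-len(cleaned) % 4)  # pad to a multiple of four
    try:
        return base64.b64decode(cleaned, validate=True)
    except (binascii.Error, ValueError):
        return None


if __name__ == "__main__":
    print(try_base64("aGVsbG8gd29ybGQh"))
```

Returning None instead of raising makes it easy to run the helper over every candidate string in a file and keep only the ones that decode.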

~File Formats

Common file formats one can encounter during forensics CTF challenges are:

  • Archive files (ZIP, TGZ)
  • Image file formats (JPG, GIF, BMP, PNG)
  • Packet captures (PCAP, PCAPNG)
  • Video (especially MP4) or Audio (especially WAV, MP3)
  • Microsoft’s Office formats (RTF, OLE, OOXML)

~File Signatures

File extensions are not the only way to identify a file's type. Files also carry leading bytes, called file signatures (or file magic numbers), which identify the format and allow programs to parse the data consistently. Signatures are generally 2–4 bytes long, though some formats use longer ones, and are found at the beginning of the file. Because files sometimes come without an extension, or with an incorrect one, file signature analysis is used to identify the actual format (file type).

A Hex Editor is recommended to view file signatures. Once you find the file signature, you can check it against file signature repositories such as Gary Kessler’s.
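A signature lookup is also easy to script. The sketch below checks the first bytes of a file against a small, hand-picked table; the SIGNATURES dictionary and the identify and identify_file names are illustrative assumptions and cover only a handful of formats, so a full repository is still needed for anything unusual.

```python
# A few well-known signatures; a real repository lists many more.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"\xff\xd8\xff": "JPEG image",
    b"GIF87a": "GIF image",
    b"GIF89a": "GIF image",
    b"PK\x03\x04": "ZIP archive (also used by OOXML documents)",
    b"\x1f\x8b": "gzip data",
    b"%PDF-": "PDF document",
    b"\xd4\xc3\xb2\xa1": "PCAP capture (little-endian)",
}


def identify(header: bytes) -> str:
    """Match the leading bytes of a file against the known signatures."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read just enough of the file to cover the longest signature."""
    with open(path, "rb") as fh:
        return identify(fh.read(16))


if __name__ == "__main__":
    print(identify(b"\xff\xd8\xff\xe0\x00\x10JFIF"))
```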

~Image File Analysis

An image file’s metadata can be viewed using ExifTool. The tool displays metadata for an input file, including file size, dimensions (width and height), file type, and the program used to create it (e.g., Photoshop). On Windows, where the standalone executable is named exiftool(-k).exe, run the following command (on Linux or macOS the command is simply exiftool):

exiftool(-k).exe [filename]

~Timestamps

Timestamps are data that indicate the time of certain events (MAC):

  • Modification: when a file was last modified.
  • Access: when a file was last read or accessed.
  • Creation: when a file was created.

Certain events such as creating, moving, copying, opening, or editing a file might affect the MAC times. If the MAC timestamps can be obtained, a timeline of events can be built.
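On a mounted image or an extracted directory, these timestamps can be read with os.stat. The following is a rough sketch under one important assumption spelled out in the comments: st_ctime means creation time on Windows but inode-change time on Unix-like systems, so it is labelled "ctime" here. The mac_times and timeline names are hypothetical.

```python
import os
from datetime import datetime, timezone


def mac_times(path: str) -> dict[str, datetime]:
    """Return the file's timestamps as timezone-aware UTC datetimes."""
    st = os.stat(path)

    def to_utc(ts: float) -> datetime:
        return datetime.fromtimestamp(ts, tz=timezone.utc)

    return {
        "modified": to_utc(st.st_mtime),
        "accessed": to_utc(st.st_atime),
        # Creation time on Windows, metadata-change time on Unix.
        "ctime": to_utc(st.st_ctime),
    }


def timeline(paths: list[str]) -> list[str]:
    """Order the paths from oldest to newest modification time."""
    return sorted(paths, key=lambda p: os.stat(p).st_mtime)
```

Note that reading a file on a live system may itself update the access time, which is one reason disk images are usually examined read-only.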

~Steganography

Steganography is the art of hiding data in images or audio. While extraordinarily rare in the real world, steganography is often a popular CTF challenge. Steganography could be implemented using any kind of data as the “cover text” but media file formats are ideal because they tolerate a certain amount of unnoticeable data loss. One example is the Least Significant Bit (LSB) Steganography, where data is recorded in the lowest bit of a byte.

Files are made of bytes, and each byte is composed of eight bits. Changing the least significant bit alters the byte's value by at most 1, so a pixel or audio sample modified this way looks or sounds practically unchanged.
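To make the idea concrete, here is a minimal sketch of LSB embedding and extraction over a raw byte buffer. It assumes one secret bit per cover byte, most significant bit first, and works on plain bytes rather than a real image format; actual tools differ in bit order, channel selection, and headers, and embed_lsb and extract_lsb are hypothetical names.

```python
def embed_lsb(cover: bytes, secret: bytes) -> bytes:
    """Store each bit of secret in the lowest bit of successive cover bytes."""
    bits = [(byte >> shift) & 1 for byte in secret for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for the secret")
    out = bytearray(cover)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the lowest bit, then set it
    return bytes(out)


def extract_lsb(stego: bytes, n_bytes: int) -> bytes:
    """Rebuild n_bytes of secret from the lowest bits of the stego data."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for carrier in stego[i * 8:(i + 1) * 8]:
            value = (value << 1) | (carrier & 1)
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    cover = bytes(range(200, 232))       # stand-in for 32 pixel values
    stego = embed_lsb(cover, b"HTB")
    print(extract_lsb(stego, 3))
```

The extractor has to be told how many bytes to read, which mirrors the practical problem in challenges: the length, bit order, and channel are usually unknown and must be guessed.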

The difficulty with steganography is that extracting the hidden message requires not only detecting that steganography has been used, but also identifying the exact steganographic tool used to embed it. A bit of trial and error might be required.

Recommended tools to tackle steganography include:

  • Stegsolve: used to apply various steganography techniques to image files in an attempt to detect and extract hidden data.
  • Steghide: hides data in various kinds of image and audio files.
  • zsteg: detect hidden data in PNG and BMP files.
  • OpenStego: free steganography solution.
  • Foremost: a forensic program to recover lost files based on their headers, footers, and internal data structures.
  • StegOnline: online steganography tool.

~Filesystem Analysis

Occasionally, a forensic CTF challenge will involve a full disk image. A disk image is a computer file containing the contents and structure of a disk volume or of an entire data storage device, such as a hard disk drive. The first logical step will be to mount the disk image file. Below is an example of mounting a CD-ROM filesystem image:

mkdir /mnt/challenge

mount -t iso9660 challengefile /mnt/challenge

Searching for a flag in a mounted disk image is like looking for a needle in a haystack, so a strategy is required. Once the filesystem is mounted, the tree command can be used to view the directory structure and see if anything sticks out as requiring further analysis. A bit of understanding and insight into well-known filesystems is also beneficial:

  • New Technology File System (NTFS): a modern, well-formed filesystem that is most commonly used by Windows.
  • File Allocation Table (FAT): a general purpose file system that is compatible with all major operating systems.
  • Extended (EXT) filesystem: created to be used with the Linux kernel (EXT4 is the most recent version).
  • Hierarchical File System (HFS) Plus: a file system developed by Apple for Mac OS X.

In certain cases, one might not be looking for a visible file within the filesystem, but rather a hidden volume, unallocated space (disk space that is not a part of any partition), a deleted file, or a non-file filesystem structure. For the recovery of deleted or missing files, the following tools are recommended:

  • extundelete: find deleted files in EXT3 and EXT4 filesystems.
  • TestDisk: recover missing partition tables, fix corrupted ones, undelete files on FAT or NTFS, etc.

Autopsy is a powerful open-source toolkit for filesystem analysis. Although more geared toward law-enforcement tasks, available features can be helpful for tasks like searching for a keyword across the entire disk image, or looking at the unallocated space.
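A keyword search over a raw image can also be sketched in a few lines of Python, which is handy when the image cannot be mounted. The example reads the file in chunks and keeps a small overlap so that a keyword split across two chunks is still found. The search_image name and the 1 MiB default chunk size are assumptions made for illustration; this is not how Autopsy implements its search.

```python
def search_image(path: str, keyword: bytes, chunk_size: int = 1 << 20) -> list[int]:
    """Return the file offsets of every occurrence of keyword."""
    if not keyword:
        raise ValueError("keyword must not be empty")
    overlap = len(keyword) - 1
    hits = []
    base = 0       # file offset of the first byte in window
    window = b""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            window += chunk
            pos = window.find(keyword)
            while pos != -1:
                hits.append(base + pos)
                pos = window.find(keyword, pos + 1)
            # Keep only a tail too short to hold a full match on its own.
            keep = window[-overlap:] if overlap else b""
            base += len(window) - len(keep)
            window = keep
    return hits
```

Because the search runs over raw bytes, it also sees unallocated space and deleted content that a mounted view would hide, although compressed or encrypted data will not match.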

~Network Traffic Analysis

Network traffic is captured and stored as a packet capture (PCAP) file using programs like tcpdump or Wireshark. A popular forensic CTF challenge is to provide a PCAP file representing some network traffic and challenge the player to recover or reconstitute a transferred file or transmitted secret. Complicating matters, the packets of interest are usually in an ocean of unrelated traffic, so triage and filtering of the data are also required.

For initial analysis, take a high-level view of the packets using Wireshark’s statistics or conversations view, or by running the capinfos command. Wireshark and its command-line version, tshark, both support filters that can reduce the scope of the analysis. Alternatively, PCAP files up to 50MB can be submitted to an online service called PacketTotal, which can graphically display timelines of connections and SSL metadata on the secure connections.
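For a quick look without any of these tools, the classic PCAP container is simple enough to walk with Python's struct module. The sketch below assumes the classic microsecond-resolution format (not PCAPNG and not the nanosecond variant) and only counts packet records rather than decoding protocols. summarize_pcap is a hypothetical helper and gives a far cruder summary than capinfos.

```python
import struct


def summarize_pcap(data: bytes) -> dict:
    """Count the packet records in a classic (non-PCAPNG) capture."""
    if len(data) < 24:
        raise ValueError("too short for a PCAP global header")
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"   # written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"   # written on a big-endian machine
    else:
        raise ValueError("not a classic microsecond PCAP file")
    major, minor, _zone, _sigfigs, snaplen, linktype = struct.unpack(
        endian + "HHiIII", data[4:24]
    )
    offset, packets, captured = 24, 0, 0
    while offset + 16 <= len(data):
        _sec, _usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16]
        )
        if offset + 16 + incl_len > len(data):
            break  # final record is truncated
        packets += 1
        captured += incl_len
        offset += 16 + incl_len
    return {
        "version": (major, minor),
        "snaplen": snaplen,
        "linktype": linktype,
        "packets": packets,
        "bytes": captured,
    }
```

A link type of 1 means Ethernet frames. For anything beyond counting, a dissector such as tshark remains the practical choice.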

Approach or Solving technique

~Wrong Spooky Season

  • “I told them it was too soon and in the wrong season to deploy such a website, but they assured me that theming it properly would be enough to stop the ghosts from haunting us. I was wrong.” Now there is an internal breach in the Spooky Network and you need to find out what happened. Analyze the network traffic and find how the scary ghosts got in and what they did.

Solution:

Downloading the challenge files, we get a PCAP file. Opening it in Wireshark, we see the HTTP requests. Looking into them, we can see the command injection.

Analyzing the HTTP stream, we get the commands that were sent, and one command contains a Base64-encoded string.

Decoding and reversing the string, we get the flag.
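Since the order of the two operations matters, a small helper that tries both orders can save time. This is a sketch only: decode_candidates is a hypothetical name, and the sample string is built from a made-up value, not the challenge's actual flag.

```python
import base64
import binascii


def decode_candidates(blob: str) -> dict[str, bytes]:
    """Try both orderings of 'reverse' and 'Base64-decode' on blob."""
    attempts = {
        "decode_then_reverse": lambda s: base64.b64decode(s, validate=True)[::-1],
        "reverse_then_decode": lambda s: base64.b64decode(s[::-1], validate=True),
    }
    results = {}
    for label, attempt in attempts.items():
        try:
            results[label] = attempt(blob)
        except (binascii.Error, ValueError):
            pass  # this ordering does not produce valid Base64
    return results


if __name__ == "__main__":
    # Made-up example: a Base64 string stored back to front.
    sample = base64.b64encode(b"FLAG{demo}").decode()[::-1]
    print(decode_candidates(sample))
```

A string that begins with “=” is a strong hint that it was reversed after encoding, because Base64 padding only ever appears at the end.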

~PurpleThing

Step-1:

After downloading PurpleThing.jpeg from the cloud, I first tried strings PurpleThing.jpeg | grep { to look for a flag stored in plain text.

Step-2:

So I tried binwalk PurpleThing.jpeg as the question suggests.

It showed me the following output:

Clearly, there is hidden data in there, let’s extract that.

Step-3:

I ran binwalk -D 'image:png' PurpleThing.jpeg and got a directory named _PurpleThing.jpeg.extracted.

The directory contains several files. Among them, '25795.png' has the flag.
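The idea behind this extraction can be sketched in Python: scan the file for the PNG signature, then cut out everything up to the end of the IEND chunk. This illustrates signature-based carving in general and is not binwalk's actual implementation; carve_pngs is a hypothetical helper that ignores edge cases such as a signature appearing by chance inside compressed data.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
PNG_END = b"IEND\xaeB`\x82"  # IEND chunk type followed by its fixed CRC


def carve_pngs(data: bytes) -> list[tuple[int, bytes]]:
    """Return (offset, png_bytes) for each PNG embedded in data."""
    carved = []
    start = data.find(PNG_MAGIC)
    while start != -1:
        end = data.find(PNG_END, start)
        if end == -1:
            break  # signature without a closing IEND chunk
        stop = end + len(PNG_END)
        carved.append((start, data[start:stop]))
        start = data.find(PNG_MAGIC, stop)
    return carved


if __name__ == "__main__":
    fake_png = PNG_MAGIC + b"...chunks..." + PNG_END
    blob = b"\xff\xd8\xff" + b"jpegdata" + fake_png + b"trailer"
    for offset, png in carve_pngs(blob):
        print(hex(offset), len(png))
```

Each carved blob can then be written to disk and opened in an image viewer.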

Step-4:

Finally, the flag is:

HTF{b1nw4lk_is_us3ful}

and many more… You can always google for more 🛡️👨‍💻

Contribution

Dhruv Gupta

Adithya Sriram

Devansh Gupta


HackTheBox SRMIST focuses on training the next-gen of cyber-warriors transforming the cyber space in SRMIST and beyond. https://www.htbsrmist.tech