## TheHackerNews

#### Hackers Using Microsoft Build Engine to Deliver Malware Filelessly

###### 14 May 2021

Threat actors are abusing Microsoft Build Engine (MSBuild) to filelessly deliver remote access trojans and password-stealing malware on targeted Windows systems. The actively ongoing campaign is said to have emerged last month, researchers from cybersecurity firm Anomali said on Thursday, adding the malicious build files came embedded with encoded executables and shellcode that deploy backdoors,

#### Report to Your Management with the Definitive 'Incident Response for Management' Presentation Template

###### 14 May 2021

Security incidents occur. It's not a matter of 'if' but of 'when.' There are security products and procedures that were implemented to optimize the IR process, so from the 'security-professional' angle, things are taken care of. However, many security pros who are doing an excellent job in handling incidents find effectively communicating the ongoing process with their management a much more

###### 15 May 2021

Cybercriminals with suspected ties to Pakistan continue to rely on social engineering as a crucial component of their operations as part of an evolving espionage campaign against Indian targets, according to new research. The attacks have been linked to a group called Transparent Tribe, also known as Operation C-Major, APT36, and Mythic Leopard, which has created fraudulent domains mimicking

#### Magecart Hackers Now hide PHP-Based Backdoor In Website Favicons

###### 14 May 2021

Cybercrime groups are distributing malicious PHP web shells disguised as a favicon to maintain remote access to the compromised servers and inject JavaScript skimmers into online shopping platforms with an aim to steal financial information from their users. "These web shells known as Smilodon or Megalodon are used to dynamically load JavaScript skimming code via server-side requests into online

#### Big Cybersecurity Tips For Remote Workers Who Use Their Own Tech

###### 14 May 2021

As the total number of people working from home has grown dramatically in the last year or two, so has the number of individuals who use all of their own technology for their jobs. If you're a remote worker who relies on your own PC to get your work done, then you may be at a heightened risk for some of the major threats that are impacting the computer industry as a whole. Relatively few people

#### Rapid7 Source Code Breached in Codecov Supply-Chain Attack

###### 14 May 2021

Cybersecurity company Rapid7 on Thursday revealed that unidentified actors improperly managed to get hold of a small portion of its source code repositories in the aftermath of the software supply chain compromise targeting Codecov earlier this year. "A small subset of our source code repositories for internal tooling for our [Managed Detection and Response] service was accessed by an

#### Can Data Protection Systems Prevent Data At Rest Leakage?

###### 13 May 2021

Protection against insider risks works when the process involves controlling the data transfer channels or examining data sources. One approach involves preventing USB flash drives from being copied or sending them over email. The second one concerns preventing leakage or fraud in which an insider accesses files or databases with harmful intentions. What's the best way to protect your data? It

#### Dark Web Getting Loaded With Bogus Covid-19 Vaccines and Forged Cards

###### 14 May 2021

Bogus COVID-19 test results, fraudulent vaccination cards, and questionable vaccines are emerging as a hot commodity on the dark web in what's the latest in a long list of cybercrimes capitalizing on the coronavirus pandemic. "A new and troubling phenomenon is that consumers are buying COVID-19 vaccines on the black market due to the increased demand around the world," said Anne An, a senior

#### Nearly All Wi-Fi Devices Are Vulnerable to New FragAttacks

###### 14 May 2021

Three design and multiple implementation flaws have been disclosed in the IEEE 802.11 technical standard that undergirds Wi-Fi, potentially enabling an adversary to take control over a system and plunder confidential data. Called FragAttacks (short for FRagmentation and AGgregation Attacks), the weaknesses impact all Wi-Fi security protocols, from Wired Equivalent Privacy (WEP) all the way to Wi-Fi

#### Latest Microsoft Windows Updates Patch Dozens of Security Flaws

###### 12 May 2021

Microsoft on Tuesday rolled out its scheduled monthly security update with patches for 55 security flaws affecting Windows, Exchange Server, Internet Explorer, Office, Hyper-V, Visual Studio, and Skype for Business. Of these 55 bugs, four are rated as Critical, 50 are rated as Important, and one is listed as Moderate in severity. Three of the vulnerabilities are publicly known, although, unlike

#### Ransomware Gang Leaks Metropolitan Police Data After Failed Negotiations

###### 12 May 2021

The cybercrime syndicate behind Babuk ransomware has leaked more personal files belonging to the Metropolitan Police Department (MPD) after negotiations with the DC Police broke down, warning that they intend to publish all data if their ransom demands are not met. "The negotiations reached a dead end, the amount we were offered does not suit us, we are posting 20 more personal files on officers

#### LIVE Webinar — The Rabbit Hole of Automation

###### 11 May 2021

The concept of automation has taken on a life of its own in recent years. The idea is nothing new, but the current interest in automation is a mix of both hype and innovation. On the one hand, it's much easier today to automate everything from small processes to massive-scale tasks than it's ever been before. On the other hand, are we really prepared to hand the reins over to completely

#### U.S. Intelligence Agencies Warn About 5G Network Weaknesses

###### 12 May 2021

Inadequate implementation of telecom standards, supply chain threats, and weaknesses in systems architecture could pose major cybersecurity risks to 5G networks, potentially making them a lucrative target for cybercriminals and nation-state adversaries to exploit for valuable intelligence. The analysis, which aims to identify and assess risks and vulnerabilities introduced by 5G adoption, was

#### Experts warn of a new Android banking trojan stealing users' credentials

###### 11 May 2021

Cybersecurity researchers on Monday disclosed a new Android trojan that hijacks users' credentials and SMS messages to facilitate fraudulent activities against banks in Spain, Germany, Italy, Belgium, and the Netherlands. Called "TeaBot" (or Anatsa), the malware is said to be in its early stages of development, with malicious attacks targeting financial apps commencing in late March 2021,

#### U.S. Declares Emergency in 17 States Over Fuel Pipeline Cyber Attack

###### 11 May 2021

The ransomware attack against Colonial Pipeline's networks has prompted the U.S. Federal Motor Carrier Safety Administration (FMCSA) to issue a regional emergency declaration in 17 states and the District of Columbia (D.C.). The declaration provides a temporary exemption to Parts 390 through 399 of the Federal Motor Carrier Safety Regulations (FMCSRs), allowing alternate transportation of

#### Over 25% Of Tor Exit Relays Spied On Users' Dark Web Activities

###### 11 May 2021

An unknown threat actor managed to control more than 27% of the entire Tor network exit capacity in early February 2021, a new study on the dark web infrastructure revealed. "The entity attacking Tor users is actively exploiting tor users since over a year and expanded the scale of their attacks to a new record level," an independent security researcher who goes by the name nusenu said in a

#### Is it still a good idea to require users to change their passwords?

###### 10 May 2021

For as long as corporate IT has been in existence, users have been required to change their passwords periodically. In fact, the need for scheduled password changes may be one of the most long-standing of all IT best practices. Recently, however, things have started to change. Microsoft has reversed course on the best practices that it has had in place for decades and no longer recommends that

#### Four Plead Guilty to Aiding Cyber Criminals with Bulletproof Hosting

###### 09 May 2021

Four Eastern European nationals face 20 years in prison for Racketeer Influenced Corrupt Organization (RICO) charges after pleading guilty to providing bulletproof hosting services between 2008 and 2015, which were used by cybercriminals to distribute malware to financial entities across the U.S. The individuals, Aleksandr Grichishkin, 34, and Andrei Skvortsov, 34, of Russia; Aleksandr

#### Ransomware Cyber Attack Forced the Largest U.S. Fuel Pipeline to Shut Down

###### 10 May 2021

Colonial Pipeline, which carries 45% of the fuel consumed on the U.S. East Coast, on Saturday said it halted operations due to a ransomware attack, once again demonstrating how infrastructure is vulnerable to cyber attacks. "On May 7, the Colonial Pipeline Company learned it was the victim of a cybersecurity attack," the company said in a statement posted on its website. "We have since

###### 14 May 2021

WhatsApp on Friday disclosed that it won't deactivate accounts of users who don't accept its new privacy policy rolling out on May 15, adding it will continue to keep reminding them to accept the new terms. "No one will have their accounts deleted or lose functionality of WhatsApp on May 15 because of this update," the Facebook-owned messaging service said in a statement. The move marks a

#### Top 12 Security Flaws Russian Spy Hackers Are Exploiting in the Wild

###### 10 May 2021

Cyber operatives affiliated with the Russian Foreign Intelligence Service (SVR) have switched up their tactics in response to previous public disclosures of their attack methods, according to a new advisory jointly published by intelligence agencies from the U.K. and U.S. Friday. "SVR cyber operators appear to have reacted [...] by changing their TTPs in an attempt to avoid further detection and

###### 08 May 2021

Google has announced a number of user-facing and under-the-hood changes in an attempt to boost privacy and security, including rolling out two-factor authentication automatically to all eligible users and bringing iOS-styled privacy labels to Android app listings. "Today we ask people who have enrolled in two-step verification (2SV) to confirm it's really them with a simple tap via a Google

#### 6 Unpatched Flaws Disclosed in Remote Mouse App for Android and iOS

###### 07 May 2021

As many as six zero-days have been uncovered in an application called Remote Mouse, allowing a remote attacker to achieve full code execution without any user interaction. The unpatched flaws, collectively named 'Mouse Trap,' were disclosed on Wednesday by security researcher Axel Persinger, who said, "It's clear that this application is very vulnerable and puts users at risk with bad

## PacketStorm

#### Shining a Light on SolarCity: Practical Exploitation of the X2e IoT Device (Part One)

###### 17 Feb 2021

In 2019, Mandiant’s Red Team discovered a series of vulnerabilities present within Digi International’s ConnectPort X2e device, which allows for remote code execution as a privileged user. Specifically, Mandiant’s research focused on SolarCity’s (now owned by Tesla) rebranded ConnectPort X2e device, which is used in residential solar installations. Mandiant performs this type of work both for research purposes and in a professional capacity for their global clients.

Mandiant collaborated with Digi International and SolarCity/Tesla to responsibly disclose the results of the research, resulting in two CVEs: CVE-2020-9306 and CVE-2020-12878.

Technical details can be found in Digi International’s 3.2.30.6 software release, and on FireEye’s Vulnerability Disclosures GitHub project (FEYE-2020-0019 and FEYE-2020-0020).

This two-part blog series will discuss our analysis at a high level, explore the novel techniques used to gain initial access to the ConnectPort X2e device, and share the technical details of the vulnerabilities discovered. Topics to be covered will include physical device inspection, debugging interface probing, chip-off techniques, firmware analysis, glitch attacks, and software exploitation.

If you’re interested in continuing the story in Part Two, you can read it now.

#### FAQ

What devices are affected, and (potentially) how many devices are affected?

The vulnerabilities described in this post affect ConnectPort X2e devices as well as the SolarCity rebranded variant. Other vendor devices may also be vulnerable. It is unclear how many ConnectPort X2e devices are deployed in the wild.

How is the issue being addressed?

Mandiant worked independently with Digi International and Tesla to remediate the vulnerabilities. Mandiant would like to thank Digi International and Tesla for their cooperation and dedication to improving the security of their products.

How would an attacker exploit these vulnerabilities?

An attacker with local network access (such as being connected to an individual’s home network via Ethernet) to a vulnerable X2e device can exploit CVE-2020-9306 and CVE-2020-12878 to gain privileged access to the device.

Who discovered these vulnerabilities?

Jake Valletta (@jake_valletta), Sam Sabetan (@samsabetan)

More information such as videos and datasheets on Mandiant’s Embedded Device Assessments can be found here.

#### Technical Analysis

##### Device Overview

Before diving into the details, we’ll discuss the ConnectPort X2e device (referred to as X2e device throughout the post) at a high level. The X2e device is a programmable gateway that connects to and collects data from ZigBee devices. It is commonly used as a Smart Energy gateway to interpret and send energy readings from a residential Solar Inverter. Vendors will often purchase an X2e device and configure it to read power consumption generated by a customer’s Solar Inverter. Figure 1 outlines a typical residential solar installation and highlights the X2e’s role.

Figure 1: Typical X2e residential deployment

For our research, we focused on the X2e device used by SolarCity, now Tesla, to retrieve data from residential solar installations. A typical setup would involve SolarCity providing a customer with a gateway that would be connected to the Internet via an Ethernet cable on the customer’s home network. Figure 2 shows one of the SolarCity branded X2e devices that we tested.

Figure 2: X2e device

Without even plugging in the X2e device, we know of at least two separate interfaces to explore: the Ethernet interface and the ZigBee radio. Note that we did not review the ZigBee interface between the X2e and a solar inverter, and that interface will not be covered in either Part One or Part Two of this series.

#### Initial Analysis and Physical Inspection

##### Network Reconnaissance

We started our research by assessing the X2e device from a network perspective. By using nmap, we discovered that the device exposed both SSH and HTTP/HTTPS, shown in Figure 3.

Figure 3: Port scan results from the X2e
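For readers without nmap at hand, the same first question (which TCP ports accept connections?) can be approximated with a plain connect scan. The following is a minimal sketch using only the Python standard library; it is not the tooling used in the research, and the function name `scan_tcp_ports` is our own.

```python
import socket

def scan_tcp_ports(host, ports, timeout=1.0):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports
```

Calling `scan_tcp_ports(device_ip, [22, 80, 443])` against a device that exposes SSH and HTTP/HTTPS would be expected to return those ports. Unlike nmap, this sketch does no service or version detection.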

Upon accessing these services remotely, we noted that both services required authentication. We also performed limited brute force attempts, which were unsuccessful. Additionally, the underlying services were not vulnerable to any public exploits. With few network-based leads to follow, we shifted our analysis to a hardware perspective to determine whether any local attacks could provide initial access to the device.

##### Physical Board Inspection

To begin our hardware analysis, we removed the plastic casing from the device, mapped out the various integrated circuit (IC) components, and searched for potential debugging interfaces. Inventorying the components present on the printed circuit board (PCB) is a crucial step in understanding how the device was designed and what can be expected down the road. Figure 4 shows the mapped-out components as well as a cluster of pins that resembled a typical 3-pin universal asynchronous receiver-transmitter (UART) connection, a common debugging interface on embedded devices.

Figure 4: X2e components and suspicious cluster of pins

Without a remote connection to the X2e device, UART is an attractive target. UART typically provides the equivalent functionality of a service like SSH or Telnet, with the added benefit of showing verbose output during system boot. To determine if the cluster of pins was a UART interface, we first soldered a 3-pin through-hole header to the PCB. Using a combination of continuity tests with a multimeter and a Saleae digital logic analyzer, it became apparent that we were in fact dealing with a UART interface. Figure 5 shows the three pins (Ground, TX, RX) connected to the header. Attached to the other end of the three wires was an FTDI serial TTL-232 to USB adapter, which was connected to a Linux virtual machine.

Figure 5: Connecting to potential UART interface

In addition to correctly identifying the UART pins and a UART to USB adapter, we also needed software to read/write from the interface as well as knowledge of the baud rate. Baud rates vary but typically follow standard values, including 9600, 14400, 19200, 38400, 57600, and 115200. Using the python module pySerial, we connected to the USB adapter and tried standard baud rates until one of the rates produced readable ASCII output (an incorrect baud rate will typically produce non-readable output), and determined the X2e used a baud rate of 115200.
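The baud rate search described above can be sketched roughly as follows. This is an illustration rather than the script used in the research: the helper names, the 512-byte sample size, and the 0.9 readability threshold are all assumptions, and the serial port path is supplied by the caller. pySerial is imported lazily so that the scoring helper can be used on its own.

```python
import string

STANDARD_BAUD_RATES = [9600, 14400, 19200, 38400, 57600, 115200]
PRINTABLE = set(string.printable.encode("ascii"))

def printable_ratio(data):
    """Fraction of bytes in `data` that are printable ASCII (0.0 for empty input)."""
    if not data:
        return 0.0
    return sum(b in PRINTABLE for b in data) / len(data)

def guess_baud_rate(port, rates=STANDARD_BAUD_RATES, sample_size=512, threshold=0.9):
    """Try each rate on `port`; return the first one that yields mostly readable output."""
    import serial  # pySerial, imported here so printable_ratio works without it
    for rate in rates:
        with serial.Serial(port, baudrate=rate, timeout=2) as conn:
            sample = conn.read(sample_size)
        if printable_ratio(sample) >= threshold:
            return rate
    return None
```

The device needs to be emitting data (for example, during boot) while each rate is sampled; otherwise every read times out with an empty buffer and the function returns `None`.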

Upon booting the X2e, we noted output from the BootROM, bootloader (which was Das U-Boot 2009.8, a common embedded bootloader), as well as output from the Linux kernel transmitted over the UART connection, shown in Figure 6.

Figure 6: UART boot messages

Many configurations of U-Boot allow a physically connected user (using an interface such as UART) the ability to interrupt the boot process; however, this configuration explicitly disabled that feature, shown in Figure 7.

Figure 7: Uninterruptable U-Boot bootloader on the X2e

Interrupting a bootloader is attractive to an attacker, as often the boot parameters passed to the Linux operating system can be manipulated to control how it will load, such as booting into single user mode (typically a recovery shell) or mounting filesystems as read-write. In the case of the X2e, the UART connection was mapped to a Linux TTY which required username and password authentication, shown in Figure 8.

Figure 8: User authentication to Linux over UART

Without any ability to interrupt the boot process or credentials to authenticate to the X2e, we were faced with another dead end. We then shifted our analysis to obtaining the firmware stored on the X2e’s non-volatile storage.

##### Chip Removal and Data Extraction

In this section, we’ll cover the basics of non-volatile memory, often referred to as “flash memory”, present on embedded devices as well as the process used to extract content from the chip. As mentioned, taking inventory of the components on the PCB is an important first step. Figure 9 shows the suspected flash chip present on the PCB magnified under a digital microscope.

Figure 9: Closeup of Spansion flash

The visible markings seen in Figure 9 are important as they allow us to determine the manufacturer and model of the flash, which will assist us with obtaining the datasheet for the chip. In our case, the NAND we were dealing with was a Spansion S34ML01G1, and its datasheet could be found here.

##### NAND Overview

Before we talk about acquiring the firmware from the NAND chip, it’s important to first understand the various scenarios that embedded devices typically follow.

NAND versus NOR: These fundamentally different technologies each have their own benefits and drawbacks. NAND is cheap but suffers from a high probability of “bad blocks,” areas that are corrupt, sometimes straight from the factory. As such, safeguards need to be in place to account for them. NAND is also much faster to erase and write, making it ideal for storing file systems, kernels, and other pieces of code that may need to be reset or changed. NOR has significantly faster read times but is not as flexible with accessing data and has slow erase and write speeds. NOR is usually used for low-level bootloaders, hardcoded firmware blobs, and other areas that are not expected to change frequently. The X2e uses NAND flash.

Serial versus Parallel: This refers to how the data is accessed and is typically visually identifiable. If there are a large number of pins, the flash is likely parallel. Serial NOR chips can be small in size and typically need eight or fewer pins to function. Common serial interfaces are Serial Peripheral Interface (SPI) or Inter-Integrated Circuit (I2C), while a common parallel interface for NAND is Open NAND Flash Interface (ONFI2.0, ONFI3.0). The X2e uses parallel flash.

IC Form Factor: Another visually identifiable trait, the form factor (or “package”), refers to how the chip is attached to the PCB. There is a long list of options here, but common surface-mount flash packages include small outline package (SOP), thin small outline package (TSOP), or a variant of ball grid array (*BGA). The key distinction here is that SOP and TSOP expose the pins, while BGA conceals them under the package. The X2e’s flash is BGA63, also referred to as a 63-pin BGA package.

Managed versus Unmanaged Flash: This one is more applicable to NAND, for reasons alluded to in the NAND versus NOR section. As stated, NAND needs help to manage the integrity of the data. With unmanaged NAND, the IC reserves sections of the flash (often called “spare” area) for someone else to manage the data. This is typically implemented as either a kernel driver or an external NAND controller. Managed NAND means that the IC package includes the controller and transparently manages the data. This is extremely common in embedded devices, as either embedded MMC (eMMC) or universal flash storage (UFS). The X2e uses unmanaged flash, which is controlled by the main microcontroller present on the PCB.

With the basics out of the way, we proceeded with physically removing the chip from the PCB.

##### Chip Removal

Physical chip removal is considered a destructive approach but can certainly be performed without damaging the PCB or the flash chip itself. When presented with removal of BGA packages, the two most common removal techniques are either hot air or infrared light (IR). Commercial solutions exist for both hot air and IR, but cheaper options exist with hot air removal. We opted to use hot air on the X2e.

To minimize damage to the PCB and flash, a PCB heater or oven can be used to slowly bring the entire PCB to a temperature right below the solder melting point. This will reduce the amount of time we need to focus our hot air directly onto the flash IC and help with reducing the heat dissipation into the PCB throughout the process.

One final trick that can be used to minimize nearby chips from being damaged or lost (due to the air pressure) is the use of high-heat resistant tape, commonly referred to as Kapton tape. Figure 10 shows the PCB wrapped in Kapton tape to protect nearby components.

Figure 10: High-heat resistant tape on PCB

Figure 11 shows an example setup with the X2e PCB inserted into a PCB heater, with a hot air gun suspended over the IC.

Figure 11: Hot air rework/reflow station

While using the hot air to warm the IC and surrounding areas, we gently nudged the flash to see if the solder had become molten. Once the chip appeared to be floating, we quickly removed the chip and let it cool for about 30 seconds. Figure 12 shows the IC flash removed from the PCB, with the solder still present on the BGA pads.

Figure 12: NAND removed from X2e

Before inserting the NAND into a clam-shell chip reader, the leftover solder must be removed from the flash. This can be accomplished using a soldering iron, high-quality flux, and de-soldering wick. Once removed, isopropyl alcohol and a toothbrush are highly effective at removing the leftover flux residue and cleaning the chip.

In the next section, we’ll attempt to extract the data from the NAND chip using a multi-purpose chip programmer.

##### Data Extraction

With the cleaned flash chip in hand, we can now explore options for reading the raw contents. Commercial forensic acquisition devices exist, but a quick eBay or AliExpress search will produce a multitude of generic chip readers. One such device, the XGecu Pro, supports a variety of adapters and chipsets and connects to a Windows machine using USB. It also comes with software to interface with the XGecu Pro and can even auto-detect flash. To connect the Spansion NAND to the XGecu Pro, we also purchased a clamshell BGA63 adapter. Figure 13 shows the NAND inserted into the clamshell reader, and Figure 14 shows the clamshell adapter connected to the XGecu Pro device.

Figure 13: Spansion NAND in BGA clamshell adapter

Figure 14: NAND adapter connected to XGecu

Using the XGecu Pro software, we can read the entire contents of the flash to a binary file for further analysis. Since these are not commercial solutions, it is a good idea to perform two or three reads and then diff the extraction to confirm the content was read without errors.
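One way to compare repeated reads is to hash each dump and, if the hashes disagree, locate the first differing byte. The sketch below uses only the Python standard library; the function names are our own, and the dump file names passed in are whatever the reader chose when saving each read.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large dumps do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def dumps_match(paths):
    """True when every file in `paths` has the same SHA-256 digest."""
    return len({sha256_of(p) for p in paths}) == 1

def first_mismatch(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a, b = fa.read(chunk_size), fb.read(chunk_size)
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                return offset + min(len(a), len(b))  # one file is shorter
            if not a:
                return None
            offset += len(a)
```

If `dumps_match` returns `False`, `first_mismatch` gives a starting point for deciding whether the difference is a single flaky bit or a sign that the chip was not seated properly in the adapter.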

#### Firmware Analysis

##### Cleaning and Mounting

With our fresh NAND dump in hand, the next step was to parse out any relevant firmware blobs, configurations, or filesystems. The go-to tool for starting this process is binwalk. binwalk does a fantastic job of detecting filesystems, bootloaders, and kernels. In addition, binwalk can calculate entropy (detecting packed or encrypted data) and identify assembly opcodes. Figure 15 shows partial output of running binwalk against the NAND dump.

Figure 15: Initial binwalk scan against NAND dump
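To make the entropy idea concrete, here is a small stand-alone sketch that computes Shannon entropy per block of a dump. It is not binwalk's implementation; the 1024-byte block size and function names are assumptions chosen for illustration. Values near 8 bits per byte suggest compressed or encrypted data, while values near 0 suggest padding or erased flash.

```python
import math
from collections import Counter

def shannon_entropy(block):
    """Shannon entropy of `block` in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    return -sum((n / total) * math.log2(n / total) for n in Counter(block).values())

def entropy_profile(data, block_size=1024):
    """Yield (offset, entropy) for each consecutive block of `data`."""
    for offset in range(0, len(data), block_size):
        yield offset, shannon_entropy(data[offset:offset + block_size])
```

Plotting or simply printing the output of `entropy_profile` over a NAND dump gives a rough map of where compressed filesystems end and empty regions begin.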

We can see from the output that binwalk successfully identified what it believes are U-Boot uImage headers, Linux kernel images, and more than a dozen Journaling Flash File System version 2 (JFFS2) filesystems. JFFS2 is a common filesystem used in embedded devices; Unsorted Block Image File System (UBIFS) and SquashFS would also be common.

At first glance, the output appears to be promising; however, it is highly unlikely that there are actually that many JFFS2 filesystems present on our NAND. Another indication that something isn’t quite right is the hexadecimal offsets: they don’t appear to be clean, uniform offsets. It is far more common that the offsets of the items identified by binwalk would align with NAND page offsets, which are multiples of 2048.

In order to understand what is occurring here, we need to revisit a characteristic of unmanaged (or “raw”) NAND ICs described in the NAND Overview section. To recap, raw NAND requires additional bytes per page for use by higher-level components to attest to the validity of the page, typically implemented as a defined “bad block” marker and a per-page (or subpage) Error-Correcting Code (ECC). Without going too deep into ECC fundamentals, ECC provides the ability for higher-level processes to detect n number of bad bits on a page and to correct m number of bits.
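To illustrate why the extra per-page bytes shift everything in a raw dump, the sketch below strips the spare area from a dump under the simplest possible layout: each page stored as 2048 data bytes immediately followed by 64 spare bytes. Both sizes and the layout itself are assumptions for illustration. Many controllers, reportedly including the one in this device's MCU family, interleave ECC data within the page differently, so this is not a substitute for a controller-specific conversion tool.

```python
PAGE_SIZE = 2048   # data bytes per page (assumed)
SPARE_SIZE = 64    # spare/out-of-band bytes per page (assumed)

def strip_spare(raw, page_size=PAGE_SIZE, spare_size=SPARE_SIZE):
    """Return only the data bytes from a dump laid out as [data | spare] per page."""
    stride = page_size + spare_size
    if len(raw) % stride != 0:
        raise ValueError("dump length is not a whole number of raw pages")
    # Keep the first `page_size` bytes of every raw page and drop the rest
    return b"".join(raw[i:i + page_size] for i in range(0, len(raw), stride))
```

After stripping, a structure that lived at the start of page N sits at offset N * 2048 again, which is why the cleaned dump is expected to show page-aligned offsets.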

Since our goal here is not to perform forensics on the raw NAND, our immediate objective is to remove any ECC bytes or other non-data related bytes from the NAND dump. The MCU is ultimately the system manipulating the raw NAND, so understanding how our MCU, which was an NXP iMX28 series MCU, manages NAND is critical to being able to perform this.

Fortunately for us, this process has already been explored by the security community, and iMX parsing libraries exist to manipulate the raw NAND dump and remove existing extraneous data. Figure 16 shows the results of re-running binwalk on the output of the imx-nand-convert script.

Figure 16: binwalk scan of fixed NAND dump

This time, we see only one JFFS2 filesystem, at the very round offset of 0x880000. Using the extraction (-e) feature of binwalk, we can now obtain parsed versions of the U-Boot bootloader, Linux kernel, and JFFS2 system.

The final hurdle we need to overcome is mounting the extracted JFFS2 filesystem in a way that allows us to explore the contents. On Linux, the easiest way to perform this is to use the mtd, mtdblock, and nandsim kernel modules. The nandsim module simulates a given NAND device and uses the mtd and JFFS2 subsystems to parse and manage appropriately. The key piece of information that needs to be passed to the nandsim module is the ONFI chip identifier, which can be obtained from the NAND datasheet or by requesting the ID from the IC using a generic reader (like the XGecu Pro used in the Data Extraction section). A list of supported IDs is also provided by the mtd maintainers. Getting the parameters correct is a bit of luck and magic and may require you to compile your own version of the nandsim module; that process will not be covered in this post.

Figure 17 shows the steps required to simulate the correct Spansion NAND and mount the JFFS2 filesystem in the form of a Makefile target.

Figure 17: Makefile target to mount JFFS2 filesystem

By running make mount-jffs2, we can quickly prep and mount the JFFS2 filesystem and explore the contents as we would any filesystem.
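Since the Makefile target itself is only shown as a figure, here is a hedged sketch of the kind of command sequence such a target typically wraps, expressed as a Python function that builds the command lines without running them. The `nandsim` ID parameters, `flash_erase`, and `nandwrite` are standard Linux MTD tooling, but the exact steps the authors used are not known to us. The ID bytes must come from the datasheet or chip reader, and the assumption that the simulated device appears as `/dev/mtd0` may not hold on a machine with other MTD devices.

```python
def build_mount_commands(image_path, mount_point, id_bytes, mtd_index=0):
    """Return command lines (as argument lists) to simulate a NAND and mount a JFFS2 image.

    `id_bytes` is the four-byte chip identifier; no default is provided
    because it is specific to the flash part being simulated.
    """
    if len(id_bytes) != 4:
        raise ValueError("expected exactly four ID bytes")
    names = ("first_id_byte", "second_id_byte", "third_id_byte", "fourth_id_byte")
    nandsim_args = [f"{name}=0x{value:02x}" for name, value in zip(names, id_bytes)]
    mtd_char = f"/dev/mtd{mtd_index}"        # assumed device node
    mtd_block = f"/dev/mtdblock{mtd_index}"  # assumed block device node
    return [
        ["modprobe", "mtdblock"],
        ["modprobe", "jffs2"],
        ["modprobe", "nandsim"] + nandsim_args,
        ["flash_erase", mtd_char, "0", "0"],       # erase the whole simulated device
        ["nandwrite", "-p", mtd_char, image_path],  # write the image, padding the last page
        ["mount", "-t", "jffs2", mtd_block, mount_point],
    ]
```

Each entry can be passed to `subprocess.run(cmd, check=True)` as root. Keeping the commands as data makes it easy to print and review them before anything touches the kernel.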

##### Accessing the Filesystem

In the last section of this post, we’ll walk through our analysis of the JFFS2 filesystem. Remember that our end goal is to obtain a remotely exploitable bug that will permit privileged code execution. With that in mind, some areas of interest are running daemons/processes, system startup logic, and credentials for services listening on the network. The first stop was reviewing the /etc/shadow file to see if there were password hashes for the root user as well as other system users. A quick check of this file determined there was no password hash for the root user, which indicated we would not be able to authenticate using password authentication. We noticed that two other password hashes were present, for the addpd and python users, shown in Figure 18.

The addpd user had a weak default password but was unable to authenticate using remote methods, and we were ultimately unable to crack the python user’s hash using internal GPU-based servers.
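A quick triage of an extracted shadow file can be scripted. The sketch below classifies each account by its password field, treating a leading `!` or `*` as locked, which is the usual Linux convention. The function name and the three labels are our own.

```python
def parse_shadow(text):
    """Map each username to 'hash', 'locked', or 'empty' based on the password field."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 2:
            continue  # not a valid shadow entry
        user, password = fields[0], fields[1]
        if password == "":
            status[user] = "empty"    # no password required
        elif password[0] in "!*":
            status[user] = "locked"   # password login disabled
        else:
            status[user] = "hash"     # candidate for offline cracking
    return status
```

Accounts reported as `hash` are the ones worth feeding to a password cracker; `locked` accounts cannot be logged into with a password at all.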

Additionally, we were interested in processes that are launched during system boot or post-boot. The directory /WEB/python/ contained a ZIP archive called _x2e.zip, which contained over 200 compiled Python scripts (PYC files), which were loaded on system boot. Using the decompiler uncompyle2, we unpacked these files for review. One file that stood out by name was password_manager.pyc, a file used to reset the login password upon successful boot-up. The file contained five hardcoded and plaintext credentials that mapped to the python system user. These credentials could be used to access the web interface and SSH, shown in Figure 19. Mandiant confirmed different passwords were used for different versions and connectivity states. Mandiant reported this to SolarCity and was assigned the CVE number CVE-2020-9306.

Figure 19: Hardcoded credentials in password_manager.pyc
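Before decompiling hundreds of PYC files, it can help to shortlist archive members by name and pull out printable strings from the most promising ones. The following sketch uses only the Python standard library and is not the workflow used in the research; the keyword list, function names, and minimum string length are assumptions.

```python
import re
import zipfile

KEYWORDS = ("password", "passwd", "secret", "credential", "key")

def interesting_members(zip_source, keywords=KEYWORDS):
    """Return names of .pyc members whose filename contains any keyword."""
    with zipfile.ZipFile(zip_source) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".pyc")
            and any(k in name.lower() for k in keywords)
        )

def printable_strings(data, min_len=6):
    """Return runs of at least `min_len` printable ASCII bytes, like the `strings` tool."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode("ascii") + rb",}")
    return pattern.findall(data)
```

String literals in compiled Python files are generally stored in readable form, so `printable_strings(archive.read(name))` on a shortlisted member will often surface hardcoded values even without a working decompiler.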

With the correct password, we were finally able to connect to the web and SSH ports on a running X2e, but unfortunately only as the less-privileged python system user. While this was a great start, it didn’t satisfy our final objective, which was to remotely compromise the X2e as a privileged user. In Part Two of this blog series, we will explore additional avenues to further compromise the X2e.

#### Conclusion

In Part One of this two-part blog series, we covered an overview of the X2e, our initial network-based reconnaissance, PCB inspection techniques, physical debugging interface probing, chip-off techniques, and firmware analysis. Using these methodologies, we were able to remotely compromise the X2e device as a non-administrative user due to hardcoded credentials (CVE-2020-9306). In Part Two, we will re-investigate physical attacks against the X2e in the form of glitch attacks, re-explore the U-Boot bootloader, and finally demonstrate an attack to remotely compromise the X2e device as a privileged user.

To continue reading, check out Part Two now

#### Mandiant Exposes APT1 – One of China's Cyber Espionage Units & Releases 3,000 Indicators

###### 19 Feb 2013

Today, The Mandiant® Intelligence Center™ released an unprecedented report exposing APT1's multi-year, enterprise-scale computer espionage campaign. APT1 is one of dozens of threat groups Mandiant tracks around the world and we consider it to be one of the most prolific in terms of the sheer quantity of information it has stolen.

Highlights of the report include:

• Evidence linking APT1 to China's 2nd Bureau of the People's Liberation Army (PLA) General Staff Department's (GSD) 3rd Department (Military Cover Designator 61398).
• A timeline of APT1 economic espionage conducted since 2006 against 141 victims across multiple industries.
• APT1's modus operandi (tools, tactics, procedures) including a compilation of videos showing actual APT1 activity.
• The timeline and details of over 40 APT1 malware families.
• The timeline and details of APT1's extensive attack infrastructure.

Mandiant is also releasing a digital appendix with more than 3,000 indicators to bolster defenses against APT1 operations. This appendix includes:

• Digital delivery of over 3,000 APT1 indicators, such as domain names, and MD5 hashes of malware.
• Thirteen (13) X.509 encryption certificates used by APT1.
• A set of APT1 Indicators of Compromise (IOCs) and detailed descriptions of over 40 malware families in APT1's arsenal of digital weapons.
• IOCs that can be used in conjunction with Redline™, Mandiant's free host-based investigative tool, or with Mandiant Intelligent Response® (MIR), Mandiant's commercial enterprise investigative tool.

The scale and impact of APT1's operations compelled us to write this report. The decision to publish a significant part of our intelligence about Unit 61398 was a painstaking one. What started as a "what if" discussion about our traditional non-disclosure policy quickly turned into the realization that the positive impact resulting from our decision to expose APT1 outweighed the risk of losing much of our ability to collect intelligence on this particular APT group. It is time to acknowledge the threat is originating from China, and we wanted to do our part to arm and prepare security professionals to combat the threat effectively. The issue of attribution has always been a missing link in the public's understanding of the landscape of APT cyber espionage. Without establishing a solid connection to China, there will always be room for observers to dismiss APT actions as uncoordinated, solely criminal in nature, or peripheral to larger national security and global economic concerns. We hope that this report will lead to increased understanding and coordinated action in countering APT network breaches.

We recognize that no one entity can understand the entire complex picture that many years of intense cyber espionage by a single group creates. We look forward to seeing the surge of data and conversations a report like this will likely generate.

Dan McWhorter

Managing Director, Threat Intelligence

#### Shining a Light on SolarCity: Practical Exploitation of the X2e IoT Device (Part Two)

###### 17 Feb 2021

In this post, we continue our analysis of the SolarCity ConnectPort X2e Zigbee device (referred to throughout as X2e device). In Part One, we discussed the X2e at a high level, performed initial network-based attacks, then discussed the hardware techniques used to gain a remote shell on the X2e device as a non-privileged system user. In this segment, we’ll cover how we obtained a privileged shell on the device locally using power glitching attacks, and explore CVE-2020-12878, a vulnerability we discovered that permitted remote privilege escalation to the root user. Combined with CVE-2020-9306 (discussed in Part One), this would result in a complete remote compromise of the X2e device.

#### Technical Analysis

##### Recap

Before we dive into next steps, let’s recap where we left off:

• The X2e has an exposed universal asynchronous receiver-transmitter (UART) interface, which allows a physically connected user to view (but not interrupt) the Das U-Boot (U-Boot) boot process, and given proper credentials, authenticate to the Linux operating system. Since we do not have root credentials, we put this thread on the backburner.
• We have a full NAND dump of the Spansion raw flash, which includes boot configuration, bootloader firmware, filesystems, and the Linux kernel image. This was used previously in Part One to obtain the hardcoded credential for the python user.

#### Gaining Privileged Access Locally

Figure 1 shows the U-Boot boot process displayed while connected via UART connection. In some cases, it is possible to send keyboard input to the device during a set period (usually one to four seconds) when the bootloader presents the message, “Hit any key to stop autoboot,” which interrupts the boot process and drops the user into a U-Boot shell. On the X2e, this feature has been disabled by setting the U-Boot configuration parameter CONFIG_BOOTDELAY to 0.

Figure 1: Uninterruptable U-Boot bootloader output

One attack that has been documented to be successful to disrupt autoboot is to manipulate the bootloader’s ability to access the flash storage during the boot process. In certain circumstances where the U-Boot bootloader is unable to access its own configuration, it fails into a default environment, which may be less restricted. We decided to see if this would be possible on the X2e.

These attacks, known as glitch attacks (or, more formally, fault injection attacks), are a type of side channel attack that attempts to cause a microcontroller unit (MCU) to skip instructions, perform wrong instructions, or fail to access flash memory. Various types of glitching attacks exist, including electrical, thermal, and radiation. Based on our objective, we opted to try glitching the power between the MCU and the Spansion NAND flash. Note that glitch attacks can often damage the components on a board or put the device in an unusable state. These types of attacks should be tested either as a last resort or against a secondary device you are comfortable with damaging.

Based on previous research in this domain, we opted to target the data lines (I/O) between the MCU and NAND flash. Recall from Part One that the NAND flash on the X2e was the Spansion S34ML01G1, which was a 63-pin ball grid array (BGA) package. This chip is capable of supporting both 8-bit and 16-bit bus width, which corresponds to the number of I/O lines utilized. By using the datasheet for the flash and then querying the ONFI Device ID of our chip, we determined our chip was utilizing the 8-bit configuration, meaning eight I/O lines were present between the NAND flash and the MCU. For this attack, we focused on manipulating the power on the first (I/O0) data line. Figure 2 shows the configuration of the BGA-63 pins, with I/O0 highlighted.

Figure 2: Identifying I/O0 for NAND chip in the Spansion datasheet

Because the pins are actually underneath the flash package, we needed to find an exposed lead that corresponded to I/O0 elsewhere on the PCB. One such method for tracing connections across a PCB is a continuity test. A continuity test (using a multimeter) sends a low current electrical signal across two points and produces an audible beep if the points are connected. Using this technique, we located an exposed test point (known as a via) on the bottom of the PCB. Figure 3 shows the I/O0 pin on the top of the PCB (under the NAND chip), and Figure 4 shows the I/O0 pin exposed on the bottom of the PCB.

Figure 3: I/O0 on top of PCB (under NAND chip)

Figure 4: I/O0 on bottom of PCB

With exposed access to I/O0 located, we experimented with connecting this pin directly to a known ground (GND) pin at various points during the boot process. Figure 5 shows the device powering on with the metal tweezers connecting I/O0 to GND.

Figure 5: Shorting I/O0 to GND

While connected to the UART interface, we noted several different outcomes. When shorting the pin immediately after powering on, the device failed to produce any output or boot. When shorting after the bootloader finished loading (and handing off to the Linux kernel), the device would also force reboot. However, when timed perfectly between the bootloader loading and attempting to read its configuration, we noted that the bootloader would present different output, and the option to interrupt the boot process was possible with a four-second delay. By pressing keyboard input, we were successfully able to drop into a U-Boot shell, which is shown in Figure 6.

While this was great progress, we noted that the current fallback bootloader configuration was completely inoperable and certain NAND blocks had been marked as bad (as expected). To get our device back to a working state, we needed to revisit the NAND dump we generated in Part One.

While the current configuration provided us a working shell, we needed to fix the damage we had done. This was performed in two steps: fixing the mistakenly marked bad blocks and then rebuilding the configuration. In our case, the nand utility and its sub-commands read, write, and scrub allowed us to inspect and manipulate pages and blocks of the NAND. The nand scrub command with a valid offset and size could be used to completely reset a segment of the NAND, which removed any bad block markers. The next challenge was determining what needed to be replaced in the damaged blocks and rebuilding the configuration.
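Because NAND erase operations work on whole blocks, the offset and size passed to nand scrub generally need to line up with erase-block boundaries. The sketch below shows one way to compute such a range. The 128 KiB block size is an assumption on our part and should be confirmed against the flash datasheet, and the helper names are hypothetical.

```python
ERASE_BLOCK = 0x20000  # assumed 128 KiB erase block; confirm against the flash datasheet


def block_aligned_range(offset: int, length: int, block: int = ERASE_BLOCK):
    """Expand [offset, offset + length) outward to whole erase blocks.

    Returns (aligned_start, aligned_size).
    """
    if length <= 0:
        raise ValueError("length must be positive")
    start = (offset // block) * block
    end = -(-(offset + length) // block) * block  # ceiling division
    return start, end - start


def scrub_command(offset: int, length: int) -> str:
    """Format a U-Boot 'nand scrub <offset> <size>' line for the aligned range."""
    start, size = block_aligned_range(offset, length)
    return f"nand scrub {start:#x} {size:#x}"
```

For example, a damaged region of 0x100 bytes starting at 0x21000 expands to the single block at 0x20000. Scrubbing also discards factory bad block markers, so the range is best kept as small as possible.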

Since we had a valid NAND image, we revisited the sections read by the bootloader to determine what changes were needed. The format did not match a known format, so we wrote a simple parser in Python to read the binary structure, shown in Figure 7.

Figure 7: Parsing bootloader nvram configuration from flash

With details of how the configuration should look, we used the nand write command to rebuild this section, byte by byte, with the correct details. We also set the boot delay to four seconds, so that we could always interrupt the bootloader once the new configuration was committed. Once we confirmed our changes were stable, we saved the configuration to flash and could access the bootloader without performing the aforementioned glitch attack.

##### Accessing Linux as root User

Now that we have unrestricted access to the bootloader, we can finally influence the rest of the boot process and achieve a privileged shell. We alluded to this in Part One, but the easiest way to turn an unlocked U-Boot shell into a root Linux shell is to adjust the boot arguments that U-Boot passes to the Linux kernel. In our case, this was accomplished by using the setenv utility to change the std_bootarg environment variable to be init=/bin/sh and instructing U-Boot to resume the standard boot process. Figure 8 shows the Linux shell presented over UART.

Figure 8: root shell after bootloader
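The boot argument change itself is a small string edit. As a minimal sketch, the helper below takes an existing kernel command line, drops any init= entry, and appends a new one. The helper name and the sample command line in the usage note are hypothetical; the actual contents of std_bootarg on the X2e are not reproduced here.

```python
def with_init(bootargs: str, init: str = "/bin/sh") -> str:
    """Return bootargs with any existing init= entry replaced by init=<init>."""
    parts = [p for p in bootargs.split() if not p.startswith("init=")]
    parts.append(f"init={init}")
    return " ".join(parts)
```

Given a made-up input such as `console=ttyS0,115200 root=/dev/mtdblock3 init=/sbin/init`, the function keeps the console and root entries and swaps only the init program, which is the value that would then be passed to setenv.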

At this point, we’ve demonstrated a repeatable method for achieving local privilege escalation. In the final segment, we’ll complete our attack by exploring an avenue to remotely escalate privileges.

##### Gaining Privileged Access Remotely

Since the X2e has only two available listening network services, it makes sense to reinvestigate these services. During Part One, we identified hardcoded credentials for the limited user python. This was useful for initial probing of the device while it was running, but where do we go from here?

Embedded devices typically only have a handful of users, with a majority of functionality being performed by the root user. This presents an interesting opportunity for us to abuse overlap between actions performed by the root user on contents owned and controlled by the python user.

By reviewing the boot process, we noted a large number of custom init scripts in the /etc/init.d/ directory. These scripts are executed at system start by the root user and are responsible for starting daemons and ensuring directories or files exist. One file in particular, /etc/init.d/S50dropbear.sh, was interesting to us, as it appeared to perform a number of actions on files within the directory specified by the $PYTHON_HOME variable, which was /WEB/python/, shown in Figure 9.

Figure 9: Unsafe operations on $PYTHON_HOME directory

At first glance this may seem benign but considering that the /WEB/python/ directory is controllable by the python user, it means that we can potentially control actions taken by root. More specifically, the chown operation is dangerous, as the previous mkdir command can fail silently and result in an unsafe chown operation. To weaponize this, we can use symbolic links to point the /WEB/python/.ssh/ to other areas of the filesystem and coerce the root process into chown’ing these files to be owned by the python user. The process we took to exploit this was as follows:

1. Authenticate over SSH using hardcoded python user credentials.
2. Create a symbolic link, /WEB/python/.ssh, that points to /etc/init.d/.
3. Reboot the X2e, forcing the system to re-execute /etc/init.d/S50dropbear.sh.
4. After boot completes, create a malicious init script in /etc/init.d/ as the python user.
5. Reboot the X2e, forcing the system to execute the new init script.
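The core of the problem can be reproduced safely in a scratch directory. The sketch below is not the contents of S50dropbear.sh; it only mimics the general pattern of "create a directory if needed, then change ownership of it" and reports which path an ownership change would actually land on when the directory has been replaced by a symbolic link. Both function names are hypothetical.

```python
import os
import tempfile


def simulate_init_script(home: str) -> str:
    """Mimic 'mkdir -p $HOME/.ssh' followed by a chown, without changing ownership.

    Returns the real path a symlink-following chown would operate on.
    """
    ssh_dir = os.path.join(home, ".ssh")
    # Like mkdir -p, this raises no error if the path already exists,
    # even when it is a symlink to some other directory.
    os.makedirs(ssh_dir, exist_ok=True)
    # chown follows symlinks by default, so the resolved path is what changes owner.
    return os.path.realpath(ssh_dir)


def is_safe_target(home: str, path: str) -> bool:
    """One possible guard: refuse symlinks and anything resolving outside home."""
    real_home = os.path.realpath(home)
    real_path = os.path.realpath(path)
    return not os.path.islink(path) and real_path.startswith(real_home + os.sep)
```

If `.ssh` inside the user-controlled directory is a symlink to another directory, `simulate_init_script` returns that other directory, which is exactly the path a privileged chown would hand over. A check along the lines of `is_safe_target` before the ownership change is one way a script could avoid this.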

While not the cleanest approach (it requires two reboots), it accomplishes the goal of achieving code execution as root. Figure 10 shows the output of our proof of concept. In this case, our malicious init script spawned a bind shell on TCP port 8080, so that we could connect in as root.

Figure 10: Exploiting chown vulnerability to gain shell as user root

And there we have it: a remote connection as root, by abusing two separate vulnerabilities. While not explored in this series, another viable avenue of attack would be to explore potential vulnerabilities in the web server listening on TCP ports 80 and 443; however, this was not an approach that we took.

#### Conclusion

We covered a wide variety of topics in this two-part series, including:

• Physical device inspection
• Identifying and exploring physical debugging interfaces (UART)
• Chip-off techniques to remove the NAND storage
• Binary analysis of the filesystems and bootloader configurations
• Power glitch attacks against the U-Boot bootloader
• Linux user space privilege escalation

We hope that readers were able to learn from our experiences with the X2e and will be inspired to use these techniques in their own analysis. Finally, Mandiant would like to thank both Tesla/SolarCity and Digi International for their efforts to remediate these vulnerabilities and for their cooperation with releasing this blog series.

#### Introduction

In December 2017, FireEye's Mandiant discussed an incident response involving the TRITON framework. The TRITON attack and many of the publicly discussed ICS intrusions involved routine techniques where the threat actors used only what is necessary to succeed in their mission. For both INDUSTROYER and TRITON, the attackers moved from the IT network to the OT (operational technology) network through systems that were accessible to both environments. Traditional malware backdoors, Mimikatz distillates, remote desktop sessions, and other well-documented, easily-detected attack methods were used throughout these intrusions.

Despite the routine techniques employed to gain access to an OT environment, the threat actors behind the TRITON malware framework invested significant time learning about the Triconex Safety Instrumented System (SIS) controllers and TriStation, a proprietary network communications protocol. The investment and purpose of the Triconex SIS controllers leads Mandiant to assess the attacker's objective was likely to build the capability to cause physical consequences.

TriStation remains closed source and there is no official public information detailing the structure of the protocol, raising several questions about how the TRITON framework was developed. Did the actor have access to a Triconex controller and TriStation 1131 software suite? When did development first start? How did the threat actor reverse engineer the protocol, and to what extent? What is the protocol structure?

FireEye’s Advanced Practices Team was born to investigate adversary methodologies and to answer these types of questions, so we started with a deeper look at TRITON’s own Python scripts.

Glossary:

• TRITON – Malware framework designed to operate Triconex SIS controllers via the TriStation protocol.
• TriStation – UDP network protocol specific to Triconex controllers.
• TRITON threat actor – The human beings who developed, deployed and/or operated TRITON.

#### Diving into TRITON's Implementation of TriStation

TriStation is a proprietary network protocol and there is no public documentation detailing its structure or how to create software applications that use TriStation. The current TriStation UDP/IP protocol is little understood, but natively implemented through the TriStation 1131 software suite. TriStation operates by UDP over port 1502 and allows for communications between designated masters (PCs with the software that are “engineering workstations”) and clients (Triconex controllers with special communications modules) over a network.

To us, the Triconex systems, software and associated terminology sound foreign and complicated, and the TriStation protocol is no different. Attempting to understand the protocol from ground zero would take a considerable amount of time and reverse engineering effort – so why not learn from TRITON itself? With the TRITON framework containing TriStation communication functionality, we pursued studying the framework to better understand this mysterious protocol. Work smarter, not harder, amirite?

The TRITON framework has a multitude of functionalities, but we started with the basic components:

• TS_cnames.pyc # Compiled at: 2017-08-03 10:52:33
• TsBase.pyc # Compiled at: 2017-08-03 10:52:33
• TsHi.pyc # Compiled at: 2017-08-04 02:04:01
• TsLow.pyc # Compiled at: 2017-08-03 10:46:51

TsLow.pyc (Figure 1) contains several pieces of code for error handling, but these also present some cues to the protocol structure.

Figure 1: TsLow.pyc function print_last_error()

In the TsLow.pyc’s function for print_last_error we see error handling for “TCM Error”. This compares the TriStation packet value at offset 0 with a value in a corresponding array from TS_cnames.pyc (Figure 2), which is largely used as a “dictionary” for the protocol.

Figure 2: TS_cnames.pyc TS_cst array

From this we can infer that offset 0 of the TriStation protocol contains message types. This is supported by an additional function, tcm_result, which declares type, size = struct.unpack('

Since there are only 11 defined message types, it really doesn't matter much if the type is one byte or two because the second byte will always be 0x00.
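To make the point about one byte versus two concrete, the sketch below reads the first four bytes of a packet as a message type and a size, then looks the type up in the message type names from TS_cnames.pyc (listed in Appendix A). The two little-endian 16-bit fields are an assumption on our part, since the exact format string is not reproduced in this post, and the function name is hypothetical.

```python
import struct

# Message type names as listed in Appendix A.
MESSAGE_TYPES = {
    1: "Connection Request",
    2: "Connection Response",
    3: "Disconnect Request",
    4: "Disconnect Response",
    5: "Execution Command",
    6: "Ping Command",
    7: "Connection Limit Reached",
    8: "Not Connected",
    9: "MPS Are Dead",
    10: "Access Denied",
    11: "Connection Failed",
}


def tcm_header(packet: bytes):
    """Return (type, size, type_name) from the first four bytes of a packet.

    Assumes two little-endian unsigned 16-bit fields: type, then size.
    """
    if len(packet) < 4:
        raise ValueError("packet shorter than the 4-byte header")
    msg_type, size = struct.unpack("<HH", packet[:4])
    return msg_type, size, MESSAGE_TYPES.get(msg_type, "Unknown")
```

With a little-endian layout and type values no larger than 11, reading only `packet[0]` gives the same type as reading the full 16-bit field, because the high byte is always 0x00.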

We also have indications that message type 5 is for all Execution Command Requests and Responses, so it is curious to observe that the TRITON developers called this “Command Reply.” (We won’t understand this naming convention until later.)

Next we examine TsLow.pyc’s print_last_error function (Figure 3) to look at “TS Error” and “TS_names.” We begin by looking at the ts_err variable and see that it references ts_result.

Figure 3: TsLow.pyc function print_last_error() with ts_err highlighted

We follow that thread to ts_result, which defines a few variables in the next 10 bytes (Figure 4): dir, cid, cmd, cnt, unk, cks, siz = struct.unpack('<, ts_packet[0:10]). Now things are heating up. What fun. There’s a lot to unpack here, but the most interesting thing is how this piece of script breaks down 10 bytes from ts_packet into different variables.

Figure 4: ts_result with ts_packet header variables highlighted

Figure 5: tcm_result

Referencing tcm_result (Figure 5) we see that it defines type and size as the first four bytes (offset 0 – 3) and tcm_result returns the packet bytes 4:-2 (offset 4 to the end minus 2, because the last two bytes are the CRC-16 checksum). Now that we know where tcm_result leaves off, we know that the ts_reply “cmd” is a single byte at offset 6, and corresponds to the values in the TS_cnames.pyc array and TS_names (Figure 6). The TRITON script also tells us that any integer value over 100 is a likely “command reply.” Sweet.

When looking back at the ts_result packet header definitions, we begin to see some gaps in the TRITON developer's knowledge: dir, cid, cmd, cnt, unk, cks, siz = struct.unpack('<, ts_packet[0:10]). We're clearly speculating based on naming conventions, but we get an impression that offsets 4, 5 and 6 could be "direction", "controller ID" and "command", respectively. Values such as "unk" show that the developer either did not know or did not care to identify this value. We suspect it is a constant, but this value is still unknown to us.

Figure 6: Excerpt TS_cnames.pyc TS_names array, which contain TRITON actor’s notes for execution command function codes
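The header breakdown can be sketched as follows. The text above only pins down that dir, cid, and cmd sit at offsets 4, 5, and 6 as single bytes; the widths chosen here for cnt, unk, cks, and siz (one byte followed by three 16-bit values, which adds up to the 10 bytes mentioned) are an assumption, as are the function and type names.

```python
import struct
from collections import namedtuple

TsHeader = namedtuple("TsHeader", "dir cid cmd cnt unk cks siz")

# Assumed layout: four single bytes, then three little-endian 16-bit fields (10 bytes).
TS_HEADER_FORMAT = "<BBBBHHH"


def parse_ts(packet: bytes):
    """Split a full packet into (TsHeader, remaining payload).

    Skips the 4-byte type/size header and drops the trailing 2-byte CRC-16.
    The checksum is not verified here because the CRC variant is not known to us.
    """
    ts_packet = packet[4:-2]
    if len(ts_packet) < struct.calcsize(TS_HEADER_FORMAT):
        raise ValueError("packet too short for the TS header")
    header = TsHeader(*struct.unpack(TS_HEADER_FORMAT, ts_packet[:10]))
    return header, ts_packet[10:]


def looks_like_reply(cmd: int) -> bool:
    """Per the TRITON script's hint, values over 100 are likely command replies."""
    return cmd > 100
```

Under this layout, `header.cmd` is the same byte as `packet[6]` in the full packet, which matches the single-byte command at offset 6 described above.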

#### TriStation Protocol Packet Structure

The TRITON threat actor’s knowledge and reverse engineering effort provides us a better understanding of the protocol. From here we can start to form a more complete picture and document the basic functionality of TriStation. We are primarily interested in message type 5, Execution Command, which best illustrates the overall structure of the protocol. Other, smaller message types will have varying structure.

Figure 7: Sample TriStation "Allocate Program" Execution Command, with color annotation and protocol legend

#### Corroborating the TriStation Analysis

Minute discrepancies aside, the TriStation structure detailed in Figure 7 is supported by other public analyses. Foremost, researchers from the Coordinated Science Laboratory (CSL) at University of Illinois at Urbana-Champaign published a 2017 paper titled "Attack Induced Common-Mode Failures on PLC-based Safety System in a Nuclear Power Plant". The CSL team mentions that they used the Triconex System Access Application (TSAA) protocol to reverse engineer elements of the TriStation protocol. TSAA is a protocol developed by the same company as TriStation. Unlike TriStation, the TSAA protocol structure is described within official documentation. CSL assessed similarities between the two protocols would exist and they leveraged TSAA to better understand TriStation. The team's overall research and analysis of the general packet structure aligns with our TRITON-sourced packet structure.

There are some awesome blog posts and whitepapers out there that support our findings in one way or another. Writeups by Midnight Blue Labs, Accenture, and US-CERT each explain how the TRITON framework relates to the TriStation protocol in superb detail.

#### TriStation's Reverse Engineering and TRITON's Development

When TRITON was discovered, we began to wonder how the TRITON actor reverse engineered TriStation and implemented it into the framework. We have a lot of theories, all of which seemed plausible: Did they build, buy, borrow, or steal? Or some combination thereof?

Our initial theory was that the threat actor purchased a Triconex controller and software for their own testing and reverse engineering from the "ground up", although if this was the case we do not believe they had a controller with the exact vulnerable firmware version, else they would have had fewer problems with TRITON in practice at the victim site. They may have bought or used a demo version of the TriStation 1131 software, allowing them to reverse engineer enough of TriStation for the framework. They may have stolen TriStation Python libraries from ICS companies, subsidiaries or system integrators and used the stolen material as a base for TriStation and TRITON development. But then again, it is possible that they borrowed TriStation software, Triconex hardware and Python connectors from a government-owned utility that was using them legitimately.

Looking at the raw TRITON code, some of the comments may appear oddly phrased, but we do get a sense that the developer is clearly using much of the right vernacular and many of the right acronyms, showing smarts on PLC programming. The TS_cnames.pyc script contains interesting typos such as 'Set lable', 'Alocate network accepted', 'Symbol table ccepted' and 'Set program information reponse'. These appear to be normal human error and reflect neither poor written English nor laziness in coding. The significant amount of annotation, cascading logic, and robust error handling throughout the code suggests thoughtful development and testing of the framework. This complicates the theory of "ground up" development, so did they base their code on something else?

While learning from the TriStation functionality within TRITON, we continued to explore legitimate TriStation software. We began our search for "TS1131.exe" and hit dead ends sorting through TriStation DLLs until we came across a variety of TriStation utilities in MSI form. We ultimately stumbled across a juicy archive containing "Trilog v4." Upon further inspection, this file installed "TriLog.exe," which the original TRITON executable mimicked, and a couple of supporting DLLs, all of which were timestamped around August 2006.

When we saw the DLL file description "Tricon Communications Interface" and original file name "TricCom.DLL", we knew we were in the right place. With a simple look at the file strings, "BAZINGA!" We struck gold.

| Field | Value |
| --- | --- |
| File Name | tr1com40.dll |
| MD5 | 069247DF527A96A0E048732CA57E7D3D |
| Size | 110592 |
| Compile Date | 2006-08-23 |
| File Description | Tricon Communications Interface |
| Product Name | TricCom Dynamic Link Library |
| File Version | 4.2.441 |
| Original File Name | TricCom.DLL |
| Copyright | Copyright © 1993-2006 Triconex Corporation |

The tr1com40.DLL is exactly what you would expect to see in a custom application package. It is a library that helps support the communications for a Triconex controller. If you've pored over TRITON as much as we have, the moment you look at strings you can see the obvious overlaps between the legitimate DLL and TRITON's own TS_cnames.pyc.

Figure 8: Strings excerpt from tr1com40.DLL

Nearly all of the execution command "error codes" from TS_cnames.pyc appear in the strings of tr1com40.DLL (Figure 8). We see "An MP has re-educated" and "Invalid Tristation I command". We even see misspelled command strings reproduced verbatim, such as "Non-existant data item" and "Alocate network accepted". We also see many of the same unknown values. What is obvious from this discovery is that some of the strings in TRITON are likely based on code used in communications libraries for Trident and Tricon controllers.

In our brief survey of the legitimate Triconex Corporation binaries, we observed a few samples with related string tables.

| DLL Name | Compile Date | Reference CPP Strings Code |
| --- | --- | --- |
| Lagcom40.dll | 2004/11/19 | `$Workfile: LAGSTRS.CPP$ $Modtime: Jul 21 1999 17:17:26$ $Revision: 1.0` |
| Tr1com40.dll | 2006/08/23 | `$Workfile: TR1STRS.CPP$ $Modtime: May 16 2006 09:55:20$ $Revision: 1.4` |
| Tridcom.dll | 2008/07/23 | `$Workfile: LAGSTRS.CPP$ $Modtime: Jul 21 1999 17:17:26$ $Revision: 1.0` |
| Triccom.dll | 2008/07/23 | `$Workfile: TR1STRS.CPP$ $Modtime: May 16 2006 09:55:20$ $Revision: 1.4` |
| Tridcom.dll | 2010/09/29 | `$Workfile: LAGSTRS.CPP$ $Modtime: Jul 21 1999 17:17:26$ $Revision: 1.0` |
| Tr1com.dll | 2011/04/27 | `$Workfile: TR1STRS.CPP$ $Modtime: May 16 2006 09:55:20$ $Revision: 1.4` |
| Lagcom.dll | 2011/04/27 | `$Workfile: LAGSTRS.CPP$ $Modtime: Jul 21 1999 17:17:26$ $Revision: 1.0` |
| Triccom.dll | 2011/04/27 | `$Workfile: TR1STRS.CPP$ $Modtime: May 16 2006 09:55:20$ $Revision: 1.4` |

We extracted the CPP string tables in TR1STRS and LAGSTRS and the TS_cnames.pyc TS_names array from TRITON, and compared the 210, 204, and 212 relevant strings from each respective file.

TS_cnames.pyc TS_names and tr1com40.dll share 202 of 220 combined table strings. The remaining strings are unique to each, as seen here:

| TS_cnames.TS_names (2017 pyc) | Tr1com40.dll (2006 CPP) |
| --- | --- |
| Go to DOWNLOAD mode | <200> |
| Not set | <209> |
| Unk75 | Bad message from module |
| Unk76 | Bad message type |
| Unk77 | Bad TMI version number |
| Unk78 | Module did not respond |
| Unk79 | Open Connection: Invalid SAP %d |
| Unk81 | Unsupported message for this TMI version |
| Unk83 | Wrong command |

TS_cnames.pyc TS_names and Tridcom.dll (1999 CPP) shared only 151 of 268 combined table strings, showing a much smaller overlap with the seemingly older CPP library. This makes sense based on the context that Tridcom.dll is meant for a Trident controller, not a Tricon controller. It does seem as though Tr1com40.dll and TR1STRS.CPP code was based on older work.
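A comparison like this comes down to set arithmetic over the two string tables. The sketch below is not the tooling used for the analysis above; it is a minimal illustration with a hypothetical function name, and it assumes the strings have already been extracted into Python lists.

```python
def overlap(table_a, table_b):
    """Summarize how two string tables overlap.

    'shared' counts strings present in both tables and 'combined' counts the
    union, so a result such as "202 of 220" is shared-of-combined.
    """
    a, b = set(table_a), set(table_b)
    return {
        "shared": len(a & b),
        "combined": len(a | b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }
```

Exact string matching is deliberately strict here: it is what lets verbatim misspellings count as evidence of a shared origin, while any string that differs by even one character falls into the unique columns.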

We are not shocked to find that the threat actor reversed legitimate code to bolster development of the TRITON framework. They want to work smarter, not harder, too. But after reverse engineering legitimate software and implementing the basics of the TriStation, the threat actors still had an incomplete understanding of the protocol. In TRITON's TS_cnames.pyc we saw "Unk75", "Unk76", "Unk83" and other values that were not present in the tr1com40.DLL strings, indicating that the TRITON threat actor may have explored the protocol and annotated their findings beyond what they reverse engineered from the DLL. The gaps in TriStation implementation show us why the actors encountered problems interacting with the Triconex controllers when using TRITON in the wild.

You can see more of the Trilog and Triconex DLL files on VirusTotal.

| Item Name | MD5 | Description |
| --- | --- | --- |
| Tr1com40.dll | 069247df527a96a0e048732ca57e7d3d | Tricom Communications DLL |
| Data1.cab | e6a3c93a6d433cbaf6f573b6c09d76c4 | Parent of Tr1com40.dll |
| Trilog v4.1.360R | 13a3b83ba2c4236ca59aba679941c8a5 | RAR Archive of TriLog |
| TridCom.dll | 5c2ed617fdec4779cb33c89082a43100 | Trident Communications DLL |

#### Afterthoughts

Seeing Triconex systems targeted with malicious intent was new to the world six months ago. Moving forward it would be reasonable to anticipate additional frameworks, such as TRITON, designed for usage against other SIS controllers and associated technologies. If Triconex was within scope, we may see similar attacker methodologies affecting the dominant industrial safety technologies.

Basic security measures do little to thwart truly persistent threat actors and monitoring only IT networks is not an ideal situation. Visibility into both the IT and OT environments is critical for detecting the various stages of an ICS intrusion. Simple detection concepts such as baseline deviation can provide insight into abnormal activity.

While the TRITON framework was actively in use, how many traditional ICS “alarms” were set off while the actors tested their exploits and backdoors on the Triconex controller? How many times did the TriStation protocol, as implemented in their Python scripts, fail or cause errors because of non-standard traffic? How many TriStation UDP pings were sent and how many Connection Requests? How did these statistics compare to the baseline for TriStation traffic? There are no answers to these questions for now. We believe that we can identify these anomalies in the long run if we strive for increased visibility into ICS technologies.
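As one simple way to frame baseline deviation, the sketch below compares per-interval counts of each TriStation message type against historical counts and flags types whose current count sits far from the historical mean. The threshold, the floor on the standard deviation, and the function name are all assumptions for illustration; a real deployment would need tuning against actual traffic.

```python
from statistics import mean, pstdev


def deviations(baseline, observed, threshold=3.0):
    """Flag message types whose observed count deviates from the baseline.

    baseline: {message_type: [count per past interval, ...]}
    observed: {message_type: count in the current interval}
    Returns {message_type: z_score} for types beyond the threshold.
    Message types absent from the baseline are not evaluated here.
    """
    flagged = {}
    for msg_type, history in baseline.items():
        mu = mean(history)
        sigma = max(pstdev(history), 1.0)  # floor keeps flat baselines from over-triggering
        z = (observed.get(msg_type, 0) - mu) / sigma
        if abs(z) > threshold:
            flagged[msg_type] = round(z, 2)
    return flagged
```

For instance, a workstation that normally sends about one Connection Request per interval and suddenly sends forty would be flagged, while ordinary variation in Ping Command counts would not.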

We hope that by holding public discussions about ICS technologies, the Infosec community can cultivate closer relationships with ICS vendors and give the world better insight into how attackers move from the IT to the OT space. We want to foster more conversations like this and generally share good techniques for finding evil. Since most ICS attacks involve standard IT intrusions, we should probably come together to invent and improve any guidelines for how to monitor PCs and engineering workstations that bridge the IT and OT networks. We envision a world where attacking or disrupting ICS operations costs the threat actor their cover, their toolkits, their time, and their freedom. It's an ideal world, but something nice to shoot for.

#### Thanks and Future Work

There is still much to do for TRITON and TriStation. There are many more sub-message types and nuances for parsing out the nitty gritty details, which is hard to do without a controller of our own. And although we’ve published much of what we learned about the TriStation here on the blog, our work will continue as we continue our study of the protocol.

Thanks to everyone who did so much public research on TRITON and TriStation. We have cited a few individuals in this blog post, but there is a lot more community-sourced information that gave us clues and leads for our research and testing of the framework and protocol. We also have to acknowledge the research performed by the TRITON attackers. We borrowed a lot of your knowledge about TriStation from the TRITON framework itself.

Finally, remember that we're here to collaborate. We think most of our research is right, but if you notice any errors or omissions, or have ideas for improvements, please contact: smiller@fireeye.com.

#### Appendix A: TriStation Message Type Codes

The following table consists of hex values at offset 0 in the TriStation UDP packets and the associated dictionary definitions, extracted verbatim from the TRITON framework in library TS_cnames.pyc.

| Value at 0x0 | Message Type |
| --- | --- |
| 1 | Connection Request |
| 2 | Connection Response |
| 3 | Disconnect Request |
| 4 | Disconnect Response |
| 5 | Execution Command |
| 6 | Ping Command |
| 7 | Connection Limit Reached |
| 8 | Not Connected |
| 9 | MPS Are Dead |
| 10 | Access Denied |
| 11 | Connection Failed |
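For quick triage of captured traffic, the codes above can be turned into a lookup. The sketch below is illustrative only: the dictionary and function names are our own, and it assumes the message type is a single byte at offset 0 of the UDP payload and that the listed values are plain integers.

```python
# Message type codes as listed in Appendix A.
TS_MESSAGE_TYPES = {
    1: "Connection Request",
    2: "Connection Response",
    3: "Disconnect Request",
    4: "Disconnect Response",
    5: "Execution Command",
    6: "Ping Command",
    7: "Connection Limit Reached",
    8: "Not Connected",
    9: "MPS Are Dead",
    10: "Access Denied",
    11: "Connection Failed",
}

def message_type(payload: bytes) -> str:
    """Name the message type of a UDP payload, assuming a one-byte code at offset 0."""
    if not payload:
        raise ValueError("empty payload")
    code = payload[0]
    return TS_MESSAGE_TYPES.get(code, f"Unknown (0x{code:02x})")

print(message_type(bytes([6, 0, 0, 0])))
```

Counting the names returned over a capture gives the kind of per-type statistics (pings versus connection requests) that a baseline comparison needs.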

#### Appendix B: TriStation Execution Command Function Codes

The following table consists of hex values at offset 6 in the TriStation UDP packets and the associated dictionary definitions, extracted verbatim from the TRITON framework in library TS_cnames.pyc.

#### FLARE VM Update

###### 14 Nov 2018

FLARE VM is the first-of-its-kind reverse engineering and malware analysis distribution for the Windows platform. Since its introduction in July 2017, FLARE VM has been continuously trusted and used by many reverse engineers, malware analysts, and security researchers as their go-to environment for analyzing malware. Just like the ever-evolving security industry, FLARE VM has gone through many major changes to better support our users’ needs. FLARE VM now has a new installation, upgrade, and uninstallation process, a long-anticipated feature requested by our users. FLARE VM also includes many new tools such as IDA 7.0, radare and YARA. In this post, we would like to share these updates, especially the new installation process.

#### Installation

We strongly recommend you use FLARE VM within a virtualized environment for malware analysis to protect and isolate your physical device and network from malicious activities. We assume you already have experience setting up and configuring your own virtualized environment. Please create a new virtual machine (VM) and perform a fresh installation of Windows. FLARE VM is designed to be installed on Windows 7 Service Pack 1 or newer; therefore, you can select a version of Windows that best suits your needs. From this point forward, all installation steps should be performed within your VM.

Once you have a VM with a fresh installation of Windows, use one of the following URLs to download the compressed FLARE VM repository onto your VM:

Then, use the following steps to install FLARE VM:

1. Decompress the FLARE VM repository to a directory of your choosing.
2. Start a new session of PowerShell with escalated privileges. FLARE VM attempts to install additional software and modify system settings; therefore, escalated privileges are required for installation.
3. Within PowerShell, change directory to the location where you have decompressed the FLARE VM repository.
4. Enable unrestricted execution policy for PowerShell by executing the following command and answering “Y” when prompted by PowerShell: Set-ExecutionPolicy unrestricted
5. Execute the install.ps1 installation script. You will be prompted to enter the current user’s password. FLARE VM needs the current user’s password to automatically login after a reboot when installing. Optionally, you can specify the current user’s password by passing the “-password ” at the command line.

Figure 2: Start PowerShell as administrator

Figure 3: Ready to install FLARE VM

The rest of the installation process is fully automated. Depending upon your internet speed the entire installation may take up to one hour to finish. The VM also reboots multiple times due to the numerous software installations’ requirements. Once the installation completes, the PowerShell prompt remains open waiting for you to hit any key before exiting. After completing the installation, you will be presented with the following desktop environment:

Figure 4: FLARE VM installation completes

Congratulations! You have successfully installed FLARE VM. At this point we recommend you power off the VM, switch the VM networking mode to Host-Only, and then take a snapshot to save a clean state of your analysis VM.

#### Improvement

The biggest improvement for FLARE VM is the ability to perform a proper update and uninstallation. The older version of FLARE VM came as a PowerShell script that installed many Chocolatey packages, one at a time; therefore, we were unable to include new packages when updating FLARE VM. In the past, our users had to reinstall FLARE VM completely, which is time consuming, or manually install the new package, which is error prone. To solve this issue, we have converted FLARE VM itself into a Chocolatey package. Whenever a new tool is available we will also release a new version of FLARE VM. With this new design we can simply execute “choco upgrade all” to get the newest version of FLARE VM along with any new packages we have released. You can also safely uninstall all FLARE VM packages by executing “choco uninstall flarevm.installer.flare”.

Our new FLARE VM is also updated to use Python 3.7 as the default Python interpreter. As a result, many older scripts written for Python 2 may fail to execute. To maintain support for these scripts, we keep Python 2.7 installed in parallel with Python 3.7. We can easily switch between different versions by using the Python launcher. Run “py -2.7 ” to use Python 2.7, or “py ” to use the default Python 3.7 interpreter. For more details on the Python launcher, please refer to the following URL: https://docs.python.org/3/using/windows.html#launcher.

Additionally, the new FLARE VM changes the location where Fakenet-NG saves its output when launched via the shortcut in the FLARE folder or taskbar pin. Instead of saving directly to the desktop, to reduce clutter, Fakenet-NG will store all its output in “Desktop\fakenet_logs”.

Compared to older versions this version of FLARE VM comes with many new tools and software packages. Most notably, this release adds the following:

• IDA Free 7.0
• radare2 to support 64-bit disassembly
• The labs for the Practical Malware Analysis book
• pdfid, pdf-parser, and PdfStreamdumper to analyze malicious PDF documents
• The Malcode Analyst Pack
• Yara for signature matching
• The Cygwin Linux environment on Windows
• PowerShell transcription and script block logging
• PowerShell transcripts can be found in “Desktop\PS_Transcripts”

#### Available Packages

While we attempt to make the tools available as shortcuts within the FLARE folder, there are several available from command-line only. Please see the online documentation for the most up to date list. Here is an incomplete list of some major tools available on FLARE VM:

• Disassemblers:
• IDA Free 5.0 and IDA Free 7.0
• Binary Ninja
• Debuggers:
• OllyDbg and OllyDbg2
• x64dbg
• Windbg
• File format parsers:
• CFF Explorer, PEView, PEStudio
• PdfStreamdumper, pdf-parser, pdfid
• ffdec
• offvis and officemalscanner
• PE-bear
• Decompilers:
• RetDec
• Jd-gui and bytecode-viewer
• dnSpy
• IDR
• VBDecompiler
• Py2ExeDecompiler
• Monitoring tools:
• SysInternal suite
• RegShot
• Utilities:
• Hex Editors (010 editor, HxD and File Insight)
• FLOSS (FireEye Labs Obfuscated String Solver)
• Fakenet-NG
• Yara
• Malware Analyst Pack

#### Conclusion

The FLARE team continues to support and improve FLARE VM to be the de facto distribution for security research, incident response, and malware analysis on the Windows platform. We greatly appreciate the numerous bug reports, tool requests, and feature recommendations from everyone. We hope FLARE VM, along with many other FLARE open source projects, can help you do your work better, easier, and faster.

We are always looking for talented folks to join our team. The FLARE Team may be a good place for you if:

• You eat, sleep, and speak disassembly and malware all day long.
• You would like to push the state of the art for reverse engineering and malware analysis.

Please check out our careers page, or send us an email. Happy Reversing!

#### Phishing Campaign Leverages WOFF Obfuscation and Telegram Channels for Communication

###### 26 Jan 2021

FireEye Email Security recently encountered various phishing campaigns, mostly in the Americas and Europe, using source code obfuscation with compromised or bad domains. These domains were masquerading as authentic websites and stole personal information such as credit card data. The stolen information was then shared to cross-platform, cloud-based instant messaging applications.

Coming off a busy holiday season with a massive surge in deliveries, this post highlights a phishing campaign involving a fake DHL tracking page. While phishing attacks targeting users of shipping services are not new, the techniques used in these examples are more complex than what would be found in an off-the-shelf phishing kit.

This campaign uses a WOFF-based substitution cipher, localization-specific targeting, and various evasion techniques, which we unravel in this blog post.

#### Attack Flow

The attack starts with an email imitating DHL, as seen in Figure 1. The email tries to trick the recipient into clicking on a link, which would take them to a fake DHL website. In Figure 2, we can see the fake page asking for credit card details that, if submitted, would give the user a generic response while in the background the credit card data is shared with the attackers.

Figure 1: DHL phishing attempt

Figure 2: Fake website imitating DHL tracking

This DHL phishing campaign uses a rare technique for obfuscating its source page. The page source contains proper strings, valid tags, and appropriate formatting, but contains encoded text that would render gibberish without decoding prior to loading the page, as seen in Figure 3. Typically, decoding such text is done by including script functions within the code. Yet in this case, the decoding functions are not contained in the script.

Figure 3: Snippet of the encoded text on page source

The decoding is done by a Web Open Font Format (WOFF) font file. It happens when the page is loaded in a browser and is not visible in the page content itself. Figure 4 shows the substitution cipher method and the WOFF font file. The attacker does this to evade detection by security vendors. Many security vendors use static or regex signature-based rules, so this method defeats those simple pattern-matching conditions.

Figure 4: WOFF substitution cipher

Loading this custom font which decodes the text is done inside the Cascading Style Sheets (CSS). This technique is rare as JavaScript functions are traditionally used to encrypt and decrypt HTML text.
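To see why a font can act as a decoder, it helps to model the font as a table from stored characters to displayed glyphs. The sketch below is a toy model under stated assumptions: the real kit's mapping lives inside its WOFF file and is not reproduced here, so a simple 13-letter rotation stands in for it, and all names are hypothetical.

```python
import string

# Hypothetical glyph table. In the real campaign the mapping is defined by the
# WOFF font file; a 13-letter rotation is used here purely as a stand-in.
PLAIN = string.ascii_lowercase
STORED = PLAIN[13:] + PLAIN[:13]

GLYPH_FOR_CODEPOINT = str.maketrans(STORED, PLAIN)   # what the font draws
CODEPOINT_FOR_GLYPH = str.maketrans(PLAIN, STORED)   # what the kit author stores

def encode(visible_text):
    """Produce the text as it would sit in the page source."""
    return visible_text.translate(CODEPOINT_FOR_GLYPH)

def render(source_text):
    """Return what the victim would read once the custom font is applied."""
    return source_text.translate(GLYPH_FOR_CODEPOINT)

in_source = encode("card number")
print(in_source)          # what a static scanner sees
print(render(in_source))  # what the browser displays
```

A signature looking for the string "card number" in the HTML never matches, because that string only exists after the browser has applied the font.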

Figure 5 shows the CSS file used to load the WOFF font file. We have also seen the same CSS file, style.css, being hosted on the following domains:

• hxxps://www.lifepointecc[.]com/wp-content/sinin/style.css
• hxxps://candyman-shop[.]com/auth/DHL_HOME/style.css
• hxxps://mail.rsi-insure[.]com/vendor/ship/dhexpress/style.css
• hxxps://www.scriptarticle[.]com/thro/HOME/style.css

These legitimate-looking domains are not hosting any phishing websites as of now; instead, they appear to be a repository for attackers to use in their phishing campaigns. We have seen similar phishing attacks targeting the banking sector in the past, but this is newer for delivery websites.

#### Notable Techniques

##### Localization

The phishing page displays the local language based on the region of the targeted user. The localization code (Figure 6) supports major languages spoken in Europe and the Americas such as Spanish, English, and Portuguese.

Figure 6: Localization code

The backend contains PHP resource files for each supported language (Figure 7), which are picked up dynamically based on the user’s IP address location.

Figure 7: Language resource files
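The selection logic can be pictured as a simple lookup from the visitor's country to a resource file. This is a sketch only: the file names, the country list, and the English fallback are assumptions for illustration, and the IP-to-country lookup is assumed to have happened already.

```python
# Hypothetical mapping from a visitor's country code to a language resource file.
LANGUAGE_FILES = {
    "ES": "es.php",
    "MX": "es.php",
    "US": "en.php",
    "GB": "en.php",
    "BR": "pt.php",
    "PT": "pt.php",
}
DEFAULT_LANGUAGE_FILE = "en.php"  # assumed fallback, not confirmed by the kit

def language_file_for(country_code):
    """Pick the resource file for a country code produced by an IP geolocation lookup."""
    return LANGUAGE_FILES.get(country_code.upper(), DEFAULT_LANGUAGE_FILE)

print(language_file_for("br"))
```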

##### Evasion

This campaign employs a variety of techniques to evade detection. The kit will not serve a phishing page if the request comes from certain blocked IP addresses. The backend code (Figure 8) serves users an "HTTP/1.1 403 Forbidden" response header under the following conditions:

• IP has been seen five times (AntiBomb_User func)
• IP host resolves to its list of avoided host names ('google', 'Altavista', 'Israel', 'M247', 'barracuda', 'niw.com.au' and more) (AntiBomb_WordBoot func)
• IP is on its own local blocklist csv (x.csv in the kit) (AntiBomb_Boot func)
• IP has been seen POSTing three times (AntiBomb_Block func)

Figure 8: Backend evasion code

After looking at the list of blocked hosts, we could deduce that the attackers were trying to block web crawlers.
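When testing crawlers or detonation systems against this kind of cloaking, it can help to have a small model of the checks. The Python sketch below mirrors the four conditions listed above; it is not the kit's PHP code. The class name, the case-insensitive hostname match, and the choice to block on the request after the limit is reached are all assumptions.

```python
from collections import Counter

# Host name fragments named in the report, lower-cased for matching (the kit lists more).
AVOIDED_HOST_WORDS = ("google", "altavista", "israel", "m247", "barracuda", "niw.com.au")

class CloakingModel:
    """Toy model of the kit's 403 logic, for exercising analysis tooling."""

    def __init__(self, blocklist=(), max_visits=5, max_posts=3):
        self.blocklist = set(blocklist)
        self.max_visits = max_visits
        self.max_posts = max_posts
        self.visits = Counter()
        self.posts = Counter()

    def status_for(self, ip, hostname, method="GET"):
        """Return the HTTP status this model would serve for one request."""
        self.visits[ip] += 1
        if method == "POST":
            self.posts[ip] += 1
        host = hostname.lower()
        blocked = (
            ip in self.blocklist                                    # local blocklist
            or any(word in host for word in AVOIDED_HOST_WORDS)     # avoided host names
            or self.visits[ip] > self.max_visits                    # already seen five times
            or self.posts[ip] > self.max_posts                      # already POSTed three times
        )
        return 403 if blocked else 200

model = CloakingModel(blocklist={"203.0.113.9"})
print(model.status_for("198.51.100.1", "crawl-1.google.com"))
print([model.status_for("198.51.100.2", "home.example") for _ in range(6)])
```

The practical consequence for defenders is that a scanner revisiting the same URL from one address, or from infrastructure with a recognizable reverse DNS name, may only ever see the 403.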

##### Data Theft

The attackers behind this phishing campaign attempted to steal credentials, credit card data, and other sensitive information. The stolen data is sent to email addresses and Telegram channels controlled by the attacker. We uncovered a Telegram channel where data is being sent using the Telegram Bot API shown in Figure 9.

Figure 9: Chat log

While using the PHP mail() function to send stolen credentials is quite common, encrypted instant messaging applications such as Telegram have recently been used to send phished information back to command and control servers.

We were able to access one of the Telegram channels controlled by the attacker as shown in Figure 10. The sensitive information being sent in the chat includes IP addresses and credit card data.

Figure 10: Telegram channel with stolen information
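For analysts who recover a phishing kit, one quick triage step is to search its source for Telegram Bot API endpoints. The sketch below assumes the publicly documented URL shape `api.telegram.org/bot<token>/<method>` with a token of the form `<digits>:<secret>`; the function name and the sample line are hypothetical, and the token in the example is a made-up placeholder.

```python
import re

# Matches the documented Bot API URL shape: api.telegram.org/bot<id>:<secret>/<method>
TELEGRAM_BOT_URL = re.compile(r"api\.telegram\.org/bot(\d+):[A-Za-z0-9_-]+/(\w+)")

def find_bot_api_calls(source):
    """Return (bot_id, method) pairs for every Telegram Bot API URL found in `source`."""
    return TELEGRAM_BOT_URL.findall(source)

# Hypothetical line from a recovered kit; the token is a placeholder.
sample = '$url = "https://api.telegram.org/bot123456:ABC-DEF_ghi/sendMessage?chat_id=1";'
print(find_bot_api_calls(sample))
```

Only the numeric bot identifier is captured, which is usually enough for clustering kits without copying the secret part of the token into reports.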

#### Conclusion

Attackers (and especially phishers) are always on the hunt for new ways to evade detection by security products. Obfuscation gives the attackers an edge, and makes it harder for security vendors to protect their customers.

By using instant messaging applications, attackers get user data in real time, and victims have little time to respond once their personal information is compromised.

#### Indicators of Compromise (IOC)

FireEye Email Security utilizing FAUDE (FireEye Advanced URL Detection Engine) protects customers from these types of phishing threats. Unlike traditional anti-phishing techniques dependent on static inspection of phishing URL content, FAUDE uses multiple artificial intelligence (AI) and machine learning (ML) engines to more effectively thwart these attacks.

From December 2020 until the time of posting, our FAUDE detection engine saw more than 100 unique URLs hosting DHL phishing pages with obfuscated source code, including:

• hxxps://bit[.]ly/2KJ03RH
• hxxps://greencannabisstore[.]com/0258/redirect-new.php
• hxxps://directcallsolutions[.]co[.]za/CONTACT/DHL_HOME/
• hxxp://r.cloudcyberlink[.]digital/ (multiple paths using same domain)
• medmox2k@yandex[.]com
• o.spammer@yandex[.]com
• cameleonanas2@gmail[.]com
• @Saitama330
• @cameleon9
##### style.css
• MD5: 83b9653d14c8f7fb95d6ed6a4a3f18eb
##### font-woff2
• MD5: b051d61b693c76f7a6a5f639177fb820
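The indicators above are defanged (`hxxp` for `http`, `[.]` for `.`) so they cannot be clicked by accident. Blocklists and detection tooling need the live form, so a small conversion helper is often useful. This is a minimal sketch covering only the two conventions used in this post; the function name is our own.

```python
def refang(indicator):
    """Convert a defanged indicator (hxxp scheme, [.] dots) back to its live form."""
    result = indicator.replace("[.]", ".")
    if result.startswith("hxxp"):
        result = "http" + result[4:]
    return result

print(refang("hxxps://example[.]com/path"))
```

Refanged values should stay inside tooling; anything written for people to read is better left defanged.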

#### Highlights

• Perform a case study on using Transformer models to solve cyber security problems
• Train a Transformer model to detect malicious URLs under multiple training regimes
• Compare our model against other deep learning methods, and show it performs on-par with other top-scoring models
• Identify issues with applying generative pre-training to malicious URL detection, which is a cornerstone of Transformer training in natural language processing (NLP) tasks
• Introduce a novel loss function that balances classification and generative loss to achieve improved performance on the malicious URL detection task

#### Introduction

Over the past three years Transformer machine learning (ML) models, or “Transformers” for short, have yielded impressive breakthroughs in a variety of sequence modeling problems, specifically natural language processing (NLP). For example, OpenAI’s latest GPT-3 model is capable of generating long segments of grammatically-correct prose from scratch. Spinoff models, such as those developed for question answering, are capable of correlating context over multiple sentences. AI Dungeon, a single and multiplayer text adventure game, uses Transformers to generate plausible unlimited content in a variety of fantasy settings. Transformers’ NLP modeling capabilities are apparently so powerful that they pose security risks in their own right, in terms of their potential power to spread disinformation. On the other side of the coin, they can be used as powerful tools to detect and mitigate disinformation campaigns. For example, in previous research by the FireEye Data Science team, an NLP Transformer was fine-tuned to detect disinformation on social media sites.

Given the power of these Transformer models, it seems natural to wonder if we can apply them to other types of cyber security problems that do not necessarily involve natural language, per se. In this blog post, we discuss a case study in which we apply Transformers to malicious URL detection. Studying Transformer performance on the URL detection problem is a logical first step to extending Transformers to more generic cyber security tasks, since URLs are not technically natural language sequences but share some common characteristics with NLP.

In the following sections, we outline a typical Transformer architecture and discuss how we adapt it to URLs with a character-focused tokenization. We then discuss loss functions we employ to guide the training of the model, and finally compare our training approaches to more conventional ML-based modeling options.

Our URL Transformer operates at the character level, where each character in the URL corresponds to an input token. When a URL is input to our Transformer, it is appended with special tokens—a classification token (“CLS”) that conditions the model to produce a prediction and padding tokens (“PAD”) that normalize the input to a fixed length to allow for parallel training. Each token in the input string is then projected into a character embedding space, followed by a stack of Attention and Feed-Forward Neural Network (FFNN) layers. This stack of layers is similar to the architecture introduced in the original Transformers paper. At a high level, the Attention layers allow each input to be associated with long-distance context of other characters that are important for the classification task, similar to the notion of attention in humans, while the FFNN layers provide capacity for learning the relationships among the combination of inputs and their respective contexts. An illustration of our architecture is shown in Figure 1.
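The tokenization step can be sketched in a few lines. This is an illustration rather than our production code: the reserved ids, the use of code points as character ids, the fixed length of 16, and the truncation of long URLs are all assumptions made for the example.

```python
PAD, CLS = 0, 1  # ids reserved for the special tokens (assumed values)

def encode_url(url, max_len=16):
    """Map each character to an integer id, append CLS, then right-pad with PAD."""
    # Shift code points by 2 so that ids 0 and 1 stay free for PAD and CLS.
    # Truncating long URLs to leave room for CLS is an assumption of this sketch.
    ids = [ord(ch) + 2 for ch in url[: max_len - 1]]
    ids.append(CLS)
    ids.extend([PAD] * (max_len - len(ids)))
    return ids

print(encode_url("fireeye.com"))
```

Placing CLS after the URL characters means that, under left-to-right attention, its embedding can draw on every character that came before it.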

Additionally, the URL Transformer employs a masking strategy in its Attention calculation, which enforces a left-to-right (L-R) dependence. This means that only input characters from the left of a given character influence that character’s representation in each layer of the attention stack. The network outputs one embedding for each input character, which captures all information learned by the model about the character sequence up to that point in the input.
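The effect of the mask can be shown with plain Python. The sketch below computes attention weights from a square matrix of raw scores while hiding everything to the right of each position. It assumes, as is usual for this kind of mask, that a position may also attend to itself, and it leaves out the learned projections and multiple heads of a real attention layer.

```python
import math

def masked_attention_weights(scores):
    """Row-wise softmax over `scores`, with positions j > i masked out for row i."""
    n = len(scores)
    weights = []
    for i in range(n):
        # Positions to the right of i contribute nothing to row i.
        allowed = [math.exp(scores[i][j]) if j <= i else 0.0 for j in range(n)]
        total = sum(allowed)
        weights.append([w / total for w in allowed])
    return weights

# With equal scores, each character spreads its attention evenly over itself
# and the characters to its left.
for row in masked_attention_weights([[0.0] * 3 for _ in range(3)]):
    print(row)
```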

Once the model is trained, we can use the URL Transformer to perform several different tasks, such as generatively predicting the next character in the input sequence by using the sequence embedding as an input to another neural network with a softmax output over the possible vocabulary of characters. A specific example of this is shown in Figure 1, where we take the embedding of the input “firee” and use it to predict the next most likely character, “y.” Similarly, we can use the embedding produced after the classification token to predict other properties of the input sequences, such as their likelihood of maliciousness.

Figure 1: High-level overview of the URL Transformer architecture
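The final step of next-character prediction, turning the output network's raw scores into a probability distribution and picking the most likely character, looks roughly like this. The three-character vocabulary and the score values are invented for illustration.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def most_likely_next(vocab, logits):
    """Return the highest-probability character and its probability."""
    probs = softmax(logits)
    best = max(range(len(vocab)), key=probs.__getitem__)
    return vocab[best], probs[best]

# Hypothetical scores for the character following "firee".
char, prob = most_likely_next(["a", "y", "z"], [0.1, 2.0, -1.0])
print(char, round(prob, 3))
```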

#### Loss Functions and Training Regimes

With the model architecture in hand, we now turn to the question of how we train the model to most effectively detect malicious URLs. Of course, we can train this model in a similar way to other supervised deep learning classifiers by: (1) making predictions on samples from a labeled training set, (2) using a loss function to measure the quality of our predictions, and (3) tuning model parameters (i.e., weights) via backpropagation. However, the nature of the Transformer model allows for several interesting variations to this training regime. In fact, one of the reasons that Transformers have become so popular for NLP tasks is because they allow for self-supervised generative pre-training, which takes advantage of massive amounts of unlabeled data to help the model learn general characteristics of the input language before being fine-tuned on the ultimate task at-hand (e.g., question answering, sentiment analysis, etc.). Here, we outline some of the training regimes we explored for our URL Transformer model.

##### Direct Label Prediction (Decode-To-Label)

Using a training set of URLs with malicious and benign labels, we can treat the URL Transformer architecture as a feature extractor, whose outputs we use as the input to a traditional classifier (e.g., FFNN or even a random forest). When using a FFNN as our classifier, we can backpropagate the classification loss (e.g., binary cross-entropy) through both the classifier and the Transformer network to adjust the weights to perform classification. This training regime is the baseline for our experiments and is how most deep learning models are trained for classification tasks.

##### Next-Character Prediction Pre-Training and Fine-Tuning

Beyond the baseline classification training regime, the NLP literature suggests that one can learn a self-supervised embedding of the input sequence by training the Transformer to perform a next-character prediction task, then fine-tuning the learned representation for the classification problem. A key advantage of this approach is that data used for pre-training does not require malicious or benign labels; instead, the next characters in a URL serve as the labels to be predicted from prior characters in the sequence. This is similar to the example given in Figure 1, where the embedding output is used to predict the next character, “y,” in “fireeye.com.” Overall, this training regime allows us to take advantage of the massive amount of unlabeled data that is typically available in cyber security-related problems.

The overall structure of the architecture for this regime is similar to the aforementioned binary classification task, with FFNN layers added for classification. However, since we are now predicting multiple classes (i.e., one class per input character in the vocabulary), we must apply a softmax function to the output to induce a probability distribution over the potential output characters. Once the Transformer portion of the network is pre-trained in this way, we can swap the FFNN classification layers focused on character prediction with new layers that will be trained for the malicious URL classification problem, as in the decode-to-label case.

##### Balanced Mixed-Objective Training

Prior work has shown that imbuing the training process with additional knowledge outside of the primary task can help constrain the learning process, and ultimately result in better models. For instance, a malware classifier might train using loss functions that capture malicious/benign classification, malware family prediction, and tag prediction tasks as a mechanism to provide the classifier with broader understanding of the problem than looking at malicious/benign labels in isolation.

Inspired by these findings, we also introduced a mixed-objective training regime for our URL Transformer, where we train for binary classification and next-character prediction simultaneously. At each iteration of training, we compute a loss multiplier such that each loss contribution is fixed prior to backpropagation. This ensures that neither loss term dominates during training. Specifically, for minibatch i, let the net loss LMixed be computed as follows:

Given hyperparameters a and b, defined such that a + b := 1, we compute a constant multiplier so that the net loss contribution of LCLS to LMixed is a and the net contribution of LNext to LMixed is b. For our evaluations, we set a := b := 0.5, effectively requiring that the model equally balance its ability to generate the next character and accurately predict malicious URLs.
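One way to realize such a balance is sketched below. This is an assumed formulation that is consistent with the description above, not necessarily the exact expression used in our experiments: the classification loss is kept as is, and the next-character loss is scaled by a per-minibatch multiplier chosen so that the two terms make up fractions a and b of the total. The function and variable names are hypothetical.

```python
def mixed_loss(loss_cls, loss_next, a=0.5, b=0.5):
    """Combine two minibatch losses so their shares of the total are a and b.

    Returns the combined loss and the multiplier applied to loss_next.
    """
    if abs(a + b - 1.0) > 1e-9:
        raise ValueError("a and b must sum to 1")
    if loss_next == 0:
        return loss_cls, 0.0
    # The multiplier is recomputed for every minibatch and treated as a constant,
    # so in an autodiff framework no gradient would flow through it.
    multiplier = (b / a) * (loss_cls / loss_next)
    return loss_cls + multiplier * loss_next, multiplier

# A small classification loss next to a much larger next-character loss.
total, multiplier = mixed_loss(0.2, 4.0)
print(total, multiplier, 0.2 / total)
```

Without the rescaling, the larger loss would dominate the gradient simply because of its magnitude, regardless of how the two objectives are meant to be weighted.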

#### Evaluation

To evaluate our URL Transformer model and better understand the impact of the three training regimes discussed earlier, we collected a training dataset of over 1M labeled malicious and benign URLs, which was split into roughly 700K training samples, 100K validation samples, and 200K test samples. We also developed an unlabeled pre-training dataset of 20M URLs.

Using this data, we performed four different training runs for our Transformer model:

1. DecodeToLabel (Baseline): Using strictly the binary cross-entropy loss on the embedded classification features over the entire sequence, we trained the model for 15 epochs using the training set.
2. MixedObjective: We trained the model for 15 epochs on the training set, using both the embedded classification features and the embedded next-character prediction features.
3. FineTune: We pre-trained the model for 15 epochs on the next-character prediction task using the training set, ignoring the malicious/benign labels. We then froze weights over the first 16 layers of the model and trained the model for an additional 15 epochs using a binary cross-entropy loss on the classification labels.
4. FineTune 20M: We performed pre-training on the next-character prediction task using the 20M URL dataset, pre-training for 2 epochs. We then froze weights over the first 16 layers of the Transformer and trained for 15 epochs on the binary classification task.

The ROC curve shown in Figure 2 compares the performance of these four training regimes. Here, our baseline DecodeToLabel model (red) yielded a ROC curve with 0.9484 AUC, while the MixedObjective model (green) slightly outperformed the baseline with an AUC of 0.956. Interestingly, both of the fine-tuning models yielded poor classification results, which is counter to the established practice of these Transformer models in the NLP domain.

Figure 2: ROC curves for four URL Transformer training regimes
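For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen malicious URL receives a higher score than a randomly chosen benign one. The sketch below computes it directly from that definition on a handful of made-up scores; it compares every pair, so it is meant for illustration rather than for datasets of the size used here.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (malicious, benign) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = malicious) and model scores.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```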

To assess the relative efficacy of our Transformer models on this dataset, we also fit several other types of benchmark models developed for URL classification: (1) a Random Forest model on SME-derived features, (2) a 1D Convolutional Neural Network (CNN) model on character embeddings, and (3) a Long Short-Term Memory (LSTM) neural network on character embeddings. Details of these models can be found in our white paper. We find that our top performing Transformer model performs on-par with the best performing non-Transformer baseline (a 1D CNN model), which perhaps indicates that the long-range dependencies typically learned by Transformer models are not as useful in the case of malicious URL detection.

Figure 3: ROC curves comparing URL Transformer to other benchmark URL classification models

#### Summary

Our experiments suggest that Transformers can achieve performance comparable to or better than that of other top-performing models for URL classification, though the details of how to achieve that performance differ from common practice. Contrary to findings from the NLP domain, wherein self-supervised pre-training substantially enhances performance in a fine-tuned classification task, similar pretraining approaches actually diminish performance for malicious URL detection. This suggests that the next character prediction task has too little apparent correlation with the task of malicious/benign prediction for effective/stable transfer.

Interestingly, utilizing next-character prediction as an auxiliary loss function in conjunction with a malicious/benign loss yields improvements over training solely to predict the label. We hypothesize that while pre-training leads to a relatively poor generative model due to randomized content in the URLs within our dataset, a malicious/benign loss may serve to better condition the generative model learned by the next-character prediction task, distilling a subset of relevant information. It may also be the case that the long-distance relationships that are key to the generative pre-training task are not as important for the final malicious URL classification, as evidenced by the performance of the 1D CNN model.

Note that we did not perform a rigorous hyperparameter search for our Transformer, since this research was primarily concerned with loss functions and training regimes. Therefore, it is still an open question as to whether a more optimal architecture, specifically designed for this classification task, could substantially outperform the models described here.

While our URL dataset is not representative of all data in the cyber security space, the difficulty of obtaining a readily fine-tuned model from self-supervised pre-training suggests that this approach is unlikely to work well for training Transformers on longer sequences or sequences with lesser resemblance to natural language (e.g., PE files), but an auxiliary loss might work.

#### Emulation of Kernel Mode Rootkits With Speakeasy

###### 20 Jan 2021

In August 2020, we released a blog post about how the Speakeasy emulation framework can be used to emulate user mode malware such as shellcode. If you haven’t had a chance, give the post a read today.

In addition to user mode emulation, Speakeasy also supports emulation of kernel mode Windows binaries. When malware authors employ kernel mode malware, it will often be in the form of a device driver whose end goal is total compromise of an infected system. The malware most often doesn’t interact with hardware and instead leverages kernel mode to fully compromise the system and remain hidden.

#### Challenges With Dynamically Analyzing Kernel Malware

Ideally, a kernel mode sample can be reversed statically using tools such as disassemblers. However, binary packers just as easily obfuscate kernel malware as they do user mode samples. Additionally, static analysis is often expensive and time consuming. If our goal is to automatically analyze many variants of the same malware family, it makes sense to dynamically analyze malicious driver samples.

Dynamic analysis of kernel mode malware can be more involved than with user mode samples. In order to debug kernel malware, a proper environment needs to be created. This usually involves setting up two separate virtual machines as debugger and debuggee. The malware can then be loaded as an on-demand kernel service where the driver can be debugged remotely with a tool such as WinDbg.

Several sandbox style applications exist that use hooking or other monitoring techniques but typically target user mode applications. Having similar sandbox monitoring work for kernel mode code would require deep system level hooks that would likely produce significant noise.

#### Driver Emulation

Emulation has proven to be an effective analysis technique for malicious drivers. No custom setup is required, and drivers can be emulated at scale. In addition, maximum code coverage is easier to achieve than in a sandbox environment. Often, rootkits may expose malicious functionality via I/O request packet (IRP) handlers (or other callbacks). On a normal Windows system these routines are executed when other applications or devices send input/output requests to the driver. This includes common tasks such as reading, writing, or sending device I/O control (IOCTLs) to a driver to execute some type of functionality.

Using emulation, these entry points can be called directly with crafted IRPs in order to identify as much functionality as possible in the rootkit. As we discussed in the first Speakeasy blog post, additional entry points are emulated as they are discovered. A driver’s DriverEntry routine is responsible for initializing a function dispatch table that is called to handle I/O requests. Speakeasy will attempt to emulate each of these functions after the main entry point has completed by supplying a dummy IRP. Additionally, any system threads or work items that are created are sequentially emulated in order to get as much code coverage as possible.
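To make the dispatch-table idea concrete, here is a minimal Python sketch of the approach described above. It is a toy model, not Speakeasy's implementation: `ToyDriver`, `DummyIrp` and `emulate_all_handlers` are hypothetical names, and only the `IRP_MJ_*` values and the status code are taken from the Windows headers.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Major function codes as defined in the Windows Driver Kit headers (wdm.h).
IRP_MJ_CREATE = 0x00
IRP_MJ_DEVICE_CONTROL = 0x0E

STATUS_SUCCESS = 0x00000000
STATUS_INVALID_DEVICE_REQUEST = 0xC0000010


@dataclass
class DummyIrp:
    """Hypothetical stand-in for an I/O request packet."""
    major_function: int
    ioctl_code: int = 0


@dataclass
class ToyDriver:
    """Hypothetical driver: its entry routine fills the dispatch table."""
    dispatch: Dict[int, Callable[[DummyIrp], int]] = field(default_factory=dict)

    def driver_entry(self) -> None:
        self.dispatch[IRP_MJ_CREATE] = lambda irp: STATUS_SUCCESS
        self.dispatch[IRP_MJ_DEVICE_CONTROL] = self.on_ioctl

    def on_ioctl(self, irp: DummyIrp) -> int:
        # A real rootkit would branch on the IOCTL code here.
        if irp.ioctl_code == 0:
            return STATUS_SUCCESS
        return STATUS_INVALID_DEVICE_REQUEST


def emulate_all_handlers(driver: ToyDriver) -> List[str]:
    """Run the entry routine, then call every registered handler once."""
    driver.driver_entry()
    log = []
    for major, handler in sorted(driver.dispatch.items()):
        status = handler(DummyIrp(major_function=major))
        log.append(f"irp_mj=0x{major:02x} status=0x{status:08x}")
    return log
```

Calling `emulate_all_handlers(ToyDriver())` exercises both handlers without any user mode client having to send a request, which is the coverage advantage emulation has over waiting for real I/O.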

#### Emulating a Kernel Mode Implant

In this blog post, we will show an example of Speakeasy’s effectiveness at emulating a real kernel mode implant family publicly named Winnti. This sample was chosen despite its age because it transparently implements some classic rootkit functionality. The goal of this post is not to discuss the analysis of the malware itself as it is fairly antiquated. Rather, we will focus on the events that are captured during emulation.

The Winnti sample we will be analyzing has SHA256 hash c465238c9da9c5ea5994fe9faf1b5835767210132db0ce9a79cb1195851a36fb and the original file name tcprelay.sys. For most of this post, we will be examining the emulation report generated by Speakeasy. Note: many techniques employed by this 32-bit rootkit will not work on modern 64-bit versions of Windows due to Kernel Patch Protection (PatchGuard) which protects against modification of critical kernel data structures.

To start, we will instruct Speakeasy to emulate the kernel driver using the command line shown in Figure 1. We instruct Speakeasy to create a full memory dump (using the “-d” flag) so we can acquire memory later. We supply the memory tracing flag (“-m”) which will log all memory reads and writes performed by the malware. This is useful for detecting things like hooking and direct kernel object manipulation (DKOM).

Figure 1: Command line used to emulate the malicious driver

Speakeasy will then begin emulating the malware’s DriverEntry function. The entry point of a driver is responsible for setting up passive callback routines that will service user mode I/O requests as well as callbacks used for device addition, removal, and unloading. Reviewing the emulation report for the malware’s DriverEntry function (identified in the JSON report with an “ep_type” of “entry_point”) shows that the malware finds the base address of the Windows kernel. The malware does this by using the ZwQuerySystemInformation API to locate the base address for all kernel modules and then looking for one named “ntoskrnl.exe”. The malware then manually finds the address of the PsCreateSystemThread API, which it uses to spin up a system thread to perform its actual functionality. Figure 2 shows the APIs called from the malware's entry point.

Figure 2: Key functionality in the tcprelay.sys entry point
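A report like this can be filtered with a few lines of Python. The sketch below is illustrative only: the post names the “ep_type” and “mem_access” fields, while `entry_points`, `apis` and `api_name` are assumed key names, and the embedded report is a hand-written sample rather than real Speakeasy output.

```python
import json

# Hypothetical, heavily trimmed report. Only "ep_type" is named in the post;
# "entry_points", "apis" and "api_name" are assumed field names.
SAMPLE_REPORT = json.loads("""
{
  "entry_points": [
    {"ep_type": "entry_point",
     "apis": [{"api_name": "ZwQuerySystemInformation"},
              {"api_name": "PsCreateSystemThread"}]},
    {"ep_type": "system_thread", "apis": []}
  ]
}
""")


def apis_for(report, ep_type):
    """Collect API names logged under every entry point of the given type."""
    names = []
    for ep in report.get("entry_points", []):
        if ep.get("ep_type") == ep_type:
            names.extend(api["api_name"] for api in ep.get("apis", []))
    return names
```

With the sample above, `apis_for(SAMPLE_REPORT, "entry_point")` returns the two API names logged for the driver's entry point. Adjust the key names to match the report your Speakeasy version actually produces.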

#### Hiding the Driver Object

The malware attempts to hide itself before executing its main system thread. The malware first looks up the “DriverSection” field in its own DRIVER_OBJECT structure. This field points to the driver’s entry in a linked list of all loaded kernel modules, and the malware attempts to unlink that entry to hide from APIs that list loaded drivers. In the “mem_access” field in the Speakeasy report shown in Figure 3, we can see two memory writes, one to the DriverSection entry before the malware’s own and one to the entry after it, which remove the malware from the linked list.

Figure 3: Memory write events representing the tcprelay.sys malware attempting to unlink itself in order to hide
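The reason exactly two writes appear is a property of doubly linked lists. The following Python sketch models the unlink with a toy `ListEntry` class (a hypothetical stand-in for the kernel's LIST_ENTRY, with `flink` and `blink` mirroring its forward and backward links); the module names are illustrative.

```python
class ListEntry:
    """Toy model of a doubly linked list node (forward and backward links)."""

    def __init__(self, name):
        self.name = name
        self.flink = self  # forward link
        self.blink = self  # backward link


def insert_tail(head, entry):
    """Append an entry at the end of the circular list anchored at head."""
    entry.flink = head
    entry.blink = head.blink
    head.blink.flink = entry
    head.blink = entry


def unlink(entry):
    """Remove an entry with exactly two writes, one into each neighbour."""
    entry.blink.flink = entry.flink  # write 1: previous entry skips forward
    entry.flink.blink = entry.blink  # write 2: next entry skips backward


def walk(head):
    """List the names reachable by following forward links from head."""
    names = []
    node = head.flink
    while node is not head:
        names.append(node.name)
        node = node.flink
    return names


head = ListEntry("head")
modules = {n: ListEntry(n) for n in ("ntoskrnl.exe", "tcprelay.sys", "netio.sys")}
for entry in modules.values():
    insert_tail(head, entry)

unlink(modules["tcprelay.sys"])
```

After the unlink, `walk(head)` no longer yields `tcprelay.sys`, even though the node itself still exists in memory. Any tool that enumerates drivers by walking the list is blind to it, which is why the two writes are such a useful indicator.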

As noted in the original Speakeasy blog post, when threads or other dynamic entry points are created at runtime, the framework will follow them for emulation. In this case, the malware created a system thread and Speakeasy automatically emulated it.

Moving on to the newly created thread (identified by an “ep_type” of “system_thread”), we can see the malware begin its real functionality. The malware begins by enumerating all running processes on the host, looking for the service controller process named services.exe. It's important to note that the process listing that gets returned to the emulated samples is configurable via JSON config files supplied at runtime. For more information on these configuration options please see the Speakeasy README on our GitHub repository. An example of this configurable process listing is shown in Figure 4.

Figure 4: Process listing configuration field supplied to Speakeasy

#### Pivoting to User Mode

Once the malware locates the services.exe process, it will attach to its process context and begin inspecting user mode memory in order to locate the addresses of exported user mode functions. The malware does this so it can later inject an encoded, memory-resident DLL into the services.exe process. Figure 5 shows the APIs used by the rootkit to resolve its user mode exports.

Figure 5: Logged APIs used by tcprelay.sys rootkit to resolve exports for its user mode implant

Once the exported functions are resolved, the rootkit is ready to inject the user mode DLL component. Next, the malware manually copies the in-memory DLL into the services.exe process address space. These memory write events are captured and shown in Figure 6.

Figure 6: Memory write events captured while copying the user mode implant into services.exe

A common technique that rootkits use to execute user mode code involves a Windows feature known as Asynchronous Procedure Calls (APC). APCs are functions that execute asynchronously within the context of a supplied thread. Using APCs allows kernel mode applications to queue code to run within a thread’s user mode context. Malware often wants to inject into user mode since much of the common functionality (such as network communication) within Windows can be more easily accessed. In addition, by running in user mode, there is less risk of being detected in the event of faulty code bug-checking the entire machine.

In order to queue an APC to fire in user mode, the malware must locate a thread in an “alertable” state. Threads are said to be alertable when they relinquish their execution quantum to the kernel thread scheduler and notify the kernel that they are able to dispatch APCs. The malware searches for threads within the services.exe process and, once it detects one that’s alertable, allocates memory for the DLL to inject and then queues an APC to execute it.
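The search-then-queue logic can be sketched as a toy model in Python. Nothing here touches real kernel structures: `ToyThread`, `find_alertable`, `queue_apc` and `deliver_apcs` are hypothetical names used only to illustrate the control flow.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ToyThread:
    """Hypothetical thread model with an alertable flag and an APC queue."""
    tid: int
    alertable: bool
    apc_queue: List[Callable[[], str]] = field(default_factory=list)


def find_alertable(threads: List[ToyThread]) -> Optional[ToyThread]:
    """Return the first thread that reports itself as alertable, if any."""
    for thread in threads:
        if thread.alertable:
            return thread
    return None


def queue_apc(thread: ToyThread, routine: Callable[[], str]) -> None:
    """Queue a routine to run later in the target thread's context."""
    thread.apc_queue.append(routine)


def deliver_apcs(thread: ToyThread) -> List[str]:
    """Run and drain queued routines; non-alertable threads run nothing."""
    if not thread.alertable:
        return []
    results = [routine() for routine in thread.apc_queue]
    thread.apc_queue.clear()
    return results


threads = [ToyThread(tid=4, alertable=False), ToyThread(tid=8, alertable=True)]
target = find_alertable(threads)
if target is not None:
    queue_apc(target, lambda: "shellcode stub ran")
```

The model captures why the alertable check matters: a routine queued to a thread that never becomes alertable would simply sit in the queue and never execute.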

Speakeasy emulates all kernel structures involved in this process, specifically the executive thread object (ETHREAD) structures that are allocated for every thread on a Windows system. Malware may attempt to grovel through this opaque structure to identify when a thread’s alertable flag is set (and therefore a valid candidate for an APC). Figure 7 shows the memory read event that was logged when the Winnti malware manually parsed an ETHREAD structure in the services.exe process to confirm it was alertable. At the time of this writing, all threads within the emulator present themselves as alertable by default.

Figure 7: Event logged when the tcprelay.sys malware confirmed a thread was alertable

Next, the malware can execute any user mode code it wants using this thread object. The undocumented functions KeInitializeApc and KeInsertQueueApc will initialize and queue a user mode APC respectively. Figure 8 shows the API set that the malware uses to inject a user mode module into the services.exe process. The malware executes a shellcode stub as the target of the APC that will then execute a loader for the injected DLL. All of this can be recovered from the memory dump package and analyzed later.

Figure 8: Logged APIs used by tcprelay.sys rootkit to inject into user mode via an APC

#### Network Hooks

After injecting into user mode, the kernel component will attempt to install network obfuscation hooks (presumably to hide the user mode implant). Speakeasy tracks and tags all memory within the emulation space. In the context of kernel mode emulation, this includes all kernel objects (e.g. Driver and Device objects, and the kernel modules themselves). Immediately after we observe the malware inject its user mode implant, we see it begin to attempt to hook kernel components. This was confirmed during static analysis to be used for network hiding.

The memory access section of the emulation report reveals that the malware modified the netio.sys driver, specifically code within the exported function named NsiEnumerateObjectsAllParametersEx. This function is ultimately called when a user on the system runs the “netstat” command and it is likely that the malware is hooking this function in order to hide connected network ports on the infected system. This inline hook was identified by the event captured in Figure 9.

Figure 9: Inline function hook set by the malware to hide network connections

In addition, the malware hooks the Tcpip driver object in order to accomplish additional network hiding. Specifically, the malware hooks the IRP_MJ_DEVICE_CONTROL handler for the Tcpip driver. User mode code may send IOCTL codes to this function when querying for active connections. This type of hook can be easily identified with Speakeasy by looking for memory writes to critical kernel objects as shown in Figure 10.

Figure 10: Memory write event used to hook the Tcpip network driver
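Because every region in the emulation space is tagged, spotting both kinds of hook reduces to asking which writes land in memory the sample does not own. The Python sketch below shows that idea under stated assumptions: the tag strings, addresses and event keys (`access`, `address`) are invented for illustration and do not reflect Speakeasy's actual report schema.

```python
# Hypothetical memory map of (start, end, tag). Tags and addresses are made up.
MEMORY_MAP = [
    (0x10000, 0x20000, "module.tcprelay"),
    (0x30000, 0x40000, "module.netio"),
    (0x50000, 0x50200, "object.driver.Tcpip"),
]


def tag_for(address):
    """Return the tag of the region containing address, or None."""
    for start, end, tag in MEMORY_MAP:
        if start <= address < end:
            return tag
    return None


def suspicious_writes(events, own_tag):
    """Flag writes that land in tagged memory belonging to something else."""
    hits = []
    for event in events:
        if event["access"] != "write":
            continue
        tag = tag_for(event["address"])
        if tag is not None and tag != own_tag:
            hits.append((hex(event["address"]), tag))
    return hits


EVENTS = [
    {"access": "write", "address": 0x10040},  # sample writing to itself
    {"access": "read", "address": 0x30010},   # reads are not hooks
    {"access": "write", "address": 0x30010},  # inline hook in another module
    {"access": "write", "address": 0x50070},  # pointer overwrite in an object
]
```

Here `suspicious_writes(EVENTS, "module.tcprelay")` reports the last two events and ignores the sample's writes to its own image. A production rule would also want an allow list for legitimate cross-module writes.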

#### System Service Dispatch Table Hooks

Finally, the rootkit will attempt to hide itself using the nearly ancient technique of system service dispatch table (SSDT) patching. Speakeasy allocates a fake SSDT so malware can interact with it. The SSDT is a function table that exposes kernel functionality to user mode code. The event in Figure 11 shows that the SSDT structure was modified at runtime.

Figure 11: SSDT hook detected by Speakeasy

If we look at the malware in IDA Pro, we can confirm that the malware patches the SSDT entries for the ZwQueryDirectoryFile and ZwEnumerateKey APIs, which it uses to hide itself from file system and registry analysis. The SSDT patch function is shown in Figure 12.

Figure 12: File hiding SSDT patching function shown in IDA Pro
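Conceptually, an SSDT hook swaps a function pointer in a table and filters what the original returns. The toy Python model below illustrates that pattern and one simple way to detect it. The dictionary standing in for the SSDT, the fake directory listing, and the helper names are all hypothetical.

```python
def zw_query_directory_file(path):
    """Toy stand-in for the real service: returns a fixed directory listing."""
    return ["ntoskrnl.exe", "tcprelay.sys", "notepad.exe"]


# Toy service table mapping names to implementations.
ssdt = {"ZwQueryDirectoryFile": zw_query_directory_file}
baseline = dict(ssdt)  # snapshot taken before any patching


def install_hook(table, name, hidden):
    """Replace a table entry with a wrapper that filters hidden names."""
    original = table[name]

    def hooked(*args):
        return [entry for entry in original(*args) if entry not in hidden]

    table[name] = hooked
    return original


def detect_hooks(table, snapshot):
    """Report entries whose target differs from the snapshot."""
    return [name for name, fn in table.items() if snapshot[name] is not fn]


install_hook(ssdt, "ZwQueryDirectoryFile", hidden={"tcprelay.sys"})
```

After the hook is installed, callers going through the table no longer see `tcprelay.sys`, while `detect_hooks(ssdt, baseline)` flags the modified entry. Comparing against a known-good snapshot is the same principle as watching for writes to the emulated table.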

After setting up these hooks, the system thread will exit. The other entry points (such as the IRP handlers and DriverUnload routines) in the driver are less interesting and contain mostly boilerplate driver code.

#### Acquiring the Injected User Mode Implant

Now that we have a good idea what the driver does to hide itself on the system, we can use the memory dumps created by Speakeasy to acquire the injected DLL discussed earlier. Opening the zip file we created at emulation time, we can find the memory tag referenced in Figure 6. We quickly confirm the memory block has a valid PE header and it successfully loads into IDA Pro as shown in Figure 13.

Figure 13: Injected user mode DLL recovered from Speakeasy memory dump
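The PE header check can be automated across every block in a dump archive. The following Python sketch assumes only that the dump is a zip file whose entries are raw memory blocks; the entry naming inside a real Speakeasy dump is not assumed, and `looks_like_pe` and `find_pe_entries` are hypothetical helper names.

```python
import io
import struct
import zipfile


def looks_like_pe(data: bytes) -> bool:
    """Check for the DOS 'MZ' magic and a 'PE\\0\\0' signature at e_lfanew."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew is a little-endian 32-bit offset stored at 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return data[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"


def find_pe_entries(zip_bytes: bytes):
    """Return the names of archive entries that begin with a PE header."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [
            name for name in archive.namelist()
            if looks_like_pe(archive.read(name))
        ]
```

To run this against a dump on disk, read the file as bytes and pass it to `find_pe_entries`. This only checks that a header is present; a block that passes may still need its sections realigned before a disassembler loads it cleanly.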

#### Conclusion

In this blog post, we discussed how Speakeasy can be effective at automatically identifying rootkit activity from the kernel mode binary. Speakeasy can be used to quickly triage kernel binaries that may otherwise be difficult to dynamically analyze. For more information and to check out the code, head over to our GitHub repository.

## Malware Must Die

#### MMD-067-2021 - Recent talks on Linux process injection and shellcode analysis series at R2CON-2020, ROOTCON-14 2020 from HACK.LU-2019

###### 03 Mar 2021

Tag: Linux, LinuxSecurity, Memory Forensics, RE, ReverseEngineering, DFIR, Fileless, ProcessInjection, Shellcode, Exploit, PostExploitation, BlueTeaming, HandsOut, Demo, Video, Slides, Presentation The background of these research and talks After HACK.LU-2019's talk in 2019 [link], I was asked a lot of questions about Linux process injection that can trigger code execution and yes, one of

#### MMD-0066-2020 - Linux/Mirai-Fbot - A re-emerged IoT threat

###### 24 Feb 2020

Chapters: [TelnetLoader] [EchoLoader] [Propagation] [NewActor] [Epilogue] Prologue A month ago I wrote about IoT malware for Linux operating system, a Mirai botnet's client variant dubbed as FBOT. The writing [link] was about reverse engineering Linux ELF ARM 32bit to dissect the new encryption that has been used by their January's bot binaries, The threat had been on vacuum state for almost

#### MMD-0065-2020 - Linux/Mirai-Fbot's new encryption explained

###### 15 Jan 2020

Prologue [For the most recent information of this threat please follow this ==> link] I setup a local brand new ARM base router I bought online around this new year 2020 to replace my old pots, and yesterday, it was soon pwned by malware and I had to reset it to the factory mode to make it work again (never happened before). When the "incident" occurred, the affected router wasn't dead but it

#### More about my 2019.HACK.LU Keynote talk

###### 28 Oct 2019

As promised, this is my additional notes and review about my Keynote talk in 2019.HACK.LU (link). My keynote talk title is very long actually, but it explained the description of the whole slides clearly. What was presented is about TODAY's Linux post exploitation, process injection, fileless execution from infrastructures and components that has been supporting those activities, based on the

#### MMD-0064-2019 - Linux/AirDropBot

###### 28 Sep 2019

Prologue There are a lot of botnet aiming multiple architecture of Linux basis internet of thing, and this story is just one of them, but I haven't seen the one was coded like this before. Like the most of other posts of our analysis reports in MalwareMustDie blog, this post has been started from a friend's request to take a look at a certain Linux executable malicious binary that was having a

#### MMD-0063-2019 - Summary of 3 years MMD research (Sept 2016-Sept 2019)

###### 21 Sep 2019

Hello, it's unixfreaxjp here. It has been a while since I wrote our own blog, and it is good to be back. Thank you for your patience for all of this time. If you want to see what we were doing during all of our silence time just click this link The background / TLDR It was in September 2016 when we decided to move our blog and since then myself and the team had a lot of fun in learning and

#### MMD-0062-2017 - Credential harvesting by SSH Direct TCP Forward attack via IoT botnet

###### 08 Mar 2017

Sticky note: We call this threat as "Strudels Attack" 1. Background In this post there is no malicious software/malware analyzed, but this is one of the impact of the malware infecting IoT devices caused by weak credentials that are utilized by the bad actors for bigger crime process. The only malicious aspect written in the post is/are individual(s) involved and participated to these attacks,

#### MMD-0061-2016 - EnergyMech 2.8 overkill mod

###### 03 Dec 2016

This is a new threat analysis report I wrote in MalwareMustDie blog (this) after we moved out from blogger, I hope you like the new blog system and design, and enjoy the post! An unattended or abandoned Linux/UNIX system with its web service online (specially with the CGI function intact) with not having recent updates can be soon be exploited and infected by Linux malware. Scanner for

#### MMD-0060-2016 - Linux/UDPfker and ChinaZ threat today

###### 30 Oct 2016

Background ChinaZ is the PRC (Public Rep of China) actor's made Linux ELF DDoS malware and its service. This threat has been covered several times in this blog post, several takedown efforts also had been taken, yet the threat is still lurking us, until now. Using specific indicators used during their infection effort, I can manage to trace the overall activity and their activity has been

#### MMD-0059-2016 - Linux/IRCTelnet (new Aidra) - A DDoS botnet aims IoT w/ IPv6 ready

###### 29 Oct 2016

It's a Kaiten/Tsunami? No.. STD?? No! It's a GayFgt/Torlus/Qbot? No!! Is it Mirai?? NO!! It's a Linux/IRCTelnet (new Aidra)! ..a new coded IoT DDoS botnet's Linux malware.. Summary This post is a report of what it seems to be a new IRC botnet ELF malware, that is obviously used for performing DDoS attack via IRC botnet. It was coded with partially is having specification as per Tsunami/Kaiten

###### 14 Oct 2016

Background Since the end of September 2016 I received a new type of attacks that aims the MIPS platform I provided to detect IoT attacks. I will call this threat as new ELF Linux/NyaDrop as per the name used by threat actor himself, for the "nyadrop" binary that is dropped in the compromised system. This is not the "really" first time we're seeing this threat actually, in this year, some

#### MMD-0057-2016 - Linux/LuaBot - IoT botnet as service

###### 06 Sep 2016

Background On Mon, Aug 29, 2016 at 5:07 PM I received this ELF malware sample from a person (thank you!). There wasn't any detail or comment what so ever just one cute little ARM ELF stripped binary file with following data: arm_lsb: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, stripped hash: a220940db4be6878e47b74403a8079a1 This is a cleanly GCC: (GNU) 5.3.x

#### MMD-0056-2016 - Linux/Mirai, how an old ELF malcode is recycled..

###### 01 Sep 2016

Our recent analysis about Mirai is in here==>[Link] Background From August 4th 2016 several sysadmin friends were helping us by uploading these malware files to our dropbox. The samples of this particular ELF malware were not easy to retrieve, there are good ones and also some broken ones, I listed in this post for the good ones only. This threat is made by a new ELF trojan backdoor which is now

#### MMD-0055-2016 - Linux/PnScan ; ELF worm that still circles around

###### 24 Aug 2016

Background Just checked around internet and found an interesting ELF worm distribution that may help raising awareness for fellow sysadmins. As per shown in title, it's a known ELF malware threat, could be a latest variant of "Linux/PnScan", found in platform x86-32 that it seems run around the web within infected nodes before it came to my our hand. This worm is more aiming embed platform and I

#### MMD-0054-2016 - ATMOS botnet facts you should know

###### 07 Jun 2016

The background This post is about recent intelligence and sharing information of the currently emerged credential stealer and spying botnet named "Atmos", for the purpose of threat recognizing, incident response and may help reverse engineering. This report is the third coverage of online crime toolkit analysis series that we disclose in MalwareMustDie blog, on previous posts we disclosed about

#### [Slide|Video] Kelihos & Peter Severa; the "All Out" version

###### 09 May 2016

Tag: Kelihos, Khelios, P2P, FastFlux, Botnet, CNC, C2, Clickfraud, Traffic Redirection, Spambot, DNS Poison, Botnet as Service, Affiliate, Severa, Peter Severa, Petrushakov, Saever, Saushkin We yanked this page off along with the slides & its video links from public view to support cyber crime investigation to stop the botnet for good. It's a good will from our investigation team and there's

#### MMD-0053-2016 - A bit about ELF/STD IRC Bot: x00's CBack aka xxx.pokemon(.)inc

###### 16 Apr 2016

Latest UPDATE incident of this threat is-->[link] Background I received the report of the host in Google cloud network is serving ELF malware: { "ip": "130.211.127.186", "hostname": "186.127.211.130.bc.googleusercontent.com", "prefix": "130.211.0.0/16", "org": "AS15169 Google Inc.", "city": "Mountain View", "region": "California", "country": "USA", "loc": "37.4192,

#### MMD-0052-2016 - Overview of "SkidDDoS" ELF++ IRC Botnet

###### 07 Feb 2016

Tag: kaiten, ktx, tsunami, STD, stdbot, torlus, Qbot, gayfgt, lizard, lizkebab, antichrist, sinden, sdn, \$dn, bossaline, bossabot, dtool, aidra, lightaidra, zendran, styx, Code, Robert, cod, unixcod, styxcod, irc, ircbot, ddos, elfbot, ddoser, nix, elf, linux, unix. backdoor, syn flood, ack flood, ntp flood, udp flood, dns amp, xmas attack, pan flood, x00, cback, LiGhT, Proxseas, BLJ, KaitenBot,

#### MMD-0051-2016 - Debunking a tiny ELF remote backdoor (shellcode shellshock part 2)

###### 03 Feb 2016

The background In September 2014 during the ShellShock exploitation incidents was in the rush, one of them is the case MMD-0027-2014 of two ELF malware dropped payloads via ShellShock attack, a new malware and a backconnect ELF, with the details can be read in-->[here] Today I found another interesting ELF x86-32 sample that was reported several hours back, the infection vector is also via

#### MMD-0050-2016 - Incident report: ELF Linux/Torte infection (in Wordpress)

###### 12 Jan 2016

The indicator Several hours ago, it was detected a suspicious inbound access on a Wordpress site with the below log: (Thank's for the hard work from Y) It's an unusual traffic coming from the unusual source of ip address: 37.139.47.183|37-139-47-183.clodo.ru.|56534 | 37.139.40.0/21 | PIRIX-INET | RU | comfortel.pro | Comfortel Ltd. 62.76.41.190 |62-76-41-190.clodo.ru. |57010 | 62.76.40.0/21 |

#### MMD-0049-2016 - A case of java trojan (downloader/RCE) for remote minerd hack

###### 09 Jan 2016

Background This is a short post for supporting the takedown purpose. Warning: Sorry, this time there's nothing fancy nor "in-depth analysis" :-) Yet the current hacking & infecting scheme is so bad, so I think it's best for all of us (fellow sysadmins in particular) to know this information for mitigation and hardening purpose. In this case, a bad actor was using java coded malware injected to a

#### MMD-0048-2016 - DDOS.TF = (new) ELF & Win32 DDoS service with ASP + PHP/MySQL MOF webshells

###### 05 Jan 2016

Background Linux exploitation by bad actors from People Republic of China (in short: PRC) is not a new matter. Their attacks are coming everyday and their method is also improving by days. This post is another case of the issue, except it is reporting you some improvement and new source of DoS threat from the same landscape. The unique point of this one is by combining ElasticSearch

#### MMD-0047-2015 - SSHV: SSH bruter ELF botnet malware w/hidden process kernel module

###### 24 Dec 2015

Background Apparently Linux ELF malware is becoming an interesting attraction from several actors from People Republic of China(in short: PRC). This post is one good example about it. It explains also why myself, from my team (MMD), put many effort to study Linux executable malicious scheme came from that region recently, so does our colleges professional researchers in industry started to put

#### MMD-0046-2015 - Kelihos 10 nodes CNC on NJIIX, New Jersey USA, with a known russian crook who rented them

###### 21 Dec 2015

Global variable declaration to read correctly #include int main(void) { char * email = "XXXXX$$censored$$\ data"; } Background Note2: Considering: The attack of Kelihos botnet to my country and several countries is still un-stoppable and on-going, Yet I was told to censored Kelihos investgation on 2013 without getting good follow up from law enforcement in this planet, no

#### MMD-0045-2015 - KDefend: a new ELF threat with a disclaimer

###### 04 Dec 2015

Background It's been a while not writing new analysis in our blog & this timing is just perfect. On December 1st, 2015 this sample was detected by our ELF team member @benkow_ ..and our ELF Team started to investigate the threat and come into conclusion that another new ELF malware was spotted, and post this is the report. It was calling itself "KDefend" or "KDLinux", so we call it as "Linux/