- Posted by Sergei Frankoff 20 May
- 0 Comments
Bedep Ad-Fraud Botnet Analysis – Exposing the Mechanics Behind 153.6M Defrauded Ad Impressions A Day
Following on from our post on Angler EK we are going to expose the mechanics behind the Bedep ad-fraud malware. Recently Bedep has been observed as the payload dropped by the Anger EK in a series of malvertising campaigns. These campaigns have lead to a rapid rise in the rate of Bedep infections, with Arbour Networks observing just above 80K infections over a 3-day period.
Bedep has the ability to load multiple custom modules after infecting a host. In this briefing, we will examine Bedep’s ad-fraud module and provide insight into how traffic from this bot is laundered into the advertising ecosystem and used to defraud advertisers.
Bedep’s ad-fraud module is fairly sophisticated but it’s not as advanced as some of the ad-fraud bots we have analyzed, for example Kovter. It has features to circumvent current ad-fraud detection capabilities such as user behaviour emulation and referrer spoofing. With these features, the bot is successful in defrauding advertisers on various sites, including ads from some large advertisers such as American Express, British Airways, BMO, and Ford on reputable exchanges such as Google’s doubleclick.
Bedep Traffic in The Advertising Ecosystem
While the topic of how bot traffic is introduced into the advertising ecosystem is complex, we are going to use Bedep as an example to illustrate one of the more common ways that bot traffic is laundered into the ecosystem before it is used to defraud advertisers.
Modern ad-fraud bots don’t directly click on ads as many may expect. Instead they use low quality PPC exchanges to route their traffic to publishers who pay these exchanges for traffic. Once the bot lands on the publisher’s site, it automatically defrauds all the ad impressions on the site (and in some cases the video ads as well). The publisher has little incentive to block this traffic as they are still being paid for the impressions, and likewise the low quality PPC exchange has little incentive not to pay the bot masters as the publishers are paying them. We want to be clear; neither the publishers nor the low quality PPC exchange may be knowingly supporting the bot traffic, however, as long as they are still being paid there is little incentive for them to investigate where the traffic originates.
Bedep uses multiple PPC exchanges backed by thousands of publishers to launder its traffic. We have simply chosen one example to illustrate the full laundering chain.
We start with the PPC URL that is sent from the ad-fraud module’s command and control server (C2). Here we see it redirects to the domain c.feed-xml.com which is a PPC feed for the VertaMedia PPC exchange.
Next the VirtaMedia exchange redirects the bot another PPC exchange this time hosted by eZanga. We see something interesting here, the Virta Media exchange has added the following search parameters to the eZanaga request “heavy+truck+insurance+home+carpet+cleaners”.
Finally eZanga redirects the bot to a publisher’s site virtustyle.com.
Once the Bedep Ad-Fraud bot is finally redirected to the publisher’s site virtustyle.com, it loads the site and defrauds all of the impression ads. Some of the exchanges that are serving ads on virtustyle.com are:
- Conversant Media
- Vibrant Media
In our lab we observed a single Bedep bot load, on average, one website a minute. This is due to the multiple threads that the Bedep operates, each running a seperate browser instance. If we use the botnet size observed by Arbor as 80K bots and assume that each infected machine is only operational for 8 hours a day, it means the Bedep botnet is visiting 38.4M websites each day. To better estimate the impact of this fraud we can take a conservative estimate of 4 ads per page which gives us a minimum of 153.6M defrauded ad impressions a day. In our lab we observed Bedep loading multiple websites that used ad-stuffing to load hundreds of ads at a time. As a result we would expect the real number of defrauded ad impressions to be a multiple larger than our 153.6M/day estimation.
The Bedep Bot
The loader portion of Bedep is the initial malware that is installed after a host is infected. The loader is responsible for setting up persistence on the infected host and downloading additional malware modules to monetize the infected host. The loader’s downloaded payloads come in both 32bit and 64bit versions depending on the operating system of the victim.
Like most current malware, the loader code has been designed with some light anti-analysis tricks, the most frustrating of which is the fact that all strings are encrypted. We have released a string decryption tool on our github to assist with the decryption of these strings. Once the strings have been decrypted the logic of the loader is fairly straight forward; it installs a persistence mechanism then calls out to its command and control server (C2) to download and run the monetization module (in this case an ad-fraud module).
When the loader is run it creates a hidden folder with a UUID the %PROGRAMDATA%. It then copies itself to the folder as a DLL using a benign sounding name, in this case FntCache.dll.
For persistence, the loader creates a special Shell Extension for the current user that loads this DLL.
The server to run the malicious loader DLL is located in the registry here:
The Shell Extension is registered to the current user here:
Loader C2 Communication
The Bedep loader uses a domain generation algorithm to dynamically generate the domain it is going to connect to. The folks over at Arbor have provided an excellent article explaining how the DGA works and they even provided a tool to re-generate the domains. Instead of posting the same information here we suggest you head over and read their report.
The C2 traffic itself is composed of a series of POST messages that contain encrypted messages.
After analyzing the algorithm used to encrypt the message we determined that it was a custom implementation of AES that used a 16 byte key (AES128). The key is encrypted and hard coded in the binary. The developers had also cleverly encrypted the AES s-box in an attempt to make analysis of the algorithm more difficult. The IV used in the AES encryption is randomly generated for each request and prepended to the cypher text after encryption. The IV plus cypher text is then base64 encoded and sent as the message in the POST.
We have released a tool that can be used to decrypt Bedep traffic on our Github.
The actual messages that are sent from the Bedep loader to the C2 are formatted JSON while messages that are returned by the C2 are delimitated by the ‘#’ character and are specific to the request sent by the bot. The command set used in the communication consists of the following.
|protocolVersion||The version of the communication protocol. This is hard coded in the bot. This parameter must always be present in the request.|
|rev||Bot revision. This is hardcoded in the bot. This parameter must always be present in the request.|
|buildId||Build ID for the bot. This is hardcoded in the bot. This parameter must always be present in the request.|
|botId||This is a unique ID that is assigned to the bot by the C2 on first communication. This parameter must be present in all requests after the initial communication.|
|tags||Tags is an array that contains the commands.|
|ping||This is the first command that is sent by the bot to establish communication with the C2. The C2 responds with ID assigned to the bot.|
|zoo||This command is used to download a set of DLLs. Currently we have not investigated the purpose of these DLLs.|
|update||This command is used to download and run the Bedep payload. In this case the ad-fraud module.|
|stat||This command is used to upload statistics on a specific job or command that the bot is executing.|
We recently noticed a slightly new variation on the traffic format for Bedep where the encrypted message is split over a series of parameters in the POST message. These new requests also have URL paths generated from the following strings (thanks to Moritz Kroll for posting the full list to VirusTotal).
functions_picturecomment.php functions_online.php functions_notice.php functions_newpost.php functions_misc.php functions_log_error.php functions_login.php functions_legacy.php functions_infractions.php functions_forumlist.php functions_forumdisplay.php functions_filesystemxml.php functions_file.php functions_faq.php functions_facebook.php functions_external.php functions_editor.php functions_digest.php functions_databuild.php functions_cms_layout.php functions_calendar.php functions_bigthree.php functions_banning.php functions_attach.php functions_album.php functions_ad.php functions.phÿ database_error_page.html database_error_message_ajax.html database_error_message.html class_dm_groupmessage.php class_dm_forum.php class_dm_event.php class_dm_discussion.php class_dm_deletionlog_blog.php class_dm_deletionlog.php class_dm_cms_widget.php class_dm_cms_layout.php class_dm_blog_trackback.php class_dm_blog_rate.php class_dm_blog_custom_block.php class_dm_blog_category.php class_dm_blog.php class_dm_attachment.php class_dm_album.php class_dm.php class_dbalter.php class_datastore.php class_database_slave.php class_database_explain.php class_core.php class_bootstrap_framework.php class_bootstrap.php class_blog_response.php class_blog_entry.php class_block.php class_bitfield_builder.php class_bbcode_blog.php class_bbcode_alt.php class_bbcode.php class_apiclient.php class_akismet.php class_ajax_output.php blog_init.php blog_functions_shared.php blog_functions_search.php blog_functions_post.php blog_functions_online.php blog_functions_main.php blog_functions_category.php blog_functions.php xmlsitemap.php widget.php showthread.php showpost.php sendmessage.php search.php report.php register.php profile.php postings.php posthistory.php poll.php online.php newthread.php newreply.php misc.php memberlist.php member.php login.php list.php infraction.php groupsubscription.php group.php global.php forumdisplay.php css.php converse.php content.php blog_post.php blog_attachment.php blog_ajax.php attachment.php assetmanage.php announcement.php clear.php asset.php ajax.php album.php calendar.php blog.php forum.php index.php include
To use our decryption tool with this new format of traffic you simply need to extract all the values from the parameters passed in the message body and concatenate them into a single string.
The values from the parameters in the request shown above have been extracted and combined into the base64 string shown below. Once extracted and combined in this fashion the string can be decrypted using the tool we released.
Bedep Ad-Fraud Module
The Bedep Ad-Fraud module that is downloaded via the “update” comes in both 32bit and 64bit versions. This module does not persist on the host and is downloaded when the Bedep loader is started after a reboot. The Bedep loader uses process hallowing to inject the Ad-Fraud module into benign processes.
During operation, the Ad-Fraud module receives URLs of sites to browse from its command and control (C2) server. It then opens a hidden instance of an embedded Internet Explorer browser and programatically directs the browser to load the URLs it has received. We will explain this process in detail below.
Hidden Browser Setup
In order to hide its activity from users of the infected host the module creates a virtual desktop that is uses to house all of its windows. More information on virtual desktops and a tool used to display them can be found in this blog post. In this case the module creates a desktop called Default IME and then moves its process to this desktop.
In addition to using a virtual desktop to hide the presence of its browser windows the Ad-Fraud module also sets a number of in-line hooks in its process space to suppress events that might notify a user that there is a hidden browser running on their host.
The PlaySound and waveOutOpen hooks simply null out the functions and serve to suppress any sounds that might be generated by the Ad-Fraud module’s browsing, such as the audio from videos. The SetFocus and DialogBox hooks also null out the functions but they serve to suppress any visual queues that might indicate the presence of a hidden browser, such as pop-ups.
In parallel with the hooks the module also sets a number of registry keys to configure the behaviour of the embedded Internet Explorer browsers that it controls. This configuration is both an attempt to suppress events that might notify a user of the operation of the browsers and also an attempt to make the embed Internet Explorer browser behave similar to a real Internet Explorer browsers. When the Internet Explorer browser is embedded and controlled programatically some of its features differ from those of a real Internet Explorer browser. A subset of these features can be detected by anti-fraud solutions as a way to detect bot traffic. To circumvent these detection methods the Bedep Ad-Fraud module sets these registry keys to ensure its embedded browsers have the same features as a real Internet Explorer browser.
Ad-Fraud C2 Communication
One of the more peculiar aspects of the Bedep Ad-Fraud module is that its C2 communications are not encrypted. Given the level of sophistication in the Bedep loader C2 encryption it is surprising to see that the Ad-Fraud C2 traffic is sent in clear text.
As shown above the C2 sends a URL to load along with some attributes for the ad-fraud module to spoof in its embedded Internet Explorer browser.
|Spoof User Agent||Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)|
Click URL Navigation
Before navigating to the click URL the Ad-Fraud module creates another set of hooks to spoof the Language/Location and other attributes.
Once the click URL has been loaded in the embedded browser the Ad-Fraud module uses some simple browsing behaviour emulation techniques in an attempt to trick any fraud detection service. An example of this behaviour is some random mouse movement and left clicking.
While not the most advanced ad-fraud bot available Bedep represents a new type of advance ad-fraud threat that is actively circumventing traditional ad-fraud detection and defrauding high quality impressions from well established exchanges.
While we are actively working with our partners to identify and eliminate Bedep ad-fraud traffic from the advertising ecosystem, we need help from security vendors and incident responders to detect and remove this threat from endpoints. Our hope is that our report and accompanying tools will be of assistance.
We have intentionally removed some of our analysis from this report because it;
- may have given the malware developers insight into how we are able to detect this bot and/or,
- we believe it was not necessary information from an incident response perspective.
That being said, if you have any questions about our analysis or would like to know more feel free to contact us. We are happy to share more information privately.
Prior and Parallel Research
Kafeine has an excellent overview of Angler and Bedep. Also contains a note on Bedep’s persistence mechanism.
Arbor have an excellent analysis of the Bedep DGA and they also released a tool to replicate it.
Spider labs have a report detailing some of the ad-fraud traffic they observed.
Malwarebytes have a post explaining how Bedep is tied to recent malvertising attacks.
Zscaler also have a post covering Bedep related malvertising attacks.
Reference MD5 Hashes
Click Module: 2faf2044e18837d23aa325cb21f17c4b
Loader DLL (persistence): 46df78cf0eea2915422d84928dbc2462
Loader DLL (from Angler EK): 854646bdcf4da69c975dd627f5635037