One of the key functions of an ad server is to determine how many ad impressions and clicks have been logged and confirmed. This matters because billing depends on those numbers. A “confirmed” ad is one that we’ve verified as actually seen by a user: the ad was visible on the page and was not blocked by ad-blocking software or anything else on the user’s end.
Counting and confirming impressions
In general, when the ad server receives a request, it has user-related information from the cookie, such as age and gender. Based on the user profile, the ad server returns an ad to the user and logs it, which counts as one impression. The key point is that the ad server has counted an impression, but for several reasons it may not be confirmed. To confirm it, the ad server includes a blank 1×1 GIF file in the ad response, tagged with a unique ID; when that GIF is requested from the user’s browser, the ad server uses the ID to mark the impression as confirmed. Customer billing is based on confirmed impressions only. Clicks are handled similarly: the ad server verifies that the source is registered and valid and is not a robot.
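The count-then-confirm flow described above can be sketched in a few lines of Python. This is only an illustration, not how any particular ad server implements it: the class name, the in-memory dictionary, and the /pixel.gif URL are all assumptions made for the example.

```python
import uuid


class ImpressionLog:
    """In-memory stand-in for an ad server's impression log (hypothetical)."""

    def __init__(self):
        # impression_id -> True once the tracking pixel has been requested
        self.confirmed = {}

    def serve_ad(self, ad_id):
        """Count an impression and return an ad response carrying a pixel URL."""
        impression_id = uuid.uuid4().hex
        self.confirmed[impression_id] = False  # counted, not yet confirmed
        return {
            "ad_id": ad_id,
            "impression_id": impression_id,
            # Assumed URL shape; the unique ID ties the pixel to the impression.
            "pixel_url": f"/pixel.gif?imp={impression_id}",
        }

    def pixel_requested(self, impression_id):
        """Mark the impression confirmed when its 1x1 GIF is fetched."""
        if impression_id not in self.confirmed:
            return False  # unknown ID, nothing to confirm
        self.confirmed[impression_id] = True
        return True

    def counted(self):
        return len(self.confirmed)

    def billable(self):
        """Only confirmed impressions are billed."""
        return sum(self.confirmed.values())
```

With this sketch, serving two ads and receiving one pixel request gives two counted impressions but only one billable impression.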
Delayed impressions
In some scenarios, the ad server sends the initial ad to the user, but the impression is not counted until the ad server receives another request for the asset itself. An example is a video ad with a “skip ad” option. If you skip the ad, it stops playing; otherwise, a request is sent to the ad server to continue the ad.
Delayed impressions are used with:
• Prefetched ads
• Out-of-page ads
• Video ads
• Mobile ads
• Ad Exchange ads
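A delayed impression differs from a regular one in that nothing is counted when the ad is first served. A minimal Python sketch of that idea follows; the class and method names are hypothetical, and a real ad server would persist this state rather than keep it in memory.

```python
class DelayedImpressionCounter:
    """Counts an impression only when the asset itself is requested (sketch)."""

    def __init__(self):
        self.pending = set()   # ads served, asset not yet requested
        self.counted = set()   # impressions that have actually been counted

    def serve_ad(self, impression_id):
        """Initial ad response: remember the ID but do not count anything yet."""
        self.pending.add(impression_id)

    def asset_requested(self, impression_id):
        """Follow-up request for the asset (e.g. the video keeps playing)."""
        if impression_id not in self.pending:
            return False  # unknown or already counted
        self.pending.remove(impression_id)
        self.counted.add(impression_id)
        return True
```

If the user skips a video ad, asset_requested is never called for that ID, so the impression stays pending and is never counted.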
Counting clicks
When an ad is displayed and the user clicks on it, a request is sent to the ad server. When the ad server receives the request, it counts the click and, in parallel, redirects the user to the landing page.
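As a rough illustration, a click handler can be written as a function that records the click and returns an HTTP redirect. The function name, the landing_pages mapping, and the (status, headers) return shape are assumptions for this sketch, and the click is counted just before the redirect is returned rather than truly in parallel.

```python
from collections import Counter

# Hypothetical in-memory click counter keyed by ad ID.
click_counts = Counter()


def handle_click(ad_id, landing_pages):
    """Count a click and redirect to the ad's landing page.

    landing_pages maps ad IDs to landing URLs (assumed structure).
    Returns an (HTTP status, headers) pair.
    """
    landing_url = landing_pages.get(ad_id)
    if landing_url is None:
        return 404, {}  # unknown ad: nothing to count, nowhere to redirect
    click_counts[ad_id] += 1
    return 302, {"Location": landing_url}
```

For example, handle_click("ad1", {"ad1": "https://example.com/landing"}) returns a 302 response pointing at the landing page and increments the count for "ad1".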
Discarded impressions and clicks
Sometimes impressions and clicks are not generated by actual people browsing the web. Ad servers discard such impressions and clicks. Invalid impressions and clicks can come from a variety of sources, including:
• Web crawlers and spiders
• Sources that are not registered and are considered to be robots
Different ad servers can use different logic to discard impressions and clicks. A popular approach is to discard requests coming from unknown sources such as robots. Robots are usually filtered in three ways.
1. The first is based on known user agents, which is straightforward. All entries in the log files where the browser is a robot are considered not confirmed.
2. The second way of filtering robots is based on a known list of robot IPs and hostnames. The list is maintained in a configuration file and updated by the system administrators.
3. The third way of robot filtering involves identifying robots based on behavior by analyzing a sample of the ad logs.
Robots are identified by click activity. An IP/host that has made more than THRESHOLD_TOTAL clicks that day, or more than THRESHOLD_HOURLY clicks in any hour of that day, is considered to be a robot. THRESHOLD_TOTAL and THRESHOLD_HOURLY are configurable.
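The three filters above can be sketched as follows. Everything specific here is an assumption made for illustration: the user-agent markers, the threshold values, and the shape of the click log (one (ip, hour) pair per click for a single day).

```python
from collections import Counter

# Assumed example values; the post only says these are configurable.
THRESHOLD_TOTAL = 100
THRESHOLD_HOURLY = 20

# Assumed substrings that mark a robot user agent.
ROBOT_UA_MARKERS = ("bot", "crawler", "spider")


def is_known_robot(user_agent, ip, robot_ips):
    """Filters 1 and 2: known user agents and a maintained list of robot IPs."""
    ua = user_agent.lower()
    return ip in robot_ips or any(marker in ua for marker in ROBOT_UA_MARKERS)


def find_robots_by_behavior(clicks,
                            threshold_total=THRESHOLD_TOTAL,
                            threshold_hourly=THRESHOLD_HOURLY):
    """Filter 3: flag IPs whose click activity exceeds either threshold.

    clicks is an iterable of (ip, hour) pairs for one day, hour in 0..23.
    """
    total = Counter()
    hourly = Counter()
    for ip, hour in clicks:
        total[ip] += 1
        hourly[(ip, hour)] += 1
    robots = {ip for ip, n in total.items() if n > threshold_total}
    robots |= {ip for (ip, _hour), n in hourly.items() if n > threshold_hourly}
    return robots
```

Note that both comparisons are strict ("more than"), so an IP with exactly THRESHOLD_TOTAL clicks in a day is not flagged by the daily rule.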
This was an overview of ad impression and click counting. If you are interested in knowing more, feel free to send me an email. In another post, I will discuss the technical details of impression and click confirmation and different scenarios related to ad impression and click counting. Stay tuned!