Archive for March, 2010

Magic Behind Ad Targeting Not so Magical

This morning the NY Times posted an article titled “Instant Ads Set the Pace on the Web”, which specifically discussed a new company, AppNexus, and its real-time ad targeting offering. The article struck me because it combined statements that, as far as I know, are only marginally true regarding the current state of ad targeting, raised some privacy issues, and made all of this targeting appear somewhat mystical. As a privacy advocate, I’m always keen to point out when offerings or services tread close to violating our privacy, but by the same token, when the privacy flag is being waved unnecessarily, I feel it’s important to clear the air.

The reason I feel comfortable discussing this is that for two years I ran a company focused on classifying videos to facilitate contextual and interest-based ad targeting. As a result, I was forced to familiarize myself with the various ad networks and their targeting methodologies in order to determine what sorts of data would best help those methodologies work with videos. It was during this time that I learned some surprising things about areas like demographics and behavioral targeting, and about how sites were determining who their visitors were.

Before going there, let me start by saying that there are two ways to gather information about users online: explicitly or implicitly. The explicit side is where a visitor tells a site about themselves. You may have experienced sites that ask you for nothing more than what login information you want to use before providing unfettered access, versus others that ask more prying questions about your sex, age, income, job function, etc., and of course there’s lots in between. Sites that ask for little but still offer advertisers the ability to target their users on a demographic basis do so in one of two ways. Some look at the reader distribution of their physical-world publications (BusinessWeek.com was one that seemed to do this when I checked them out some time back, as were some trade magazine, TV and radio sites) and project that online. This of course tends to be represented in terms of percentages, so in fact they don’t really know that a specific user is male or female, just that a given visitor is, say, 65% likely to be male and 35% likely to be female. Other sites use comScore or Nielsen data, both of which manage panels of users who have agreed to be tracked, to extrapolate the visitor distribution. By the way, that’s how polls work too. On any given poll, only 1,000 to 5,000 people are called, but because enough is known about these people to supposedly represent a good cross-section of America, the numbers are then projected across our entire population. Any site using these methods for demographic targeting doesn’t really know who or what you are; it is just making a best-guess analysis.
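To make the projection idea concrete, here is a minimal Python sketch of how a small panel could be extrapolated to a whole audience. The panel size, the 65/35 split and the visitor count are illustrative assumptions, and the function names are made up for this example; this is not how comScore or Nielsen actually compute their figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PanelEstimate:
    """Audience shares estimated from a panel (hypothetical structure)."""
    male: float
    female: float


def estimate_from_panel(panel):
    """Turn panel members' self-reported genders into audience shares."""
    total = len(panel)
    males = sum(1 for gender in panel if gender == "male")
    return PanelEstimate(male=males / total, female=(total - males) / total)


def project(estimate, visitors):
    """Project the panel shares onto the full visitor count."""
    return {
        "male": round(estimate.male * visitors),
        "female": round(estimate.female * visitors),
    }


# Illustrative numbers only: a 1,000-person panel and 2,000,000 visitors.
panel = ["male"] * 650 + ["female"] * 350
estimate = estimate_from_panel(panel)
projected = project(estimate, 2_000_000)
print(estimate, projected)
```

Note what the output is: a share for the audience as a whole. Nothing in it says whether any individual visitor is male or female, which is the point made above.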

Sites that ask you to fill out a form with more detailed information are the ones most likely to know more about who you are since, like all sites, they will likely have deposited an identifying cookie in your browser. Even if you erase it, the next time you log in to the site they will know as much demographic information as you chose (or were required) to provide, since they have this associated with your identity on their site. This may not be a big issue given that you agreed to enter into a business relationship with that site. Where things get murky is if the site is working with a cookie exchange and is in effect acting as a proxy for obtaining your information, allowing the exchange’s network to track your activities. Of course, when you clear your browser cookies, any ad network or exchange cookies go away.

In reviewing both BlueKai and eXelate (also mentioned in the NYT article), I found that while they do enable age and gender to be tracked, they never seemed to have it for me. It’s actually a useful exercise to go check what these cookie exchanges know about you so you can get a feel for what’s being tracked. eXelate’s Preference Manager, from which you can also opt out, can be found here. BlueKai’s Registry (which also enables opt-out) can be found here. In my case, BlueKai had nothing because I haven’t recently frequented any sites that they work with. They also prefer fresh information, so they only keep your info alive for a limited amount of time. eXelate had nothing for my age and gender, but did have me classified under “Auto Buyers”, “Auto Buyers – New Cars”, “Auto Buyers – SUV”. This is because when I bought my Honda Element almost a year ago, I went to various auto sites to read reviews. In other words, any advertiser who buys cookies for prospective auto buyers and gets mine as part of that batch is getting ripped off, because I haven’t been in the market for a year. eXelate also had me under the categories of “Guys & Gear”, “IT Professionals” and “Sports”. As you can see from the tone of these categories, they are hardly the kind to induce privacy fears. By the way, if you want to see the various ad or cookie networks that are tracking you and opt out of any, you can check out the Network Advertising Initiative site for this info. There are ad networks that do not participate in the Network Advertising Initiative, but the more reputable ones do.

In learning more about behavioral ad networks, which use implicit information about the sites you are visiting, what became clear is that here too visitors’ behaviors were being classified in big swaths, often less accurately than these companies’ marketing campaigns would have you think. For example, targeting networks like AudienceScience bucket people into “specific interests and intent”. Heck, they even have something akin to a periodic table of targeting segments. How a user makes it into a bucket is generally determined at the site level (i.e., XYZ site is a travel site) or by a visit to a section within a site (i.e., the travel section of ABC site). I have visited sites that raised a mild curiosity but were not an indicator of any particular interest of mine.
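Site-level and section-level bucketing of this kind can be sketched roughly as follows in Python. The domains, the segment names and the lookup tables are all hypothetical; I’m not describing AudienceScience’s actual rules, only the general shape of the approach.

```python
from urllib.parse import urlparse

# Hypothetical rules: a whole site maps to a segment...
SITE_SEGMENTS = {"xyz-travel.example": "Travel"}

# ...or only one section of a broader site does.
SECTION_SEGMENTS = {
    ("abc-news.example", "travel"): "Travel",
    ("abc-news.example", "autos"): "Auto Buyers",
}


def segments_for_visit(url):
    """Return the segments a single page view would add to a profile."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    segments = set()
    if host in SITE_SEGMENTS:
        segments.add(SITE_SEGMENTS[host])
    # The first path component stands in for the site section.
    parts = [part for part in parsed.path.split("/") if part]
    if parts and (host, parts[0]) in SECTION_SEGMENTS:
        segments.add(SECTION_SEGMENTS[(host, parts[0])])
    return segments


def bucket_user(visits):
    """Union the segments from every visit into one coarse profile."""
    buckets = set()
    for url in visits:
        buckets |= segments_for_visit(url)
    return buckets


buckets = bucket_user([
    "https://abc-news.example/autos/suv-review",
    "https://xyz-travel.example/",
])
print(buckets)
```

A single curious click is enough to land in a bucket under rules like these, which is why the resulting segments say little about real intent.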

As of today, I don’t know of any ad networks that track a person’s actual demographics (as opposed to statistically derived ones). Oh yeah, and one other thing. In order to be able to do interest-based or behavioral targeting, all of the sites visited have to be part of the same network. For example, eXelate knew about my auto buying interest, but clearly the travel sites I have visited were not part of their network, so they have no way of knowing that information about me. In other words, our personae are very fragmented across all of these ad and exchange networks and, with perhaps the exception of Google, no one has a very full picture of my activities. Google’s Privacy Center provides access to your ads preferences, where you can see what they know about you for ad targeting purposes. While I have opted out of this service, the last time I looked at it, only half of the info was actually accurate or relevant to me. Google has other ways of getting implicit targeting information, but as we’re seeing, there are limits to that and its accuracy.
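The fragmentation is easy to see in a small Python sketch. The network names and member sites below are invented for illustration; real network membership is not something I have a list of.

```python
# Hypothetical networks and the sites that carry their tracking cookies.
NETWORK_MEMBER_SITES = {
    "network_a": {"autos.example", "gear.example"},
    "network_b": {"travel.example"},
}


def profile_per_network(visited_sites):
    """Each network only sees the visits made to its own member sites."""
    visited = set(visited_sites)
    return {
        network: sorted(members & visited)
        for network, members in NETWORK_MEMBER_SITES.items()
    }


history = ["autos.example", "travel.example", "news.example"]
profiles = profile_per_network(history)
print(profiles)
```

Each network ends up with a partial view, and a site that belongs to neither network (news.example here) is invisible to both.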

All this to say that, for now at least, I’m not overly worried about these networks or their ability to supposedly target me in milliseconds, nor do I expect the quality of the ads to be that much better as a result. However, I’m sure advertisers will pay more if they feel they can get some sort of edge.