SEM Development

December 14, 2007

AdWords API, Reports, and Python (The Decent, the Ugly and the Good, respectively)

Filed under: Google AdWords, Python — Bob @ 8:28 am

I’ve always been annoyed with the way the ReportService in the AdWords API works with most interpreted languages. Yes, you could say it’s a problem with SOAP (and it is), but I don’t buy that. Google engineers use Python enough that you’d think they would be more sensitive to this problem. The sample code given for reports is … disgusting. If you decide it’s too terrible to look at directly, let me fill you in. They build the XML manually in a string and then do something akin to an sprintf to fill in values. I know it’s just an example, but come on, that’s terrible.

I decided I couldn’t live like this and went about finding a better way. A few quick explanations:

  • There may be a way to do this better in ZSI. I haven’t tried; I’m using SOAPpy.
  • I’ll be using my AdWords client, but you can use any one you want. It’s just the conversion of the report data structure into a SOAPpy type that matters.

Alright, let’s get started. First I’ll set up the client:

import adwords.client as client
 
aw = client.AdWordsClient(
  email='...',
  password='...',
  client_email='...',
  developer_token='...',
  application_token='...',
  user_agent='...'
  )

Now to build the report data structure. Of course, this is just an example so any of the data in this report could be different:

reportJob = {'selectedReportType': 'Account'}
reportJob['aggregationTypes'] = ['Daily']
reportJob['startDay'] = '2007-10-31'
reportJob['endDay'] = '2007-11-05'
reportJob['selectedColumns'] = [
  'CustomerName',
  'AdWordsType',
  'CPC',
  'CPM',
  'Clicks'
]

I’ll now build a SOAPpy representation of the data. With most API calls you would be able to just plop that data structure into the call and SOAPpy would take care of it. With reports, however, Google makes a small statement at the top of the DefinedReportJob page:

DefinedReportJob   - V11 (AdWords API Developer's Guide)

To do this, you need to create a SOAPpy type and then set some attributes on that type. Here’s how I do it:

import SOAPpy

# Wrap the plain dict so SOAPpy lets us attach XML attributes to it
reportJob = SOAPpy.Types.structType(reportJob)
reportJob._setAttr('xmlns:impl', 'https://adwords.google.com/api/adwords/v11')
reportJob._setAttr('xsi:type', 'impl:DefinedReportJob')

As you can see, SOAPpy offers up some types. The structType is the best for dict objects, but if you run into other typing problems it supplies array types, string types, numerical types, etc. Now the only thing left to do is schedule the job:

jobId = int(aw.scheduleReportJob(reportJob))

It’s actually better practice to first validate your job with the new validateReportJob function in v11 of the API, but you get the picture.

There’s actually one additional problem I ran into when checking the status of a job. The Java service expects a long variable but if I cast it to long in Python, the API returns an error because it’s passed as a BigInt. I have no idea why it does this, but to get around it, I used some of SOAPpy’s types again:

status = aw.getReportJobStatus(SOAPpy.Types.longType(jobId))

Make sure your jobId variable is an int or SOAPpy will throw an exception.
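Once a job is scheduled, something has to wait for it to finish. Here is a minimal polling sketch. The helper name wait_for_report, the poll interval, and the status strings 'Completed' and 'Failed' are assumptions for illustration, so check them against the v11 documentation before relying on them:

```python
import time


def wait_for_report(get_status, job_id, poll_seconds=30, max_polls=20,
                    sleep=time.sleep):
    """Poll get_status(job_id) until the job finishes or we give up.

    get_status is any callable, for example a thin wrapper around
    aw.getReportJobStatus. The status strings below are assumed values.
    """
    for _ in range(max_polls):
        status = str(get_status(job_id))
        if status == 'Completed':
            return True
        if status == 'Failed':
            return False
        # Anything else is treated as "still running"
        sleep(poll_seconds)
    raise RuntimeError('report job %d still not finished' % job_id)
```

With the client from above this could be called as wait_for_report(lambda j: aw.getReportJobStatus(SOAPpy.Types.longType(j)), jobId). Passing the status call in as a function keeps the helper independent of any particular client.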

So, I was able to get reports up and running in Python and I didn’t have to resort to archaic search and replace functionality in my code. I’ll probably be adding this to my AdWords client soon so I don’t have to think about it every time I create a report.

November 15, 2007

Extending the Python AdWords API client to understand “unbounded”

Filed under: Google AdWords, Programming, Python, pyadwords-client — Bob @ 4:48 pm

I’ve been adding some functionality to the sample AdWords API client. After using it for a couple of minutes I found that it does not understand when a list should be returned. In other words, if I call an API method that should return a list, getAdGroupList for example, and only one ad group is returned, it is returned as a single instance, not a list of length one. So now instead of:

for adgroup in service.getAdGroupList(listofAdGroupIds):
  print adgroup.name

I have to do some duck typing on each return value:

adgroups = service.getAdGroupList(listofAdGroupIds)
if not hasattr(adgroups, 'sort'):
  adgroups = [adgroups]
for adgroup in adgroups:
  print adgroup.name

It’s more than doubled the lines of code needed just to call a simple API method. Now, I could just overload the service to turn all returned values into a list. The problem, however, is that some methods aren’t supposed to return a list, updateAdGroup, for example. To make the client understand when a list is needed I have to deconstruct the WSDL. I’ve already been loading the WSDLs to extract API method names so I just need to find where the return value is defined. For the AdWords API, it’s listed pretty far down. Here’s a look at the WSDL where it defines the getAdGroupList response element:

<element name="getAdGroupListResponse">
  <complexType>
    <sequence>
      <element name="getAdGroupListReturn" maxOccurs="unbounded" type="impl:AdGroup"/>
    </sequence>
  </complexType>
</element>

In the getAdGroupListReturn element you’ll see the maxOccurs value is unbounded. This means the return value should be a list. I’m not sure why, but the SOAPpy module doesn’t seem to understand this. To fix the behavior, I first had to find out where SOAPpy was putting the *methodName*Return element in the data structure. This was pretty much trial and error with a lot of dir() commands until I found the culprit. I’m not going to post the convoluted code here but, if you’re interested, you can look through it yourself. The method name is getPluralMethods.
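To make the idea concrete without SOAPpy, here is a rough sketch that finds the same information by parsing the schema directly with the standard library. This is not the getPluralMethods code from the client. The function name plural_methods and the two-element SAMPLE fragment are made up for illustration:

```python
import xml.etree.ElementTree as ET

XSD = '{http://www.w3.org/2001/XMLSchema}'

# Hypothetical, trimmed-down schema fragment in the style of the WSDL above
SAMPLE = """
<schema xmlns="http://www.w3.org/2001/XMLSchema">
  <element name="getAdGroupListResponse">
    <complexType><sequence>
      <element name="getAdGroupListReturn" maxOccurs="unbounded"
               type="impl:AdGroup"/>
    </sequence></complexType>
  </element>
  <element name="updateAdGroupResponse">
    <complexType><sequence/></complexType>
  </element>
</schema>
"""


def plural_methods(schema_xml):
    """Return the method names whose response holds an unbounded element."""
    root = ET.fromstring(schema_xml)
    plurals = set()
    for response in root.iter(XSD + 'element'):
        name = response.get('name', '')
        if not name.endswith('Response'):
            continue
        # Look at the elements nested inside this *Response element
        for child in response.iter(XSD + 'element'):
            if child is not response and child.get('maxOccurs') == 'unbounded':
                plurals.add(name[:-len('Response')])
    return plurals
```

Run against SAMPLE, only getAdGroupList is reported as plural, which is the set of names the wrapper in the next step needs.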

Ok, so I’ve gotten a list of method names that need to return a list. What now? Well, I created a method named expectsList that takes another method as an argument:

def expectsList(self, fn):
  """Decorator that guarantees that the
  return value of a function is a list
 
  Args:
    fn: function
  Returns:
    function
  """
  def returnList(*args, **kwargs):
    out = fn(*args, **kwargs)
 
    #Quack, quack: duck typing
    if not hasattr(out, 'reverse'):
      if not hasattr(out, 'id'):
        #Empty return? Return an empty list
        return []
      #Single return element? Return it in a list
      return [out]
    #Otherwise, it must already be a list
    return out
  return returnList

In the comments, I call this a decorator although it doesn’t technically follow the decorator syntax in the next bit of code where I wrap it around plural API methods:

plurals = self.getPluralMethods(wsdl)
for meth in wsdl.methods.keys():
  methFn = getattr(service, meth)
  if meth in plurals:
    methFn = self.expectsList(methFn)
  setattr(self, meth, methFn)

So, as you can see I’m wrapping the methods that expect a list in the expectsList function. This will make sure that all data returned from these methods is in the correct format and it’s been working so far.
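For comparison, the same wrapper written as a plain function can be applied with the @ decorator syntax. This is a sketch using stand-in functions instead of real API calls. FakeAdGroup and the three get_* functions are made up, and the empty-string return for "nothing found" is an assumption about what an empty response looks like:

```python
import functools


def expects_list(fn):
    """Wrap fn so that its return value is always a list."""
    @functools.wraps(fn)
    def return_list(*args, **kwargs):
        out = fn(*args, **kwargs)
        if hasattr(out, 'reverse'):   # already list-like
            return out
        if not hasattr(out, 'id'):    # nothing came back
            return []
        return [out]                  # a single record
    return return_list


class FakeAdGroup(object):
    """Stand-in for a record returned by the API."""
    def __init__(self, id, name):
        self.id = id
        self.name = name


@expects_list
def get_one():
    return FakeAdGroup(1, 'shoes')


@expects_list
def get_none():
    return ''


@expects_list
def get_many():
    return [FakeAdGroup(1, 'shoes'), FakeAdGroup(2, 'boots')]
```

All three functions can now be looped over the same way, whether zero, one, or several records came back.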

November 12, 2007

Extending the python AdWords client sample code

Filed under: Google AdWords, Programming, Python, pyadwords-client — Bob @ 5:12 pm

I’ve recently started using Python and I’m really starting to love it. Now, with my reintroduction to SEM, I’ve had to figure out whether I should port my old Perl clients over to Python or just see if there’s already something written in Python. For AdWords, there’s a whole site of code samples in Python. One of these samples is actually a small Python client for AdWords. I loaded it up and started playing around with it. It’s a good starting point but, as usual, I want more.

First, and this is pretty much a port of some Perl code I had, I wanted to be able to access all the API methods from one object. I mean ALL of them. I don’t want to load each service separately. I want to be able to access the getAllAdWordsCampaigns method from the same object I access getAllAdGroups. I know, I know, “What about method name collisions?” or “That’s not a strict interpretation of OO design.” is what I’m hearing. And both are valid concerns, however here’s what I think:

Method name collisions: I haven’t seen any evidence that Google will start overlapping method names in different services. If they had that intention they would have used getAll as a method name in different services, not getAllAdWordsCampaigns, getAllAdGroups, getAllCriteria, and so on and so forth. If, for some reason, they decide to switch things up, I’ll have to rethink, but I’m willing to bet they won’t.

OO Design: This is easy. I don’t care. I don’t like SOAP. I see it as a play by Sun and MS to create such a complicated protocol that people are forced to use their language to implement it completely and correctly. I’ll stick with the view that the API is one big object when I’m replicating between my local db and the remote API. I’ll employ OO techniques when I’m using my local DB as a model layer.

So, I’ve created a small python AdWords API client in Google Code. I’ll be writing about it now and again. Here are some of the first changes I’ve made:

All API methods are loaded as actual methods in the API objects. By that I mean you can just do:

awclient = AdWordsClient(**loginParams)
campaigns = awclient.getAllAdWordsCampaigns(1)

Yup, you don’t need to get any services or call any wrapper “call” methods. Just call the method directly from the client object. I was able to do that by loading all the WSDLs and extracting the method names. I use the setattr command to add the API call as an actual method. Also, the WSDLs are cached so you don’t have to keep grabbing them remotely.
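The setattr trick can be shown without any SOAP at all. Below is a stripped-down sketch, not the client's actual code. The fake service classes and FlatClient are invented for the example, and the collision check is an extra guard for the name-collision worry discussed above:

```python
class FakeCampaignService(object):
    """Stand-in for a SOAP service proxy."""
    def getAllAdWordsCampaigns(self, dummy):
        return ['campaign-1', 'campaign-2']


class FakeAdGroupService(object):
    """Stand-in for a second SOAP service proxy."""
    def getAllAdGroups(self, campaign_id):
        return ['adgroup-%d-a' % campaign_id]


class FlatClient(object):
    """Expose the methods of several service objects on one object."""

    def __init__(self, services):
        # services: list of (service object, method names from its WSDL)
        for service, method_names in services:
            for name in method_names:
                if hasattr(self, name):
                    raise ValueError('method name collision: %s' % name)
                setattr(self, name, getattr(service, name))


client = FlatClient([
    (FakeCampaignService(), ['getAllAdWordsCampaigns']),
    (FakeAdGroupService(), ['getAllAdGroups']),
])
```

After this, client.getAllAdWordsCampaigns(1) and client.getAllAdGroups(3) both work on the same object, and a duplicate method name fails loudly at construction time instead of silently shadowing another service.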

This project is in its infancy and really just tailored to what I want right now. Of course, I’m always open to suggestions. Again, the code is at http://pyadwords-client.googlecode.com. It’s under the “Source” tab.

May 23, 2006

Google AdWords releases Version 4 of its API

Filed under: Google AdWords — Bob @ 3:00 pm

Some highlights of the recent version upgrade of Google AdWords:

Local Time Zone Support
This is nice.  Now I don’t have to figure out the timezone myself.  I think it makes my code a little more portable since it’s one less thing I have to configure.  I’m not sure if it was part of this release but it looks like the ReportService uses endDay (format YYYY-MM-DD) instead of requiring the time as well in endDate (format YYYY-MM-DDTHH:MM:SS).  Now I don’t have to specify that I want the report from midnight to midnight.

Traffic Estimator

Another change in this service.  I think it highlights how hard forecasting is.  They replaced avgPosition with lowerAvgPosition and upperAvgPosition.  Same goes for the clicksPerDay and cpc estimates.  Next step … changing the values to include insideTheBallpark and wayTheHellOff.

Zero Impression Reporting
This is fantastic.  Now I can see if Google has changed the status of any of my low traffic keywords without grabbing them all with getKeywordList.  A nice savings on my quota.  Unfortunately, it’s not active yet.

Unique Request ID
This may be good for internal auditing but not much else.  I doubt Google is going to use it as a reference number for tech support.

April 25, 2006

AdWords goes to a pay-per-use API system

Filed under: Google AdWords, Optimization, Quota — Bob @ 8:50 am

The AdWords announcement says the change will happen on July 1st, 2006 and will include a $0.25/1000 units charge.  There’s been plenty of speculation about whether this was going to happen but now we finally have an answer.

This was certainly the best thing for Google.  They gave four reasons for the change:

  • Flexibility and scalability
  • Commercialization
  • Standardization
  • Efficiency incentive

I have to say flexibility is the biggest win for me.  My quota has run out before and led to some major headaches.  I’ve run a few numbers and the quota charge will be around 0.5% to 1% of the total revenue for a campaign.  This is negligible but I’d still like to see it fall.  This could be accomplished two ways:

  1. Work harder at reducing quota units used
  2. Work harder on increasing revenue

I think everyone would be better served by me working on #2.
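The arithmetic behind that estimate is simple enough to sketch. Only the $0.25 per 1,000 units rate comes from the announcement. The function names and the usage and revenue figures in the note below are made up for illustration:

```python
def api_cost_dollars(units, dollars_per_thousand=0.25):
    """Cost of a number of quota units at the announced rate."""
    return units / 1000.0 * dollars_per_thousand


def cost_share(units, revenue_dollars):
    """API cost as a fraction of campaign revenue."""
    return api_cost_dollars(units) / revenue_dollars
```

For example, a campaign that burns a hypothetical 100,000 units against $5,000 of revenue pays $25 for the API, which is 0.5% of revenue.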

April 13, 2006

AdWords pumping up minimum bid for new keywords

Filed under: Google AdWords, annoyances — Bob @ 10:05 am

For the past two months I’ve seen that when I add new keywords to Google AdWords they will start with a min bid of $5. This, of course, disables all of the newly added keywords and, since it would be daft to increase the bid to $5, they stay disabled. This ridiculously high min bid usually drops after about two days.

I’m not the only one to have noticed this Google AdWords trickery. I’ve always assumed that it was a sneaky way to introduce an editorial review process. By jacking up prices, Google makes it nearly impossible to traffic these keywords, which gives them time to review the content of the destination page or creative. I’ve seen some other reasonable explanations for this phenomenon too. The AdWords API group also has a thread discussing this problem, but I disagree that the min bid drops after adding creative. It still stays at $5 for around two days.

Whatever the reason, I wish Google would be a bit clearer about this policy.

January 25, 2006

destinationUrl in KeywordService and CriteriaService behaves oddly

Filed under: Google AdWords, annoyances — Bob @ 6:44 pm

Maybe I’m the only one that sees it this way but I think you should be able to omit the destinationUrl when updating a Keyword or Criteria record. Currently, it is required, and thus omitting it results in a <destinationUrl xsi:nil="true"> being sent. This means your current destinationUrl is erased in favor of the ad group’s default URL. I just think that’s annoying. If I wanted to erase the destinationUrl I’d send a blank or NULL field. If I just don’t want to waste bandwidth by including it in the call, I’d like to be able to omit it. Apparently, that isn’t allowed.

December 2, 2005

Changes to the TrafficEstimatorService

Filed under: Google AdWords, annoyances — Bob @ 6:08 pm

Google’s TrafficEstimatorService has been a joke for a while now. In theory, it’s a great idea. Put in a keyword, get back the estimated impressions, clicks, and even an estimated average rank. The problem lies in the way Google ranks ads. Here’s an explanation right from the horse’s mouth:

Your keyword-targeted ad is ranked on search results and content pages based on its maximum cost-per-click (CPC) - or maximum cost-per-impression (CPM) for site-targeted ads - and Quality Score. Having relevant ad text, a high CPC (or for site-targeted ads, a high CPM), and a strong CTR will result in a higher position for your ad. Because this ranking system uses well-targeted, relevant ads to help determine your ad’s position, your ad can’t be locked out of the top position based solely on price.

Since your rank, and the number of times your ad is shown, is based on your ad text there’s a large area of doubt when Google tries to estimate the number of impressions you’re going to get with a completely new keyword. This and the lackluster display of accuracy so far, I’m assuming, is why they’ve changed it. Here’s the lowdown (or if you prefer to read the whole thread):

Gone:

  • impressions - The estimated number of impressions for a given keyword
  • ctr - The estimated click-through-rate for a given keyword
  • notShownPerDay - The estimated number of times that the ad would not be shown, despite a keyword match

Added:

  • clicksPerDay - The estimated number of clicks generated per day for a keyword in a given ad group

I have to say, if this makes the system more accurate, I like this change. It keeps the main reason I use the estimation service intact, that is, to forecast the amount of money it’ll take to traffic a new keyword (or a few million). What it doesn’t do is try to estimate the relevancy of a keyword before it’s added to the campaign. That’s fine with me. I like to determine the relevance of a keyword by the amount of money it’s profiting, not by its CTR.

I’ll get back to you on whether the new system is actually more accurate …

November 29, 2005

addKeyword v. updateKeyword

Filed under: Google AdWords, Optimization, Quota — Bob @ 4:41 pm

According to Google’s quota allowances, addKeyword consumes 50 quota units while updateKeyword only consumes 10 units. On many underperforming keywords I’ve begun to lower the maxCpc value to 10,000 micros ($0.01, i.e. one cent) instead of altering the status to “Deleted”. In most cases this will cause Google to deactivate the keyword. If the keyword’s minCpc is low enough that even a one-cent maxCpc doesn’t deactivate it and it’s still losing money, it’s unlikely it will ever turn a profit. In this boundary case, deleting it may not be a bad idea.
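Bid amounts in the API are integers in micros, millionths of the account currency unit, so a pair of small conversion helpers avoids counting zeros by hand. The helper names here are made up:

```python
MICROS_PER_UNIT = 1000000


def to_micros(amount):
    """Convert a currency amount (e.g. dollars) to API micros."""
    return int(round(amount * MICROS_PER_UNIT))


def from_micros(micros):
    """Convert API micros back to a currency amount."""
    return micros / float(MICROS_PER_UNIT)
```

So to_micros(0.01) gives the 10,000 used as the rock-bottom maxCpc above.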

At some future date, if research determines that these deactivated keywords will be profitable at a higher maxCpc you can activate them for 1/5th of the quota that you would expend if you had deleted them.

A quick note: Even when a keyword is marked “Inactive” it will be active in the contextual search. To avoid this scenario, turn off contextual search.

Another note: Using the aggregate functions updateKeywordList or addKeywordList will not save your quota. Both use the same number of quota units as their singular brethren.
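To put numbers on the savings, here is a small sketch. The 50 and 10 unit costs are the ones quoted above. The function name and the keyword count in the note below are made up, and the per-keyword multiplication reflects the point that the list calls cost the same as their singular versions:

```python
QUOTA_UNITS = {'addKeyword': 50, 'updateKeyword': 10}


def reactivation_cost(n_keywords, was_deleted):
    """Quota units needed to bring n_keywords back into play."""
    # Deleted keywords must be re-added; bid-lowered ones only need an update
    op = 'addKeyword' if was_deleted else 'updateKeyword'
    return n_keywords * QUOTA_UNITS[op]
```

Bringing back a hypothetical 1,000 keywords costs 50,000 units if they were deleted and 10,000 units if their bids were only lowered.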