Hacker News new | past | comments | ask | show | jobs | submit | conductrics's comments | login

A simple zero-analysis multi-armed bandit: repurposing Neyman-Pearson sample size calculations to create simple bandits with no need for dedicated software. Obviously not optimal, but often a very good option.


Are you suggesting that revenues fell after bailing on SMB? That is hard to imagine, but I guess possible. Given that the core of the industry is about asking counterfactual questions, I would think the appropriate question is 'would they be more valuable now if they had not gone to the enterprise - would they even have the deal they did wind up getting?' rather than 'are they more valuable now than they were 4 years ago?'. Hard to know, but my guess is that they wouldn't. A simple web editor with a random number generator isn't going to be of interest to anyone looking to buy. The larger problem is that statistical inference is hard - it just is. And, unlike analytics, where you are just placing sensors into an existing system, here you also need to place actuators, so implementation is much more complicated. That means that at any scale, both the marketing team and the dev/IT teams need to be involved for anyone to get value and not wind up breaking systems all the time. Software like theirs, and ours, isn't magic, and without good editorial and hard work by the client, the entire exercise is more statistical theater than science. And that material fact was always in conflict with the rhetoric that they make A/B testing easy for everyone. It lends itself to a particular type of solutionism that VCs and the larger industry are prone to be seduced by. Self-service to SMB clients might be a profitable business, but perhaps not enough to service such a large amount of venture funding. FWIW, VWO also tries to compete at the enterprise level.


When thinking about what type of approach is best, first think about the nature of the problem. Is it a real optimization problem - in other words, are you more concerned with learning an optimal controller for your marketing application? If so, then ask: 1) Is the problem/information perishable? For example, perishable: picking headlines for news articles; not perishable: a site redesign. If perishable, then a bandit might give you real returns. 2) Complexity: Are you using covariates (contextual bandits, reinforcement learning with function approximation) or not? If you are, then you might want your targeting model to serve up the best predicted options in subspaces (frequent user types) where it has more experience, and to explore more in less frequently visited areas (less common user types). 3) Scale/Automation: You have tons of transactional decision problems, and it just doesn't scale to have people running many A/B tests.

Often it is a mix - you might use a bandit approach with your predictive targeting, but you should also A/B test the impact of your targeting model against a current default and/or a random draw. See slides 59-65: http://www.slideshare.net/mgershoff/predictive-analytics-bro...

For a quick bandit overview check out: http://www.slideshare.net/mgershoff/conductrics-bandit-basic...


I read GEB as my subway book back in 1999 (during a work stint in Paris), on my daily commute to work. Years later, when I was looking for a change, I started thinking about GEB. Even though I had no CS background, I decided to apply to a few AI programs and wound up going to the University of Edinburgh for an MSc in AI in '05. Now I am a co-founder of a software startup that applies reinforcement learning, a method from ai - lower case ;-) - to conversion optimization. Obviously, I can't say if you will find it a worthwhile read, but I do look back on it as a significant influence on a major pivot in my life.


I think what makes Hinton surprising is that he has a long-established academic lab, and so many current top researchers went through that lab: Yann LeCun (ANNs/deep learning), Chris Williams (GPs), Carl Rasmussen (GPs), Peter Dayan (neuroscience and TD learning), Sam Roweis (RIP). As you note, industrial research labs (with Nobel-prize-winning researchers) have been around at IBM, NEC, AT&T Bell Labs, etc. One thing I think about is what happens to the quality of research as top folks who have an established record of producing new researchers are pulled from that role? Also, I'm not sure startups have anything to do with making technology real. Is Google still a startup?


I reached out to Yann LeCun and he emailed me a couple more recent links. I updated the deep learning section of the post to include them. Feel free to check them out.


Thanks! Sure, I didn't mean to imply that once you learn linear algebra you are all done - just that you really need it, and it will make your life much easier once you do. Yeah, I didn't include a ton of stuff - nothing on EM, trees, boosting, etc. Probably the biggest thing missing is something explicit about regularization. Maybe I will add something there. Please feel free to add anything you think others might be interested in down in the comments section of the post. Thanks again for the comments and for taking the time to read through it - and don't forget to sign up for a free account!


Huh, looks like all of the companies are out of NY and mostly fashion-related. It's too bad it was more of a PR piece for these companies, rather than a deeper look at gender and class.


Disclaimer: I also have software for running AB/MVT as well as adaptive control problems (so bandits as well as extended sequential decisions) at www.conductrics.com.

I wouldn't sweat UCB methods vs. e-greedy or other heuristics for balancing explore/exploit too much. E-greedy (and e-greedy decreasing) is nice because it is simple. Softmax/Boltzmann is satisfying in that it selects arms weighted by their estimated means, and UCB1-Tuned and UCB1-Normal are nice because, like A/B testing, they take variance measures directly into account when selecting an arm. Take a look at this paper from Doina Precup (who is super nice, BTW) and Volodymyr Kuleshov: http://www.cs.mcgill.ca/~vkules/bandits.pdf - they compare the various methods. Guess what - the simple methods work just fine.

Of course there are various Bayesian versions, especially of UCB. Nando de Freitas over at UBC has a recent ICML paper on using Gaussian processes for bandits (based on a form of UCB). See http://www.cs.ubc.ca/~nando/papers/BayesBandits.pdf - I have not given it a tight read, but I am not sure what the practical return would be. Plus you have to fiddle with picking a kernel function, and I imagine length scales and the rest of the hyperparameters associated with GPs. I did read a working paper from Nando a few years back that used a random forest as a prior - I can't seem to find it now.

BTW - John Langford is program chair of this year's ICML over in Edinburgh. If you are in the UK, it might be worth it to pop up and attend. Plus Chris Williams is there at Edinburgh, so maybe you can corner him about GPs. Although he has moved on from GPs, he still wrote (well, co-wrote) the book, and he is one of the smartest people I have ever met.
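To make the comparison concrete, here is a minimal sketch of e-greedy vs. plain UCB1 on simulated Bernoulli arms. The arm probabilities, epsilon, and horizon are made up for illustration; this is a toy simulation, not any vendor's implementation:

```python
import math
import random

def simulate(policy, probs, horizon=10000, eps=0.1, seed=0):
    """Run a bandit policy over Bernoulli arms; return total reward."""
    rng = random.Random(seed)
    n = [0] * len(probs)       # pull count per arm
    mean = [0.0] * len(probs)  # estimated mean reward per arm
    total = 0
    for t in range(1, horizon + 1):
        if policy == "egreedy":
            # explore uniformly with prob eps, else play the empirical best
            if rng.random() < eps:
                arm = rng.randrange(len(probs))
            else:
                arm = max(range(len(probs)), key=lambda a: mean[a])
        else:
            # UCB1: play each arm once, then mean + sqrt(2 ln t / n_a)
            if 0 in n:
                arm = n.index(0)
            else:
                arm = max(range(len(probs)),
                          key=lambda a: mean[a] + math.sqrt(2 * math.log(t) / n[a]))
        r = 1 if rng.random() < probs[arm] else 0
        n[arm] += 1
        mean[arm] += (r - mean[arm]) / n[arm]  # incremental mean update
        total += r
    return total

probs = [0.04, 0.05, 0.06]  # hypothetical conversion rates
print(simulate("egreedy", probs), simulate("ucb1", probs))
```

Both tend to land well above the worst arm's baseline, which is roughly the point of the Kuleshov/Precup comparison: for small problems like this, the simple heuristics are competitive.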


This is a good conversation. So, the disclaimer: my company, Conductrics.com, lets you use algorithms similar to bandits as well as A/B/MVT.

A/B testing can be thought of as a form of epoch-greedy - you play uniform random, then play greedy. One advantage of e-greedy is that if your environment is not stationary, you are still sampling from all of the arms - it's sort of an insurance premium. To address differences in reward signals based on the environment, there is the option to model the environment with features - since it may be that Fridays require a different selection than Sundays. I'm not sure why you would a priori assume that the relative values, or even just the arm rankings, are independent of environmental variables.
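The A/B-test-as-epoch-greedy framing can be sketched directly: a uniform random exploration epoch followed by a pure greedy commitment. The epoch lengths and conversion rates below are arbitrary illustrative choices:

```python
import random

def epoch_greedy(probs, explore_steps=1000, exploit_steps=9000, seed=1):
    """Uniform-random exploration epoch, then pure greedy exploitation.

    This mirrors a classic A/B test: collect data under a uniform
    random split, then commit to the empirically best arm.
    """
    rng = random.Random(seed)
    n = [0] * len(probs)
    mean = [0.0] * len(probs)
    total = 0
    for t in range(explore_steps + exploit_steps):
        if t < explore_steps:
            arm = rng.randrange(len(probs))  # uniform random split (the "test")
        else:
            arm = max(range(len(probs)), key=lambda a: mean[a])  # commit greedily
        r = 1 if rng.random() < probs[arm] else 0
        n[arm] += 1
        mean[arm] += (r - mean[arm]) / n[arm]
        total += r
    return total, n

total, pulls = epoch_greedy([0.04, 0.06])
```

The insurance-premium point falls out of the structure: after the epoch ends, the losing arm is never sampled again, so if the environment drifts, this scheme has no way to notice, whereas e-greedy keeps paying a small exploration cost forever.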

One other point: if you just want a super quick hack to pick winners (or at least pick from the set of higher-performing arms), you can just optimistically seed your estimates for each arm and then pick greedily. I'm not claiming it is optimal or anything, but it requires almost no coding overhead. Of course, you need the problem to be stationary.
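That optimistic-seeding trick really is tiny to implement. A sketch, assuming binary conversion rewards; the seed value, arm count, and rates are all made up for illustration:

```python
import random

class OptimisticGreedy:
    """Greedy arm selection with optimistically seeded estimates.

    Each arm starts as if it had already seen one play with a mean
    reward of `seed_value` (set at or above any plausible true rate),
    so pure greedy selection is forced to try every arm early on.
    Assumes a stationary problem, as noted above.
    """
    def __init__(self, n_arms, seed_value=1.0):
        self.n = [1] * n_arms
        self.mean = [seed_value] * n_arms

    def select(self):
        # pure greedy: the optimistic seed does all the exploring
        return max(range(len(self.mean)), key=lambda a: self.mean[a])

    def update(self, arm, reward):
        self.n[arm] += 1
        self.mean[arm] += (reward - self.mean[arm]) / self.n[arm]

# toy usage with hypothetical conversion rates
rng = random.Random(42)
probs = [0.03, 0.08]
agent = OptimisticGreedy(len(probs))
for _ in range(5000):
    arm = agent.select()
    agent.update(arm, 1 if rng.random() < probs[arm] else 0)
print(agent.n)  # pulls should tend to concentrate on the better arm
```

The seed acts as a crude exploration bonus that decays as real data washes it out; the known failure mode is exactly the non-stationarity caveat above, since greedy can lock in on a stale winner.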

Regardless of which algorithmic approach you use, I do think it is useful to at least think in terms of bandits or reinforcement learning when tackling these problems.

