Project 1999 - View Single Post

bcbrown · #5 08-11-2023, 07:16 PM

Quote:

Originally Posted by Troxx

I’m glad you said all that. I’m just sad I had such a hard time understanding it. Lots of big words and unfamiliar terminology I had to google.

Optimization needs:

Some outcome you want to minimize (or maximize), and a way to measure it
A set of parameters to fiddle with

Multi-armed bandits: Imagine you're at a casino with a row of slot machines, which are sometimes called "one-armed bandits". Each slot machine has a different, unknown payout rate. How should you choose which machine to play? When should you stick with the machine that has paid best so far (exploit), and when should you try a different machine to learn more about it (explore)? Epsilon-greedy is one strategy for deciding when to explore and when to exploit: most of the time you play the machine with the best observed payout, but with some small probability (epsilon) you pick a machine at random instead.
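
If it helps to see it in code, here's a rough sketch of epsilon-greedy in Python. The payout rates, the number of pulls, and epsilon = 0.1 are all made-up values for illustration, and the function name is just mine.

```python
import random

def epsilon_greedy(payout_rates, pulls=10_000, epsilon=0.1, seed=0):
    """Play `pulls` rounds; return (estimated payout rates, pulls per machine)."""
    rng = random.Random(seed)
    n = len(payout_rates)
    counts = [0] * n          # how many times each machine was played
    estimates = [0.0] * n     # running average payout seen on each machine
    for _ in range(pulls):
        if rng.random() < epsilon:
            arm = rng.randrange(n)                            # explore: random machine
        else:
            arm = max(range(n), key=lambda i: estimates[i])   # exploit: best so far
        # Simulated pull: pays 1 with the machine's (hidden) payout rate.
        reward = 1.0 if rng.random() < payout_rates[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running average for this machine.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

# Three hypothetical machines; the player never sees these rates directly.
estimates, counts = epsilon_greedy([0.2, 0.5, 0.8])
```

With these numbers most of the pulls should end up on the third machine, since it pays best and the strategy only wanders off about 10% of the time.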

Recursive gradient descent: Imagine the space of possible outcomes is a valley, and you're somewhere on the side of that valley. Play an iterated game: from where you are currently, figure out which direction is "most downhill"; take a step in that direction, then repeat.
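
A minimal sketch of that loop in Python, using a one-dimensional "valley" f(x) = (x - 3)^2 whose bottom is at x = 3. The function, the starting point, and the step size are all assumptions picked to keep the example tiny.

```python
def gradient_descent(grad, start, step_size=0.1, steps=100):
    """Repeatedly step in the downhill direction given by the gradient."""
    x = start
    for _ in range(steps):
        x = x - step_size * grad(x)   # move against the slope
    return x

# Valley: f(x) = (x - 3)^2, so the slope at x is 2 * (x - 3).
lowest = gradient_descent(lambda x: 2 * (x - 3), start=10.0)
```

Starting from x = 10, each step shrinks the distance to the bottom by a constant factor, so `lowest` ends up very close to 3.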

Simulated annealing: Imagine there's a small pond or low spot or "false valley" on the hillside of the valley. To avoid getting stuck there and never finding the actual valley, change the size of your step, and each time add a small random jump. That way, you have a chance of getting out of the false valley and eventually finding the real valley. The metaphor comes from metallurgy, where the final crystal structure is a function of the rate of cooling of the hot metal; you add a "temperature" parameter that automatically lowers over the course of the iterations, so that you take smaller and smaller steps with fewer random jumps. Instead of metallurgy you can also think about tempering chocolate when cooking.
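
Here's a rough Python sketch. Note that it uses the common textbook form of simulated annealing (propose a random jump, always accept it if it's downhill, and accept an uphill jump with a probability that shrinks as the temperature drops) rather than literally a downhill step plus a jump. The bumpy function, starting temperature, and cooling rate are all hypothetical choices for illustration.

```python
import math
import random

def bumpy(x):
    # Hypothetical landscape: false valley near x = 1.1, real valley near x = -1.3.
    return x**4 - 3 * x**2 + x

def simulated_annealing(f, start, steps=5000, temp=2.0, cooling=0.999, seed=0):
    """Return the lowest point seen while randomly jumping around `f`."""
    rng = random.Random(seed)
    x = best = start
    for _ in range(steps):
        candidate = x + rng.gauss(0, 0.5)    # random jump from the current spot
        delta = f(candidate) - f(x)
        # Downhill moves are always taken; uphill moves are taken with a
        # probability that gets smaller as the temperature cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            x = candidate
        if f(x) < f(best):
            best = x                         # remember the lowest point so far
        temp *= cooling                      # cool down a little every iteration
    return best

# Start on the hillside above the false valley.
best = simulated_annealing(bumpy, start=2.0)
```

Plain downhill stepping from x = 2 would settle into the false valley on the right. The early high-temperature jumps are what give the search a chance to cross the ridge and find the deeper valley on the left.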

Banana problem: Say you want to recommend some food item to a shopper on your grocery site. You might start by asking: based on the item the shopper is currently looking at, which item has historically been bought most often by people who bought that item? This is the "people who bought this also bought that" set of recommendations on Amazon. The problem is, bananas are so popular that no matter what item you're looking at, the item most commonly paired with it is likely to be bananas. So you need to adjust the ranking to account for the overall popularity of each item. The new question becomes: given this item, which items are bought along with it at a rate above their base rate?
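
To make the adjustment concrete, here's a small Python sketch with made-up shopping baskets. It compares the raw "bought together" count against the ratio of (how often the other item shows up in baskets with this item) to (how often the other item shows up overall). That ratio is often called "lift"; the function name and the data are mine, purely for illustration.

```python
from collections import Counter

def co_purchase_scores(baskets, item):
    """Return (raw co-purchase counts, popularity-adjusted scores) for `item`."""
    n = len(baskets)
    item_count = Counter()   # number of baskets containing each item
    pair_count = Counter()   # number of baskets containing both `item` and the other item
    for basket in baskets:
        unique = set(basket)
        item_count.update(unique)
        if item in unique:
            pair_count.update(unique - {item})
    scores = {}
    for other, together in pair_count.items():
        rate_with_item = together / item_count[item]   # P(other | item)
        base_rate = item_count[other] / n              # P(other)
        scores[other] = rate_with_item / base_rate     # above 1 means "more than usual"
    return pair_count, scores

# Hypothetical baskets: bananas are in almost everything.
baskets = [
    ["chips", "salsa", "bananas"],
    ["chips", "bananas"],
    ["chips", "salsa", "bananas"],
    ["chips", "bananas"],
    ["milk", "bananas"],
    ["bread", "bananas"],
    ["eggs", "bananas"],
    ["milk", "bread"],
]
raw, adjusted = co_purchase_scores(baskets, "chips")
top_raw = raw.most_common(1)[0][0]
top_adjusted = max(adjusted, key=adjusted.get)
```

In this toy data the raw count puts bananas on top for chips, while the adjusted score puts salsa on top, because salsa shows up with chips far more often than its overall popularity would predict.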

Collaborative recommendation systems: We will recommend something for you based on the past behavior of other shoppers.