#TIL pymc.Metropolis always assumes a symmetric proposal distribution (defaulting to a normal), and therefore does not implement the Hastings correction. For a custom asymmetric proposal distribution, PyMC's out-of-the-box Metropolis algorithm will give incorrect samples. We need to write a custom step method.
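To illustrate the correction itself, here's a minimal NumPy sketch (plain Python, not a PyMC step method) of Metropolis-Hastings with an asymmetric proposal. The exponential target and log-normal proposal are just illustrative choices of mine:

```python
import numpy as np

# Target density (unnormalized): Exponential(1). Illustrative choice.
def target(x):
    return np.exp(-x) if x > 0 else 0.0

rng = np.random.default_rng(0)
sigma = 0.5          # proposal scale (assumed value)
x = 1.0              # initial state
samples = []

for _ in range(50_000):
    # Asymmetric multiplicative log-normal proposal: x' = x * exp(sigma * z)
    x_new = x * np.exp(sigma * rng.standard_normal())
    # Hastings correction: q(x|x') / q(x'|x) = x_new / x for this proposal,
    # because the log-normal kernel is symmetric in (ln x, ln x') apart
    # from the 1/x' Jacobian factor. Without this factor the sampler is biased.
    accept_ratio = (target(x_new) / target(x)) * (x_new / x)
    if rng.random() < accept_ratio:
        x = x_new
    samples.append(x)

print(np.mean(samples))  # should be near 1, the mean of Exponential(1)
```

Dropping the `(x_new / x)` factor is exactly the plain-Metropolis mistake: the chain then targets the wrong distribution.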
Posts by Ridho Akbar
Mathematically deriving the Metropolis-Hastings algorithm from a Markov chain's detailed balance equation makes me feel like I'm back in school. It's thermodynamics.
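A compact version of that derivation, going from detailed balance to the acceptance rule:

```latex
% Detailed balance for target \pi and transition kernel P:
\pi(x)\,P(x \to x') = \pi(x')\,P(x' \to x)
% Split P into proposal q and acceptance probability A:
\pi(x)\,q(x'|x)\,A(x \to x') = \pi(x')\,q(x|x')\,A(x' \to x)
% The Metropolis-Hastings choice satisfies this equation:
A(x \to x') = \min\!\left(1,\; \frac{\pi(x')\,q(x|x')}{\pi(x)\,q(x'|x)}\right)
% For a symmetric proposal, q(x|x') = q(x'|x), recovering plain Metropolis:
A(x \to x') = \min\!\left(1,\; \frac{\pi(x')}{\pi(x)}\right)
```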
Making my repository, my journal for studying MCMC, public. Theory, equations, and interpretations, as well as code and animations/visualizations, can be found there.
github.com/ridhoma/mont...
There is another, stronger condition that is the foundation of the Metropolis-Hastings algorithm, called Detailed Balance, which will be my next topic.
All code is available at
github.com/ridhoma/mont...
This is a very fundamental concept for developing MCMC sampling algorithms. Because we want the sampling process to converge to our target distribution, we must make sure the sampling process:
1. Satisfies the Markov property / is a Markov chain
2. Is irreducible
3. Is aperiodic
Moreover, a Markov chain must be irreducible and aperiodic to guarantee reaching a Stationary Distribution. Irreducible: every state can be reached from any other state. Aperiodic: the chain does not cycle through states in a fixed period.
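A quick numerical check of the guarantee, with a made-up 3-state transition matrix: two very different starting distributions end up at the same stationary distribution.

```python
import numpy as np

# A small irreducible, aperiodic chain; transition probabilities are made up.
# Rows sum to 1: Q[i, j] = P(move from state i to state j).
Q = np.array([
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.3, 0.3, 0.4],
])

# Two very different starting distributions.
p1 = np.array([1.0, 0.0, 0.0])
p2 = np.array([0.0, 0.0, 1.0])

# Repeatedly apply the transition matrix (power iteration).
for _ in range(100):
    p1 = p1 @ Q
    p2 = p2 @ Q

print(p1)  # both converge to the same stationary distribution,
print(p2)  # which also satisfies pi = pi @ Q
```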
These interesting properties only apply if the random walk follows a Markov process, that is, the probability of transitioning to the next node/state depends only on the current state. Such a graph we may call a Markov chain.
No matter which node we start at, with a long enough simulation, the distribution of visit frequencies will converge to the same values, called the Stationary Distribution. This also applies to larger graphs.
The simplest illustration of a Markov chain is a random walk on a graph. Take edge Qij as the probability of moving from node i to node j, where Qii represents the probability of staying at node i. Simulate the random walk and, at each time step, count how many times each node has been visited.
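A minimal sketch of that experiment (the transition probabilities here are made up): count visits during a long random walk, and the visit frequencies approach the stationary distribution.

```python
import numpy as np

rng = np.random.default_rng(42)

# Transition matrix with made-up probabilities; Q[i, i] = prob of staying put.
Q = np.array([
    [0.4, 0.4, 0.2],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
])

n_steps = 200_000
visits = np.zeros(3)
state = 0
for _ in range(n_steps):
    state = rng.choice(3, p=Q[state])   # next state depends only on the current one
    visits[state] += 1

freq = visits / n_steps
print(freq)  # visit frequencies approximate the stationary distribution
```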
Moreover, it took only 3 minutes to run this. Soo.. I'm satisfied.
In the scheme of learning MCMC, this exercise is a good familiarization with the concept of rejection sampling, a simple example of Markov-chain-based state proposals, and a good introduction to the Metropolis-Hastings algorithm.
Finally, I compared the final solution with a greedy algorithm, and the result is quite good: it gives a 12% shorter distance than greedy. According to ChatGPT, greedy usually gives a sub-optimal solution that is 10%-20% off the global optimum. So our solution might be close.
During the first 1000 iterations, the temperature is kept high to maximize exploration of the search space. Then it is gradually lowered to start the optimization. Finally, the temperature is kept low to fine-tune the final solution.
It took me quite a number of trials to fine-tune To and the annealing function until I was satisfied with the outcome. I ended up with this annealing schedule.
This cooling-down process is called annealing in materials science, hence the algorithm is called Simulated Annealing. The success of finding the optimal solution depends on the initial temperature (To) and the annealing schedule function, T(t) = f(To, t).
We repeat the process, and after each iteration we slowly reduce the temperature (slowly cool down the system). When the temperature reaches ~0, we stop, and whatever solution we have is our final solution.
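A minimal sketch of this loop for TSP. The specifics here are my own illustrative choices (random cities, segment-reversal perturbation, geometric cooling), not the original code:

```python
import numpy as np

rng = np.random.default_rng(1)

# Random city coordinates (illustrative; the original run used N=100).
n = 30
cities = rng.random((n, 2))

def energy(route):
    """Total length of the closed tour; the "energy" in SA terms."""
    return np.sum(np.linalg.norm(cities[route] - cities[np.roll(route, 1)], axis=1))

route = rng.permutation(n)
E = energy(route)
T0, alpha = 1.0, 0.999   # assumed initial temperature and cooling rate

T = T0
for step in range(20_000):
    # Propose: reverse a random segment (a common 2-opt-style perturbation).
    i, j = sorted(rng.integers(0, n, size=2))
    new_route = route.copy()
    new_route[i:j + 1] = new_route[i:j + 1][::-1]
    E_new = energy(new_route)
    # Accept if better, or with probability exp(-dE/T) if worse.
    if E_new < E or rng.random() < np.exp(-(E_new - E) / T):
        route, E = new_route, E_new
    T *= alpha           # geometric annealing schedule: T(t) = T0 * alpha^t

print(E)  # final tour length, far shorter than a random tour
```

At high T the exponential is close to 1 and almost every worse route is accepted; as T shrinks the loop degrades gracefully into greedy 2-opt local search.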
We can control how likely we are to accept a worse solution using the temperature parameter T. The higher the temperature, the more likely we are to accept a route even though it is worse than the current one. This encourages exploration of the search space and helps escape local minima.
Here "energy" is just a fancy term for the "total travel distance" of the TSP route. We accept the proposal if it gives lower energy than the current solution (the distance improves). Otherwise, we may still accept it, but with a probability that drops exponentially as the energy difference increases.
The algorithm basically starts with a random initial state/solution. Then we change the solution with some perturbation function; we call this the proposed next solution. We either accept or reject the proposal depending on the energy of the proposed state compared to the current state.
Long time no update.. Quite busy lately at work. But here it is: applying Markov-Chain Monte Carlo to a large Traveling Salesman Problem (N=100).
It's quite fascinating that a very simple algorithm relying on random search can be used to approximate the solution of an NP-hard optimization problem.
Studying MCMC from scratch; so far everything is basically physics.
Haha, understood, master.
Bayesian Analysis generally consists of 2 essentials:
1. Assumption/priors building
2. Numerical method to get the posterior
I find this course from CMU a good starting point to learn #2 at a fundamental level. I will use it as my guide:
gi1242.codeberg.page/cmu-math-cs-...
I'd also like to make my GitHub repo public (with all the code, visualizations, equations, and comments), but probably later. Need to tidy it up first; I'd be embarrassed if it's messy.
However, this random behavior is not bad at all. Using a technique called Simulated Annealing, we can take advantage of it and use Monte Carlo to efficiently solve optimization problems, e.g. the Traveling Salesman Problem. Something I will post next time.
However, Monte Carlo is not the best method to estimate pi because it converges very slowly, at a rate of 1/√n. Not only that, but as more random points are added, we might randomly ruin a previously better estimate and end up with a worse one.
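A quick sketch of that 1/√n behavior: multiplying the sample count by 100 only buys roughly one extra digit of accuracy.

```python
import numpy as np

rng = np.random.default_rng(0)

def estimate_pi(n):
    pts = rng.random((n, 2))                 # points in the unit square
    inside = (pts ** 2).sum(axis=1) <= 1.0   # inside the quarter circle
    return 4 * inside.mean()

for n in [100, 10_000, 1_000_000]:
    est = estimate_pi(n)
    print(n, est, abs(est - np.pi))
# The error shrinks roughly like 1/sqrt(n): 100x more samples
# typically gives only ~10x more accuracy.
```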
In the circle-in-a-square model, the reality to reconstruct is a "circle in a square". But with only 1,000 points, it barely looks like a circle. With 10,000, we can now see some circle. Finally, with 100,000, the circle looks clear. Notice that as the circle becomes clearer, our pi estimate becomes more accurate.
The Law of Large Numbers is the theoretical foundation of the Monte Carlo method. This law states that as the number of random samples increases, the average of the samples gets closer to reality (math term: the true value / expected value). The closer to reality, the better our estimation accuracy.
In the case of estimating pi above, we throw a bunch of points randomly into a 1x1 square, accepting them (blue) if they fall inside the quadrant (quarter circle) and otherwise rejecting them (gray). The number of accepted points is proportional to the area of the circle, allowing us to estimate pi.
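A minimal sketch of this accept/reject counting:

```python
import random

random.seed(0)
n = 100_000
accepted = 0
for _ in range(n):
    x, y = random.random(), random.random()   # a point in the 1x1 square
    if x * x + y * y <= 1.0:                  # falls inside the quarter circle
        accepted += 1                         # a "blue" point

# Area of the quarter circle is pi/4 and area of the square is 1, so:
pi_estimate = 4 * accepted / n
print(pi_estimate)  # close to 3.14159...
```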
Monte Carlo essentially tries to estimate certain variables/parameters by generating a large number of random samples and accepting/rejecting each based on some criterion. The proportion of accepted samples reveals the information needed to estimate the parameters.
First post on BlueSky. This seems like the perfect place to share my learning process with MCMC methods for Bayesian Analysis.
I'll start with the very basics, the first chapter of any book: stochastically estimating pi using the circle-in-a-square model.